docx - Developer IT

Identifying .doc/.docx files that contain images

- by rev

I'm moving my notes to Evernote. To this end I need to convert .doc/.docx files to RTF, because I have a script that imports RTF into Evernote. However, some of my .doc/.docx files contain images. Is there any way to identify which .doc/.docx files contain images without viewing them all? I have thousands. That way I can simply open the few that have images and copy/paste their entire content straight into Evernote. I should say that I'm using OS X 10.6.8.
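One possible approach for the .docx half of the problem, sketched in Python (an assumption, since the question names no language): a .docx file is a ZIP archive, and Word normally stores embedded pictures under word/media/ inside it. The helper name docx_has_images is hypothetical. The sketch does not cover legacy binary .doc files or pictures that are linked rather than embedded.

```python
import zipfile


def docx_has_images(path_or_file):
    """Return True if the .docx archive has any entry under word/media/.

    Assumption: embedded pictures live in word/media/, which is where Word
    normally puts them. Legacy .doc files are not ZIP archives and will
    raise zipfile.BadZipFile here.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        return any(
            name.startswith("word/media/") and not name.endswith("/")
            for name in archive.namelist()
        )
```

Calling this for every path returned by pathlib.Path(folder).rglob("*.docx") would give the short list of files that need manual handling.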

Read the article

pdf creation software, for docx and odt

- by oxinabox.ucc.asn.au

OK, I have a fairly large collection of docx and odt files: minutes from meetings, etc. I want to convert them to PDFs for distribution, and also into one combined PDF. At the moment I'm using Adobe Acrobat 8 (Pro, IIRC), and on another machine I'm using the Foxit PDF printer. To do this I have to print each file individually to PDF and then combine them with Acrobat, because Acrobat doesn't support conversion straight from docx or odt to PDF, only via printing. This is annoying if you have to do it on a regular basis. I don't keep the PDFs around (I have the original sources under source control :-D) because they go out of date pretty quickly: I often have to go back and modify old versions (ridiculously often), e.g. when I find out I've got something in the minutes wrong or I want to add more context for clarification. Has anyone got a better solution?

Read the article

Internet Explorer won't open docx files, saves them as zip

- by David Gard

I have several docx documents on an intranet at work, but IE8 refuses to open them, instead only saving them as a zip (filename_docx.zip). This seems to be an IE8-only problem (surprise, surprise!), as both FF and Chrome open the documents just fine. Unfortunately, as this is work-based, I cannot simply drop IE in favour of a decent browser, as I otherwise would. Does anybody know how to fix this issue in IE? Thanks.

Read the article

PHP - Opening uploaded DOCX files with the correct MIME TYPE

- by user270797

I have users uploading DOCX files which I make available for download. The issue we have been experiencing is that the MIME type of DOCX files is unknown to the server, which causes IE to open these documents as ZIP files. The site runs on a Windows/IIS server. Because this is a shared host, I cannot change any server settings. I was thinking that I could write some code to handle DOCX files, perhaps with custom output:

    if (extension=docx) {
        header("Content-Disposition: attachment; etc)
        header('Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document');
        Output the file contents etc
    }

Would this be a viable solution? If so, can someone help fill in the gaps? (PS: I know the above syntax is not correct; it's just a quick example.)
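To make the header values concrete, here is a minimal sketch in Python rather than PHP (a language swap made only for illustration). The MIME type string is the registered one for .docx; the function name download_headers and the filename clean-up are assumptions. The same strings would be sent from PHP with header().

```python
# Registered MIME type for Word .docx documents.
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def download_headers(filename, size):
    """Build response headers for serving a .docx file as an attachment.

    Hypothetical helper: `filename` is the name shown to the user and
    `size` is the file length in bytes.
    """
    if not filename.lower().endswith(".docx"):
        raise ValueError("not a .docx file: %r" % filename)
    # Drop characters that would break the quoted header value.
    safe_name = filename.replace('"', "").replace("\r", "").replace("\n", "")
    return {
        "Content-Type": DOCX_MIME,
        "Content-Disposition": 'attachment; filename="%s"' % safe_name,
        "Content-Length": str(size),
    }
```

Sending an explicit Content-Type from the script sidesteps the server's MIME table, which is the part a shared host does not let you change.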

Read the article

Determine if the document is DOC or DOCX in Java app without knowing its extension

- by Andriy

There is a constraint in the content management system that requires all Word documents to be stored with a specific extension (different from DOC or DOCX). However, when outputting a document to the user we need to know whether it is a DOC or DOCX file in order to provide the right MIME type. So, is there a way to programmatically find out from its content whether a document is DOC or DOCX?
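A content-based check could look like the sketch below, written in Python for brevity although the question is about Java; the byte comparisons carry over directly. It relies on two facts: legacy .doc files are OLE2 compound files starting with the bytes D0 CF 11 E0 A1 B1 1A E1, and .docx files are ZIP archives containing word/document.xml. The function name sniff_word_format is hypothetical.

```python
import io
import zipfile

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # header of OLE2 compound files
ZIP_MAGIC = b"PK\x03\x04"                        # local file header of a ZIP archive


def sniff_word_format(data):
    """Guess 'doc', 'docx' or 'unknown' from the raw bytes of a file.

    Caveat: the OLE2 header is shared by other legacy Office formats
    (.xls, .ppt), so 'doc' here really means "some OLE2 compound file".
    """
    if data.startswith(OLE2_MAGIC):
        return "doc"
    if data.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                if "word/document.xml" in archive.namelist():
                    return "docx"
        except zipfile.BadZipFile:
            pass
    return "unknown"
```

If the system only ever stores Word documents, checking the first few bytes alone may be enough; the extra look inside the archive is there to tell .docx apart from other ZIP-based formats.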

Read the article

Hi, I want to create a new DOCX file by reading a DOCX template (its content is already replaced)

- by ruwan Kumara

Up to now, my code reads the template, replaces the placeholders with new values, and finally overwrites the docx file with the result. Can anyone please tell me how to save the modified docx file under a different name? My code is below.

    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
    {
        string docText = null;
        using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
        {
            docText = sr.ReadToEnd();
        }

        Regex regexText = new Regex("#ApplicationCompleteDate#");
        docText = regexText.Replace(docText, DataHolding.ApplicationCompleteDate);

        regexText = new Regex("#ApplicantPrivateAddress#");
        docText = regexText.Replace(docText, UserDataHolding.ApplicantPrivateAddress);

        using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
        {
            sw.Write(docText);
        }
    }

If anyone can help me create a new docx file by changing the code above, it would be very helpful. Thank you.
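The general idea of "read the template, write the result somewhere else" can be sketched without the Open XML SDK. The Python below is an illustration under assumptions, not the poster's method: it treats the .docx as a ZIP archive, copies every part to a new archive, and substitutes placeholders only in word/document.xml. The name fill_template is hypothetical, and the sketch assumes each placeholder sits in a single text run and that the part is UTF-8 encoded.

```python
import zipfile
from xml.sax.saxutils import escape


def fill_template(template, output, replacements):
    """Write a copy of `template` to `output` with placeholders replaced.

    `template` and `output` may be paths or binary file objects.
    `replacements` maps placeholder text (e.g. "#ApplicantPrivateAddress#")
    to the plain-text value that should appear in the document.
    The template itself is only read, never modified.
    """
    with zipfile.ZipFile(template) as src, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")
                for placeholder, value in replacements.items():
                    # Escape &, < and > so the result stays well-formed XML.
                    text = text.replace(placeholder, escape(value))
                data = text.encode("utf-8")
            dst.writestr(item.filename, data)
```

Because the template is opened read-only, the same file can be reused for any number of output documents.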

Read the article

Is it possible to output formats other than .docx and .odt with TinyButStrong and the OpenTBS plugin?

- by Corum

I have a module which merges database records into a .docx or .odt document template. I have to output .docx, .odt or .pdf. Outputting the MS and Open formats is no problem; everything works properly. What I want to know is whether I can output something (like XML or HTML) which I can then use to build a PDF document. If I can't, are there any libraries which provide a document merge like: DOCX (or ODT) + database record => PDF? And I don't want to use phplivedocx.

Read the article

Asp.net library to extract plain text from docx, pptx, xlsx (for search index)

- by Myster

Is there a pre-existing library to extract plain text from docx, pptx, and xlsx files? I require this to populate a Lucene.NET index. I've found this example which extracts text from docx and it seems to work OK. But before building my own solution based on it, I was wondering if there's something already available for the other file formats?

Read the article

Getting document.xml from a docx file using ZipInputStream

- by meenakshik

Hello, I have an InputStream of a docx file and I need to get hold of the document.xml which lies inside the docx. I am using ZipInputStream to read my stream, and my code is something like:

    ZipInputStream docXFile = new ZipInputStream(fileName);
    ZipEntry zipEntry;
    while ((zipEntry = docXFile.getNextEntry()) != null) {
        if (zipEntry.getName().equals("word/document.xml")) {
            System.out.println(" --> zip Entry is " + zipEntry.getName());
        }
    }

As you can see, the output of zipEntry.getName() comes out as "word/document.xml" at some point. I need to pass this document.xml on as a stream. Unlike with ZipFile, where you can easily do this by calling .getInputStream, I am wondering how I can do this with docXFile. Thanks in advance, Meenakshi
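In java.util.zip, the ZipInputStream is itself positioned at the current entry's data after getNextEntry(), so reading from docXFile until it reports end of stream yields that entry's bytes. For comparison, the same task in Python is sketched below; this is a language swap for illustration only, and read_document_xml is a hypothetical name. Python's zipfile needs a seekable source, so the sketch buffers the whole stream in memory first.

```python
import io
import zipfile


def read_document_xml(stream):
    """Return the bytes of word/document.xml from a binary .docx stream.

    The stream is read fully into memory because zipfile needs random
    access to locate the archive's central directory.
    """
    buffered = io.BytesIO(stream.read())
    with zipfile.ZipFile(buffered) as archive:
        return archive.read("word/document.xml")
```

Wrapping the returned bytes in io.BytesIO gives a stream again, should the consumer need one.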

Read the article

Convert doc/docx to semantic HTML

- by sandstrom

I would like to convert doc/docx documents to semantic HTML. Some wishes/requirements:
• Semantic HTML, such that headings in the document become <h1>, <h2> etc., tables become <table> and so forth. It should preferably handle headings, lists, tables and images. Graphs and math formulas would be a nice extra.
• Doesn't have to be converted straight from doc/docx to HTML; it could use an intermediary format, such as XML or DocBook.
• Should work programmatically, and with a large number of documents.

The closest thing to a solution I've found so far is http://holloway.co.nz/docvert/index.html, but unfortunately it has quite a few bugs and a small user base, and it can't handle a lot of documents. It's more of a proof of concept.

Read the article

Writing a docx file to a BLOB using Java 1.4 inside Oracle 10g

- by Jon Renaut

I'm trying to generate a blank docx file using Java, add some text, then write it to a BLOB that I can return to our document processing engine (a custom mess of PL/SQL and Java). I have to use the 1.4 JVM inside Oracle 10g, so no Java 1.5 stuff. I don't have a problem writing the docx to a file on my local machine, but when I try to write to a BLOB, I'm getting garbage. Am I doing something dumb? Any help is appreciated. Note that in the code below, all the get[name]Xml() methods return an org.w3c.dom.Document.

    public void save(String fileName) throws Exception {
        ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(fileName));
        addEntry(zos, getDocumentXml(), "word/document.xml");
        addEntry(zos, getContentTypesXml(), "[Content_Types].xml");
        addEntry(zos, getRelsXml(), "_rels/.rels");
        zos.flush();
        zos.close();
    }

    public java.sql.BLOB save() throws Exception {
        java.sql.Connection conn = DbUtilities.openConnection();
        BLOB outBlob = BLOB.createTemporary(conn, true, BLOB.DURATION_SESSION);
        outBlob.open(BLOB.MODE_READWRITE);
        ZipOutputStream zos = new ZipOutputStream(outBlob.setBinaryStream(0L));
        addEntry(zos, getDocumentXml(), "word/document.xml");
        addEntry(zos, getContentTypesXml(), "[Content_Types].xml");
        addEntry(zos, getRelsXml(), "_rels/.rels");
        zos.flush();
        zos.close();
        return outBlob;
    }

    private void addEntry(ZipOutputStream zos, Document doc, String fileName) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(baos));
        ZipEntry ze = new ZipEntry(fileName);
        byte[] data = baos.toByteArray();
        ze.setSize(data.length);
        zos.putNextEntry(ze);
        zos.write(data);
        zos.flush();
        zos.closeEntry();
    }

Read the article

Manipulate Docx with C# without Microsoft Word installed with Open XML SDK

Use C# and the Open XML SDK to manipulate docx files without Microsoft Office installed

Read the article

Oracle Tutor: * CAUTION to Word .docx Users *

- by [email protected]

Microsoft released security update KB969604 for Office 2007 around June 2009. This update causes document variables within Word docx files to be scrambled, and it might still be pushed out via Office 2007 updates.

DO NOT save files as docx using MS Office 2007 until you apply MS hotfix #970942, available here.

If you are using Windows XP with Office 2003 or Office 2000 and have installed an older Office 2007 compatibility pack, saving documents as docx may also scramble the document variables. Installing the 2007 compatibility pack published on 1/6/2010 (version 4) will prevent the document variables from becoming corrupt. Those on Windows 2000 may not be able to install the latest compatibility pack, or the compatibility pack may not function properly. This situation will hopefully be rectified in the coming months.

What is a document variable?

Document variables store data inside the document, invisible to the user. The Tutor software uses them when converting the document to HTML and when creating the flowchart, to name just a couple of uses.

How will you know if a document's variables are scrambled?

The difficulty in diagnosing the issue is that the symptoms can take myriad forms. There isn't a single error message or a single feature that one can point to and say, "test for the problem by doing this." The best clue is any string in an error message that contains garbage characters, question marks, XML code snippets, or just nonsense, such as "Language ?????????????xlr;lwlerkjl could not be found." It is also possible to see the corrupted data in the footers of the Word docs. However, footers that look correct do not guarantee that the document variables are intact. The corruption does not occur in every document variable in the document, just some of them; often it is less than a quarter of them.

What is the difference between docx files and doc files?

Office 2007 uses Office Open XML formats with .docx and .docm filename extensions.
- Docx is an Office Open XML Word document.
- Docm is a macro-enabled Office Open XML document.
This means the file structure behind the scenes is quite different from the binary file formats used prior to Office 2007, such as .doc, .dot, .xls, and .ppt.

Solution Summary:
- For Windows XP and Word 2007: install the hotfix, or save files as *.doc
- For Windows XP and Word 2000 and 2003: install the latest compatibility pack, or save files as *.doc
- For Windows 2000 with Word 2000 or 2003: do not use any compatibility pack; save files as *.doc

Emily Chorba
Principal Product Manager for Oracle Tutor

Read the article

Problem uploading .docx through html form

- by Mikael

I've made a simple form, with the proper enctype for uploading files. When I try to upload a .docx, everything works fine in IE 8 and Safari, but in Firefox or IE 7 or 6 I can't even click submit; nothing happens! Could this still be a server issue? It's an Apache server. Everything works fine if I choose to upload a .doc file.

    <form enctype="multipart/form-data" method="post" action="index.php">
        <input name="file" type="file" />
        <input type="submit" name="btnSubmit" value="Submit"/>
    </form>

Read the article

Does anyone know of a way to easily convert a PDF to a docx format programmatically

- by Rob

We have a couple 3rd party systems that give us PDFs. We would like to convert those PDFs for display on the web without using an Adobe product. Ideally we would like to use Silverlight to render the PDFs but are having trouble converting from a PDF to Xaml or using docx format as a middle man. There are lots of libraries that give PDFs but that is not what we need. If there is a library out there that does this, a .net lib would be preferable but we can run the conversion using the command line as well if that is an option.

Read the article

How can I use predefined formats in DOCX with POI?

- by furtelwart

I'm creating a docx generator with POI and would like to use predefined formats. Word includes several formats like Title, Heading 1..10 etc. These formats are predefined in every DOCX you create with Word. I would like to use them in my docx generator. I tried the following but the format was not applied: paragraph = document.createParagraph(); lastParagraph.setStyle("Heading1"); I also tried "heading 1", "heading1" and "Heading1" as style, but none of them worked. The API documentation doesn't show any details. I analysed a docx file created with Word 2007 and found out "Heading1" would be correct. Unfortunately, the style is not defined in the docx. Do I have to create this style manually? Can anyone point me to the correct solution?

Read the article

What does the X denote in ASPX, DOCX, XLSX, PPTX etc.?

- by Manish

Previously there were ASP, DOC, XLS, PPT etc., and now there are ASPX, DOCX, XLSX, PPTX respectively. What does the X denote in ASPX, DOCX, XLSX, PPTX etc.?

Read the article

Convert Docx or Odt to Pdf

- by luxifer

Hi there, I need to find a way to convert docx or odt files to PDF on a Linux web server. For obvious reasons I'm not willing to install OpenOffice.org there. I've tried Google but it failed me, so I'm here :-) I can't imagine there's no other solution to this problem than installing a huge chunk of binaries, given that a) there are (or at least should be) lots of packages which can read docx or at least odt, and b) there are as many packages which can write PDF files. What am I missing here? (scratching head) Regards, luxifer

PS edit: I don't want to use a web service, neither free nor paid.
Edit 2: at this point it would also help to convert the docx back to doc so I could use wvpdf to generate the PDF...
Edit 3: of course it would also help if I could do search and replace on a doc file in the first place, or XPS for that matter.

Read the article

Converting Creole to HTML, PDF, DOCX, ..

- by Marko Apfel

Challenge

We documented a project on GitHub using the wiki there. For most articles we used Creole as the markup language. Now we have to deliver a lot of the content to our client in a common format like PDF or DOCX. So we need an automated way to extract all relevant content, merge it together and convert it to a new format.

Problem

One of the most popular toolsets for converting between formats is Pandoc. Unfortunately, Pandoc does not support Creole (see the converting matrix).

Approach

So we need an intermediate step: converting from Creole to a format Pandoc supports. Creole/c is a Creole-to-HTML converter and does exactly what we need. After converting our Creole content to HTML we can use Pandoc for all the subsequent tasks.

Solution

Getting the Creole stuff

First of all we need the Creole content on our local machines. This is easy: because the GitHub wiki is itself a Git repository, we can clone it to our machine. In the working copy we now see all the files, and the suffix tells us the markup language.

Converting and Merging Creole content to Html

Because we want the content of several Creole files in one HTML file, we have to convert and merge all the input files into one output file. Creole/c has an option (-b) to generate only the HTML that goes inside the <body> tag, and this is our hook. We manually create the leading HTML tags (<html>, <head>, ..), then we merge all the needed Creole content into our output file, and last we add the closing tags.

This can be done in a straightforward way with a little bit of old DOS magic:

    REM === Generate the intro tags ===
    ECHO ^<html^> > %TMP%\output.html
    ECHO ^<head^> >> %TMP%\output.html
    ECHO ^<meta name="generator" content="creole/c"^> >> %TMP%\output.html
    ECHO ^</head^> >> %TMP%\output.html
    ECHO ^<body^> >> %TMP%\output.html

    REM === Mix in all interesting Creole stuff with creole/c ===
    .\Creole-C\bin\creole.exe -b .\..\datamodel+overview.creole >> %TMP%\output.html
    .\Creole-C\bin\creole.exe -b .\..\datamodel+domain+CvdCaptureMode.creole >> %TMP%\output.html
    .\Creole-C\bin\creole.exe -b .\..\datamodel+domain+CvdDamageReducingActivity.creole >> %TMP%\output.html
    .\Creole-C\bin\creole.exe -b .\..\datamodel+lookup+IncidentDamageCodes.creole >> %TMP%\output.html
    .\Creole-C\bin\creole.exe -b .\..\datamodel+table+Attachments.creole >> %TMP%\output.html
    .\Creole-C\bin\creole.exe -b .\..\datamodel+table+TrafficLights.creole >> %TMP%\output.html

    REM === Generate the outro tags ===
    ECHO ^</body^> >> %TMP%\output.html
    ECHO ^</html^> >> %TMP%\output.html

    REM === Convert the Html file to Docx with Pandoc ===
    .\Pandoc\bin\pandoc.exe -o .\Database-Schema.docx %TMP%\output.html

Some explanation for this

- The first ECHO call creates the file: the opening <html> tag is sent via > to a temporary working file. All following calls append content to the existing file via >>.
- The tag characters < and > must be escaped. This is done with the caret sign (^).
- We use a file in the default temporary folder (%TMP%) to avoid writing into our current folders (better for continuous integration).
- Both toolsets (Creole/c and Pandoc) are copied to a versioned tools folder in the wiki. This can be committed and causes no problem after pushing: GitHub does not do anything with it. The batch file for all the steps (Export-Docx.bat) is in this folder too.
- Pandoc recognizes the conversion from the suffixes of the file names, so it is enough to specify only the input and output files.
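The assembly step of the batch file (intro tags, converted fragments, outro tags) can also be sketched as a small Python function. This is an illustration under assumptions rather than part of the original toolchain: wrap_html is a hypothetical name, and the fragments are assumed to be the captured output of the creole -b calls, obtained by whatever means suits the build.

```python
def wrap_html(body_fragments, generator="creole/c"):
    """Join HTML body fragments into one complete HTML document.

    `body_fragments` is an iterable of strings, each holding the markup
    that belongs between <body> and </body> for one wiki page.
    """
    intro = [
        "<html>",
        "<head>",
        '<meta name="generator" content="%s">' % generator,
        "</head>",
        "<body>",
    ]
    outro = ["</body>", "</html>"]
    return "\n".join(intro + list(body_fragments) + outro) + "\n"
```

Writing the returned string to a temporary .html file gives Pandoc the same kind of input as the %TMP%\output.html file produced by the batch script.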

Read the article

Fill Mergefields in .docx Documents without Microsoft Word

Utility class for filling mergefields (loose fields and tabular data) in a Microsoft Word (docx) template document, without needing Microsoft Word itself

Read the article

Is there any java library (maybe poi?) which allows to merge docx files?

- by Roman

I need to write a java application which can merge docx files. Any suggestions?

Read the article

Configuring Full-Text Search for pdf and docx files

- by Lukasz Kurylo

I think it was in May that I was building a small filters module based on Full-Text Search (FTS). I configured my dev machine, then two testing servers (used in our company for internal testing before we deploy to the client), and then the client's testing server. Until last week the build was still on the testing server; finally we got feedback that we could deploy it to production. I'll only say that I lost half a day because I couldn't correctly remember what I had done to configure FTS on the previous servers, and I had no notes. I foolishly trusted my memory. Lesson learned.

For future reference, here are the steps to configure FTS for searching in *.pdf and *.docx files (and, by the way, in other Office files like *.xlsx):

1. From the page (link), download and install the *.pdf IFilter for FTS.
2. Add the path of the directory where you installed the plugin to the global PATH system variable. The default for this version is: C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin
3. From the page (link), download FilterPackx64.exe and install it.
4. From SSMS, execute the following procedures:
   - sp_fulltext_service 'load_os_resources', 1
   - sp_fulltext_service 'verify_signature', 0
5. Restart the server.
6. Check that the plugins are visible:
   - select document_type, path from sys.fulltext_document_types where document_type = '.pdf'
   - select document_type, path from sys.fulltext_document_types where document_type = '.docx'
7. If we see a result, we can assume that everything is OK*.
8. Now we can create an FTS catalog and indexes on the appropriate columns.

*I lost many hours finding out why the *.pdf plugin wasn't indexing any file in the database even though sys.fulltext_document_types had a row for it. After deeper investigation I found that the *.pdf files actually were being indexed, but only the EOF sign was added to the index for each file and nothing more. In the end the problem was that I had forgotten to include the trailing \bin in the plugin path in the PATH variable.

Read the article

How can I programmatically convert Word doc or docx files into text files?

- by CheeseConQueso

I need a way to convert .doc or .docx extensions to .txt without installing anything. I also don't want to have to manually open Word to do this obviously. As long as it's running on auto. I was thinking that either Perl or VBA could do the trick, but I can't find anything online for either. Any suggestions?
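For the .docx half, one option that needs nothing beyond a stock Python installation (an assumption about the environment, since the question mentions only Perl and VBA) is to read word/document.xml from the archive and collect its text runs. The sketch below uses a hypothetical function name, docx_to_text. It keeps only paragraph text from the main document body, ignoring headers, footers, tabs and line breaks, and it does not handle binary .doc files.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's {uri}tag notation.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path_or_file):
    """Return the body text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is spread over one or more <w:t> elements.
        paragraphs.append("".join(node.text or "" for node in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)
```

Writing the result to a file with the same base name and a .txt extension, for each path in a directory walk, would make the conversion run unattended.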

Read the article

Java POI 3.6 XWPF usage guidelines (reading content of docx file)

- by Mr CooL

I assume the following classes should be used to read the contents of a DOCX file: XWPFDocument and XWPFWordExtractor. However, the compiler complains that the required libraries are missing from the classpath. I'm rather lost because I don't know which jar file is the right one to include; there are so many jar files in the POI libraries. My project involves reading doc and docx files. I've managed to read the contents of a doc file, but I'm still having problems with docx. Can anyone show me the code and the libraries (jar files) needed to read the content of a docx file? I'm trying to limit the libraries added to the project, since I only need to read doc and docx. The following works for doc:

    fs = new POIFSFileSystem(new FileInputStream(fileName));
    HWPFDocument doc = new HWPFDocument(fs);
    WordExtractor we = new WordExtractor(doc);
    String[] p = we.getParagraphText();

Read the article

Converting DOCX files to PDF via SSH without losing formatting

- by Reado

I'm struggling to find a solution that will allow me to convert a DOCX file to a PDF without losing or malforming the formatting of the document on CentOS 5.7. I have tried CUPS-PDF but it doesn't work; spool files appear in the /var/spool folder but nothing happens after that. OpenOffice and LibreOffice converted a DOCX to PDF but the formatting was all wrong. However if I print the DOCX to a Windows PDF printer from my Windows 7 workstation, it outputs to PDF absolutely fine. So why can't Linux do the same? I tried to print via CUPS to the Windows PDF printer (shared) but the document appears in the queue as "Remote Downlevel Document" and doesn't print. This only happens when I print from Linux.

Search Results

Search found 260 results on 11 pages for 'docx'.
