Java POI 3.6 XWPF usage guidelines (reading content of docx file)

Posted by Mr CooL on Stack Overflow
Published on 2010-03-09T05:19:42Z Indexed on 2010/03/09 5:36 UTC

I assume the following classes should be used to read the contents of a DOCX file:

XWPFDocument
XWPFWordExtractor

However, the compiler complains that the required libraries are not on the classpath. I'm a bit lost as to which jar file is the right one to include, since the POI distribution ships so many of them.

My project involves reading both doc and docx files.

I've managed to read the contents of a doc file, but I'm still having trouble with docx files. Can anyone show which code and which libraries (jar files) are needed to read the content of a docx file?

I'm trying to limit the number of libraries added to the project, since I only need to read doc and docx files.

The following works for doc:

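import java.io.FileInputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(fileName)); // open the OLE2 container
HWPFDocument doc = new HWPFDocument(fs); // HWPF classes live in the POI scratchpad jar
WordExtractor we = new WordExtractor(doc);
String[] p = we.getParagraphText(); // one array entry per paragraph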
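
For comparison on the docx side, here is a rough sketch that does not use XWPF at all. It relies on the fact that a docx file is a ZIP archive whose main body is stored in word/document.xml, and it reads that entry with JDK classes only. This is not the POI approach asked about above. The class name DocxTextSketch and the method readParagraphs are made up for illustration, and the sketch only covers the main body: it ignores headers, footers, footnotes, tabs and line breaks, and text-box paragraphs nested inside other paragraphs would be counted twice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of POI): pulls paragraph text out of a docx
// using only classes that ship with the JDK.
final class DocxTextSketch {

    // Namespace of WordprocessingML elements such as w:p and w:t.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns one string per w:p element found in word/document.xml.
    static List<String> readParagraphs(InputStream docx) throws Exception {
        ZipInputStream zip = new ZipInputStream(docx);
        try {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return parseBody(zip);
                }
            }
        } finally {
            zip.close();
        }
        throw new IOException("word/document.xml not found; is this a docx file?");
    }

    private static List<String> parseBody(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for the *NS lookups below
        Document dom = factory.newDocumentBuilder().parse(xml);

        List<String> paragraphs = new ArrayList<String>();
        NodeList ps = dom.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < ps.getLength(); i++) {
            // A paragraph's visible text is split across w:t elements in its runs.
            NodeList ts = ((Element) ps.item(i)).getElementsByTagNameNS(W_NS, "t");
            StringBuilder text = new StringBuilder();
            for (int j = 0; j < ts.getLength(); j++) {
                text.append(ts.item(j).getTextContent());
            }
            paragraphs.add(text.toString());
        }
        return paragraphs;
    }
}
```

Calling DocxTextSketch.readParagraphs(new FileInputStream(fileName)) would give a list comparable to the String[] from getParagraphText() above, although the exact text may differ from what WordExtractor produces for the same content.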

© Stack Overflow or respective owner
