Retrieve the content of a Microsoft Word document using OpenXml and C#
Posted by ybbest on YBBest
Published on Wed, 29 Jun 2011 11:20:04 +0000
One of my recent tasks required me to retrieve the contents of a Microsoft Word document (Word 2007 and above). I searched online for resources without much luck; most of the examples cover writing content to a Word document using OpenXml rather than reading it. I decided to blog this for my own reference, and hopefully people who read this post will find it useful as well.
Retrieving the contents of a Microsoft Word document using Open XML is straightforward.
1. First, download and install the Open XML SDK 2.0 for Microsoft Office. (Download link)
2. Create a Console application, then add references to DocumentFormat.OpenXml.dll and WindowsBase.dll. You can find both DLLs in the .NET tab of the Add Reference window.
3. Write the following code to grab the contents of the Word document and display them in the console window.
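As a minimal sketch of step 3 (not necessarily the exact code from the downloadable source), the program below opens a document read-only with the Open XML SDK and prints the text of each paragraph. The file name Sample.docx and the command-line argument handling are assumptions made for illustration.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class Program
{
    static void Main(string[] args)
    {
        // Hypothetical path: pass your own .docx file as the first argument,
        // otherwise "Sample.docx" in the working directory is assumed.
        string path = args.Length > 0 ? args[0] : "Sample.docx";

        // Open the package read-only (isEditable = false).
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            // The main document part holds the body of the document.
            Body body = doc.MainDocumentPart.Document.Body;

            // Walk every paragraph (including those nested in tables)
            // and print its concatenated text.
            foreach (Paragraph paragraph in body.Descendants<Paragraph>())
            {
                Console.WriteLine(paragraph.InnerText);
            }
        }
    }
}
```

InnerText concatenates the text of all runs in a paragraph, so formatting is discarded. If you only need the whole document as a single string, body.InnerText would also work, although paragraph breaks are lost that way.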
You can download the complete source code here.
References:
Getting Started with the Open XML SDK 2.0 for Microsoft Office
Walkthrough: Word 2007 XML Format
Open XML SDK 2.0 for Microsoft Office