Lessons learned from Word 2007 automation with C# 2008

Posted by robertphyatt on Geeks with Blogs
Published on Thu, 18 Mar 2010 12:20:40 GMT

My organization has an ongoing project to take documents produced for internal regulations and such, change some of the formatting, and then export them as PDFs.

Our requirements were that only one person would be doing this, but it has been painfully tedious and sometimes error-prone to do by hand. Enter the fearless developer to automate the situation!

Since I am one of those guys that just plain does not like VB, I wanted to do the automation in the ever-so-much-more-familiar C#. While Microsoft has made a DLL that makes such a task easier, the documentation on MSDN is pretty lame, and most of the forums and posts on the internet had little to do with my task.

So, I feel like I can give back to the community and make a post here of the things I have learned so far. I hope this is helpful to whoever stumbles upon it.

Steps to do this:

1) First of all, make a project and decide how you will get the filename of the Word document you are trying to open. I got the filename the user wanted with an openFileDialog tied to a button that I labeled 'Browse':

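       private void btnBrowse_Click(object sender, EventArgs e)
       {
           try
           {
               DialogResult myResult = openFileDialog1.ShowDialog();
               if (myResult == DialogResult.OK)
               {
                   //compare the extension itself, ignoring case, so "REPORT.DOC" is accepted too
                   string extension = System.IO.Path.GetExtension(openFileDialog1.FileName);
                   if (extension.Equals(".doc", StringComparison.OrdinalIgnoreCase))
                   {
                       txtFileName.Text = openFileDialog1.SafeFileName;
                       paramSourceDocPath = openFileDialog1.FileName;
                       //swap only the extension; Replace(".doc", ".pdf") would also rewrite any ".doc" earlier in the path
                       paramExportFilePath = System.IO.Path.ChangeExtension(openFileDialog1.FileName, ".pdf");
                   }
                   else
                   {
                       txtFileName.Text = "Only files that end with .doc, please";
                   }
               }
           }
           catch (Exception err)
           {
               lblError.Text = err.Message;
           }
       }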
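
A note on deriving the PDF path: a plain string Replace(".doc", ".pdf") rewrites every ".doc" in the path, not just the extension, and EndsWith(".doc") is case-sensitive. Here is a small console sketch of the difference. The class name, helper names and sample paths are made up for illustration; only the System.IO.Path calls are real framework methods.

```csharp
using System;
using System.IO;

static class PdfPathSketch
{
    // Hypothetical helper: true when the file's extension is .doc, ignoring case.
    public static bool IsDocFile(string path)
    {
        return string.Equals(Path.GetExtension(path), ".doc",
                             StringComparison.OrdinalIgnoreCase);
    }

    // Hypothetical helper: the same path with only the extension changed to .pdf.
    public static string ToPdfPath(string docPath)
    {
        return Path.ChangeExtension(docPath, ".pdf");
    }

    static void Main()
    {
        string docPath = @"C:\regs.docs\policy.doc"; // made-up path

        // Replace touches the folder name as well as the extension.
        Console.WriteLine(docPath.Replace(".doc", ".pdf")); // C:\regs.pdfs\policy.pdf

        // ChangeExtension only touches the extension.
        Console.WriteLine(ToPdfPath(docPath));               // C:\regs.docs\policy.pdf

        // An upper-case extension still counts as a .doc file.
        Console.WriteLine(IsDocFile(@"C:\regs\POLICY.DOC")); // True
    }
}
```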

 

2) Add in "using Microsoft.Office.Interop.Word;" after setting your project to reference Microsoft.Office.Core and Microsoft.Office.Interop.Word so that you don't have to add "Microsoft.Office.Interop.Word" to the front of everything.

3) Now you are ready to play. You will need an instance of Word running, with the Word document you want to modify open in it, before you can make any changes.

The Word interop DLL takes nearly every parameter by ref and typed as object, which is why paramSourceDocPath has to be declared as an object rather than a string. If you don't want to specify a parameter, you have to pass "Type.Missing". I suggest creating a few objects that you reuse all over the place to maintain sanity.

object paramMissing = Type.Missing;

ApplicationClass wordApplication = new ApplicationClass();

Document wordDocument = wordApplication.Documents.Open(
               ref paramSourceDocPath, ref paramMissing, ref paramMissing,
               ref paramMissing, ref paramMissing, ref paramMissing,
               ref paramMissing, ref paramMissing, ref paramMissing,
               ref paramMissing, ref paramMissing, ref paramMissing,
               ref paramMissing, ref paramMissing, ref paramMissing,
               ref paramMissing);
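
Since the end goal is a PDF, here is a rough sketch of how the open, export and cleanup steps might fit together. It is not the code I run in production: the class and method names are made up, the export options are just one plausible set, and ExportAsFixedFormat on Word 2007 assumes the "Save as PDF or XPS" add-in (or a service pack that includes it) is installed. It needs Word and the interop reference to run.

```csharp
using System;
using Microsoft.Office.Interop.Word;

static class WordPdfExportSketch
{
    // Hypothetical wrapper: opens a .doc, exports it as PDF, then shuts Word down.
    public static void ExportToPdf(string sourceDocPath, string exportFilePath)
    {
        object paramSourceDocPath = sourceDocPath; // must be object to pass by ref
        object paramMissing = Type.Missing;

        ApplicationClass wordApplication = new ApplicationClass();
        Document wordDocument = null;
        try
        {
            wordDocument = wordApplication.Documents.Open(
                ref paramSourceDocPath, ref paramMissing, ref paramMissing,
                ref paramMissing, ref paramMissing, ref paramMissing,
                ref paramMissing, ref paramMissing, ref paramMissing,
                ref paramMissing, ref paramMissing, ref paramMissing,
                ref paramMissing, ref paramMissing, ref paramMissing,
                ref paramMissing);

            // ... formatting changes would go here ...

            wordDocument.ExportAsFixedFormat(
                exportFilePath,
                WdExportFormat.wdExportFormatPDF,
                false,                                   // do not open the PDF afterwards
                WdExportOptimizeFor.wdExportOptimizeForPrint,
                WdExportRange.wdExportAllDocument,
                0, 0,                                    // page range, unused for the whole document
                WdExportItem.wdExportDocumentContent,
                true,                                    // include document properties
                true,                                    // keep IRM settings
                WdExportCreateBookmarks.wdExportCreateNoBookmarks,
                true,                                    // document structure tags
                true,                                    // bitmap fonts that cannot be embedded
                false,                                   // no ISO 19005-1 (PDF/A)
                ref paramMissing);
        }
        finally
        {
            // Close without saving so the source .doc stays untouched, then quit Word
            // so no hidden WINWORD.EXE process is left behind.
            if (wordDocument != null)
            {
                object doNotSave = WdSaveOptions.wdDoNotSaveChanges;
                wordDocument.Close(ref doNotSave, ref paramMissing, ref paramMissing);
            }
            wordApplication.Quit(ref paramMissing, ref paramMissing, ref paramMissing);
        }
    }
}
```

The try/finally is the part I would keep whatever else changes: if an exception skips the Quit call, Word keeps running in the background.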

4) There are many ways to modify the text inside the Word document. One of the most effective for me was to go through it paragraph by paragraph and decide what to do with each paragraph based on its style.

           foreach (Paragraph thisParagraph in wordDocument.Content.Paragraphs)
           {
               string strStyleName = ((Style)thisParagraph.get_Style()).NameLocal;
               string strText = thisParagraph.Range.Text;

               //Do whatever you need to do
           }
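
To make "do whatever you need to do" a little more concrete, here is a sketch of a loop that branches on the style name. The style names "Heading 1" and "Normal" and the alignment changes are placeholders I picked for illustration (NameLocal is localized, so the names depend on your copy of Word); the class and method names are made up too. It needs Word and the interop reference to run.

```csharp
using System;
using Microsoft.Office.Interop.Word;

static class ParagraphWalkSketch
{
    // Hypothetical helper: applies a different tweak depending on paragraph style.
    public static void ProcessParagraphs(Document wordDocument)
    {
        foreach (Paragraph thisParagraph in wordDocument.Content.Paragraphs)
        {
            string strStyleName = ((Style)thisParagraph.get_Style()).NameLocal;

            // Range.Text includes the trailing paragraph mark ('\r'); trim it
            // before looking at the text itself.
            string strText = (thisParagraph.Range.Text ?? string.Empty).TrimEnd('\r');
            if (strText.Length == 0)
            {
                continue; // skip empty paragraphs
            }

            switch (strStyleName)
            {
                case "Heading 1": // placeholder style name
                    thisParagraph.Alignment = WdParagraphAlignment.wdAlignParagraphCenter;
                    break;
                case "Normal":    // placeholder style name
                    thisParagraph.Alignment = WdParagraphAlignment.wdAlignParagraphJustify;
                    break;
                default:
                    break;        // leave every other style alone
            }
        }
    }
}
```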

5) Sometimes you want to insert a line break somewhere in the text, or insert new text into the document. There are a few ways to do this. One is to modify the text of a paragraph directly. In the text, '\r' ends a paragraph and '\v' makes a new line without starting a new paragraph; if you remove a '\r', the paragraph it ended is merged into the one that follows. For example:

thisParagraph.Range.Text = "A\vNew Paragraph!\r" + thisParagraph.Range.Text;

OR

you could select the spot where you want to insert and have Word act as if a normal user were typing there. (Note: if you do not collapse the range first, you will overwrite the content the range covers.)

object oCollapseDirectionEnd = WdCollapseDirection.wdCollapseEnd;
object oCollapseDirectionStart = WdCollapseDirection.wdCollapseStart;

Range rangeInsertAtBeginning = thisParagraph.Range;

Range rangeInsertAtEnd = thisParagraph.Range;

rangeInsertAtBeginning.Collapse(ref oCollapseDirectionStart);

rangeInsertAtEnd.Collapse(ref oCollapseDirectionEnd);

rangeInsertAtBeginning.Select();

wordApplication.Selection.TypeText("Blah Blah Blah");

rangeInsertAtEnd.Select();

wordApplication.Selection.TypeParagraph();
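
A third option, which I have not used as heavily, is to skip the selection entirely and insert through the Range object. This is only a sketch: the class and method names are made up, and it needs Word and the interop reference to run. InsertBefore and InsertParagraphAfter are methods on Range in the interop library.

```csharp
using Microsoft.Office.Interop.Word;

static class RangeInsertSketch
{
    // Hypothetical helper: adds text at the front of a paragraph and an empty
    // paragraph after it, without moving the user's selection.
    public static void AddTextAroundParagraph(Paragraph thisParagraph)
    {
        Range paragraphRange = thisParagraph.Range;

        // Inserts at the start of the range; the existing text is kept.
        paragraphRange.InsertBefore("Blah Blah Blah");

        // Adds a paragraph mark after the range.
        paragraphRange.InsertParagraphAfter();
    }
}
```

Because nothing is selected, this does not depend on where the cursor happens to be, and there is no range to collapse first.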

6) If you want to make text columns, like a newspaper or newsletter, you have to modify the page layout of the document or of a section of the document. In my case, I only wanted a particular section to have columns, and I wanted a black line before and after the newspaper-like text columns. First you insert a section break on either side of the text you want in columns, then you take the resulting section and modify its page layout. After that you can modify the borders of the section (or of another object in the Word document). I also show here how to modify the alignment of a paragraph.

           object oSectionBreak = WdBreakType.wdSectionBreakContinuous;

           //These ranges were set while I was going through the paragraphs of my document, like I was showing earlier

           rangeHeaderStart.InsertBreak(ref oSectionBreak);
           rangeHeaderEnd.InsertBreak(ref oSectionBreak);

           //change the alignment to justify
           object oRangeHeaderStart = rangeStartJustifiedAlignment.Start;
           object oRangeHeaderEnd = rangeHeaderEnd.End;
           Range rangeHeader = wordDocument.Range(ref oRangeHeaderStart, ref oRangeHeaderEnd);
           rangeHeader.Paragraphs.Alignment = WdParagraphAlignment.wdAlignParagraphJustify;

           //find the section break and make it into triple text columns
           foreach (Section mySection in wordDocument.Sections)
           {
               if (mySection.Range.Start == rangeHeaderStart.Start)
               {
                   mySection.PageSetup.TextColumns.Add(ref paramMissing, ref paramMissing, ref paramMissing);
                   mySection.PageSetup.TextColumns.Add(ref paramMissing, ref paramMissing, ref paramMissing);

                   //I didn't like the default spacing and column widths. This is how I adjusted them.
                   foreach (TextColumn txtc in mySection.PageSetup.TextColumns)
                   {
                       try
                       {
                           //width of 151.6 points (same as the fallback below) with a 7 point gap after the column
                           txtc.Width = 151.6f;
                           txtc.SpaceAfter = 7;
                       }
                       catch (Exception)
                       {
                           //setting SpaceAfter can fail (presumably on the last column), so only set the width
                           txtc.Width = 151.6f;
                       }
                   }
               }
           }
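
If you are happy with equally sized columns, there may be a shorter route than adding columns one at a time: TextColumns also has a SetCount method and a Spacing property. I have not tested this as thoroughly as the loop above, so treat it as a sketch. The class and method names and the 7 point gap are made up for illustration, and it needs Word and the interop reference to run.

```csharp
using Microsoft.Office.Interop.Word;

static class ColumnSetupSketch
{
    // Hypothetical helper: turns a section into three text columns.
    public static void MakeThreeColumns(Section mySection)
    {
        TextColumns columns = mySection.PageSetup.TextColumns;

        // Arrange the section's text into three columns in one call.
        columns.SetCount(3);

        // Gap between columns, in points (assumed value).
        columns.Spacing = 7f;
    }
}
```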

That is all I have time for today! I hope this was helpful to someone!


 

© Geeks with Blogs or respective owner