Office Automation: What does destroy my encoding?
- by Filburt
I'm facing a problem with a Word Mail Merge Automation controlled by our CRM system.
The setup
Base for the Mail Merge is a Word .dot template which fires a macro on Document.New. Inside this macro I create a .Net component registered for COM.
Set myCOMObject = CreateObject("MyCOMObject")
The component pulls some data from a database and hands string values which are assigned to Word DocumentVariables.
Set someClass = myCOMObject.GetSomeClass(123)
ActiveDocument.Variables("docaddress") = someClass.GetSenderAddress(456)
All string values returned from the component are encoded in UTF-8 (codepage 1200).
What happens
The problem arises when the CRM system calls Word to perform the Mail Merge: The string values from the component are turned into UTF-8 encoded strings.
All the static text inside the template and the data pulled for the Mail Merge stay nicely encoded in UTF-16 - example the umlaut ü inside my DocumentVariables is turned into c3 b0 while it stays fc for the rest of the document (checked file in hex editor).
If I'm creating a document from a template with the same macro functionallity but without performing a Mail Merge all strings are fine; i.e. are encoded in UTF-16.
What changed
According to the CRM software vendor the encoding of the Mail Merge data export was changed to UTF-16 with the new version we're currently testing.
I found out that MS states that you'll expirience issues when the document and the Mail Merge data file encoding don't match.
What I tried
Since I'm assuming to merge with UTF-16 encoded data I added the following lines to my macro:
ActiveDocument.TextEncoding = msoEncodingWestern
ActiveDocument.SaveEncoding = msoEncodingUnicodeLittleEndian
This is what the Mail Merge data document specifies in its document properties.