pdfmark for docinfo metadata in pdf is not accepting accented characters in Keywords or Subject

Posted by rpilkey on Stack Overflow
Published on 2010-06-09T21:17:41Z

I am using a program to insert metadata into PostScript files, which are then distilled to PDF with Adobe Distiller. I am using this code, which I took from Thomas Merz's "Web Publishing with Acrobat/PDF":

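% Define pdfmark as a no-op (cleartomark) on interpreters that lack it
/pdfmark where {pop} {userdict /pdfmark /cleartomark load put} ifelse

% Document information entries
[ /Title (mot accenté)
  /Author (mot accenté)
  /Subject (mot accenté)
  /Keywords (mot accenté)
/DOCINFO pdfmark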

When you look at the metadata of the resulting PDF, the accented characters turn into "?" in the Subject and Keywords fields, but not in the Title and Author fields. The character is the same byte in every field: 233 (0xE9, which is é in Latin-1).

I tried replacing the characters with the octal escape (\351), which gave the same result: Title and Author are okay, Subject and Keywords are messed up.
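
For reference, the octal attempt amounts to something like the following. This is a reconstruction from the description above, not a copy of the actual file; only the escape differs from the original block.

```postscript
% Reconstruction of the octal attempt: \351 is byte 233 (é in Latin-1)
[ /Title (mot accent\351)
  /Author (mot accent\351)
  /Subject (mot accent\351)
  /Keywords (mot accent\351)
/DOCINFO pdfmark
```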

The file encoding is Latin-1, with Unix line endings.

I found a mention of this on the Adobe forums, but the answer didn't make sense to me:

http://forums.adobe.com/message/1165593

I changed the file encoding to UTF-8 and inserted the character directly as a code point (in Vim: <Ctrl-v>u00e9), with no change. I also tried inserting a BOM in a few places, but that didn't work either.
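
For comparison, the pdfmark documentation describes DOCINFO text values as either PDFDocEncoding strings or Unicode strings, the latter being UTF-16BE starting with the bytes FE FF. A UTF-8 BOM is a different byte sequence. The sketch below shows "mot accenté" written in that form as a hex string. It assumes Distiller 9 honors this form for Subject and Keywords, which has not been confirmed here.

```postscript
% Sketch only: "mot accenté" as UTF-16BE with a leading FEFF byte order mark.
% Whitespace inside < > hex strings is ignored by the interpreter.
[ /Subject  <FEFF 006D 006F 0074 0020 0061 0063 0063 0065 006E 0074 00E9>
  /Keywords <FEFF 006D 006F 0074 0020 0061 0063 0063 0065 006E 0074 00E9>
/DOCINFO pdfmark
```

Each character takes two bytes in this form, so é (code point U+00E9) becomes 00E9 rather than the single byte E9.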

This is with Acrobat Pro 9. I didn't notice this problem with Acrobat Pro 7.

Does anybody know of a workaround to get the accented characters into ALL the metadata fields when modifying a PostScript file, or can anyone tell me if I'm doing it wrong?

It seems weird that different fields would not accept the same bytes.
