Serializing Chinese characters with Xerces 2.6

Posted by Gianluca on Stack Overflow
Published on 2010-06-08T09:21:02Z Indexed on 2010/06/08 10:12 UTC


I have a Xerces (2.6) DOMNode object that comes from a UTF-8 encoded document. I currently read its text value like this:

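CBuffer DomNodeExtended::getText( const DOMNode* node ) const {
  // transcode() converts the XMLCh (UTF-16) string to the local code page
  char* p = XMLString::transcode( node->getNodeValue( ) );
  CBuffer xNodeText( p );
  // memory returned by transcode() must be freed with XMLString::release()
  XMLString::release( &p );
  return xNodeText;
}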

Here CBuffer is just a buffer object whose contents are later persisted as-is in a database.

This works as long as the text contains only common ASCII characters. If it contains other characters, e.g. Chinese ones, they get lost in the transcode operation.

I've searched a lot for a solution. It looks like with Xerces 3, the DOMWriter class should solve the problem. With Xerces 2.6 I'm trying to use the XMLTranscoder, but with no success yet. Could anybody help?
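
For illustration, here is a minimal sketch of the conversion the question is after. XMLString::transcode targets the local code page, which is why characters outside that code page are dropped. Xerces stores text as XMLCh, which is UTF-16, so the bytes can be produced by encoding those code units as UTF-8 directly. The helper below is hypothetical and not part of Xerces: the name utf16ToUtf8 is an assumption, and char16_t stands in for XMLCh so the example compiles without the library. Xerces' own transcoding service (XMLTransService::makeNewTranscoderFor with "UTF-8", then XMLTranscoder::transcodeTo) is meant for the same job, but its exact signatures in 2.6 should be checked against the headers.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xerces API): encodes UTF-16 code units as UTF-8.
// Assumption: char16_t stands in for Xerces' XMLCh, a 16-bit UTF-16 code unit.
std::string utf16ToUtf8(const char16_t* src, std::size_t len) {
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        unsigned long cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len &&
            src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            // Unpaired surrogate: substitute the replacement character U+FFFD.
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            // 1 byte: plain ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // 3 bytes: covers most Chinese characters
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // 4 bytes: code points that came from a surrogate pair
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

In getText this could replace the transcode call, assuming XMLCh is 16 bits wide on the platform: pass node->getNodeValue( ) together with its length from XMLString::stringLen, then build the CBuffer from the resulting bytes. Whether CBuffer and the database column accept raw UTF-8 bytes is a separate assumption to verify.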

© Stack Overflow or respective owner
