I'm working on a level editing tool that saves its data as XML.
This is ideal during development, as it's painless to make small changes to the data format, and it works nicely with tree-like data.
The downside, though, is that the XML files are rather bloated, mostly due to duplication of tag and attribute names. Also due to numeric data taking significantly more space than using native datatypes. A small level could easily end up as 1Mb+. I want to get these sizes down significantly, especially if the system is to be used for a game on the iPhone or other devices with relatively limited memory.
The optimal solution, for memory and performance, would be to convert the XML to a binary level format. But I don't want to do this. I want to keep the format fairly flexible. XML makes it very easy to add new attributes to objects, and give them a default value if an old version of the data is loaded. So I want to keep with the hierarchy of nodes, with attributes as name-value pairs.
But I need to store this in a more compact format - to remove the massive duplication of tag/attribute names. Maybe also to give attributes native types, so, for example floating-point data is stored as 4 bytes per float, not as a text string.
Google/Wikipedia reveal that 'binary XML' is hardly a new problem - it's been solved a number of times already. Has anyone here got experience with any of the existing systems/standards? - are any ideal for games use - with a free, lightweight and cross-platform parser/loader library (C/C++) available?
Or should I reinvent this wheel myself?
Or am I better off forgetting the ideal, and just compressing my raw .xml data (it should pack well with zip-like compression), and just taking the memory/performance hit on-load?