generating and unmarshalling java classes while unmarshalling input contains a DTD
- by Hans Westerbeek
Hi,
For a Spring-based project, I have the following situation to solve:
I have XML files coming in whose contents I will have to parse at runtime. Those XML files come with a DTD reference.
I need to generate the classes that the unmarshaller churns out using the right at build time, using the Maven2 plugin for the unmarshalling library of choice. This is also not very hard to do, once I have generated an XSD from the DTD.
I want to use spring-oxm's UnMarshaller interface to do the unmarshalling at runtime. This I understand how to do.
The xml files come in with a DTD reference, and all unmarshalling libraries out there want to do unmarshalling based on an XSD. Now, as described in the castor documentation, I can convert the DTD to an XSD and keep it on the classpath. However, when an actual XML file comes into the system it will still have that DTD reference at the top, and there's nothing I can really do about that (except for string replacing which feels hacky in this case). Will this cause the unmarshaller, like Castor to fail?
Am I right in suspecting that this DTD reference will cause the unmarshalling to fail? Could I do pure DTD-based unmarshalling? Or can this somehow be prevented by providing detailed configuration to the unmarshaller?
Until now, I have tried castor, xmlbeans and xstream. Which would fit my purposes best? Has anyone else been in this situation? Did you also end up just doing manual DOM or SAX parsing?