I'm writing a data gathering and reporting application that reads XML files as input, processes them, and stores the results in a strongly typed database. For example, an XML file for a "Job" might look like this:
<Data type="Job">
  <ID>12345</ID>
  <JobName>MyJob</JobName>
  <StartDate>04/07/2012 10:45:00 AM</StartDate>
  <Files>
    <File name="a.jpeg" path="images\" />
    <File name="b.mp3" path="music\mp3\" />
  </Files>
</Data>
I'd like to use a schema to define a standard format for these input files (one per type of data, for example "Job", "User", or "View"), but I don't want validation to fail if extra data is provided.
For example, perhaps a Job has additional properties such as "IsAutomated", "Requester", "EndDate", and so on. I don't particularly care about these extra properties. If they are included in the XML, I'll simply ignore them when I'm processing the XML file, and I'd like validation to do the same, without having to include in the schema every single possible property that a customer might provide me with.
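To make the requirement concrete, here is a rough sketch of the kind of schema I have in mind, assuming XSD 1.0 and its xs:any wildcard with processContents="lax". The element types are my own guesses (xs:int for ID, and xs:string for StartDate because the sample date is not in xs:dateTime format), and as far as I can tell this form only tolerates extra elements that appear after Files, which may be too restrictive.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the element types below are assumptions, not a confirmed format. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Data">
    <xs:complexType>
      <xs:sequence>
        <!-- Properties I actually process; all required, in this order. -->
        <xs:element name="ID" type="xs:int"/>
        <xs:element name="JobName" type="xs:string"/>
        <!-- Kept as a string: "04/07/2012 10:45:00 AM" is not a valid xs:dateTime. -->
        <xs:element name="StartDate" type="xs:string"/>
        <xs:element name="Files">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="File" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:attribute name="name" type="xs:string" use="required"/>
                  <xs:attribute name="path" type="xs:string" use="required"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <!-- Any further elements (IsAutomated, Requester, EndDate, ...) are accepted;
             with "lax", elements that have no global declaration are not validated. -->
        <xs:any minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
      </xs:sequence>
      <xs:attribute name="type" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The known elements are all required here so that the trailing wildcard cannot be confused with an optional declared element, which I understand XSD 1.0 would reject as ambiguous.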
Is there a standard way of providing such a schema, or of allowing such a general XML file that can still be validated, without resorting to something as naïve (and potentially difficult to deal with) as the following?
<Data type="Job">
  <Data name="ID">12345</Data>
  . . .
  <Data name="Files">
    <Data name="File">
      <Data name="Filename">a.jpeg</Data>
      <Data name="path">images</Data>
      . . .
    </Data>
  </Data>
</Data>