I am tasked with setting up conditional profiling - a method of tagging chunks of XML with an attribute whose value is then used as a condition to extract a subset of that XML.
Have a look at another definition/example: DITA profiling
The XML represents documents that are equivalent to printed books - i.e. documents that are often read by a human, even if indirectly.
Therefore I am looking at a few requirements here:
1. Keep the value list brief, so it doesn't hurt the readability of the document
2. Be processable with standard XML tools - a space-separated list inside an attribute is probably still fine, but I'd rather not rely on too much regexp for this
3. Make it obvious to various users, including 3rd parties, which content goes where
4. Be easy to maintain going forward
Therefore one easy solution is to state every applicable value explicitly, as a space-separated list inside a profiling attribute.
The problems with this:
1. As the list grows, the value of the attribute gets verbose
2. One needs to explicitly state every value, even in a scenario of "this one vs everything else"
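
To make the explicit-list approach concrete, here is a minimal sketch using only Python's standard library. The attribute name `profile`, the element names and the values `web`, `kindle` and `print` are assumptions made up for the illustration, as is the rule that an element without the attribute goes to every output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are assumptions.
SAMPLE = """<book>
  <para>Shared by every output.</para>
  <para profile="web kindle">Only for screen outputs.</para>
  <para profile="print">Only for the printed edition.</para>
</book>"""


def filter_profile(elem, target, attr="profile"):
    """Remove, in place, every descendant whose value list does not name target."""
    for child in list(elem):  # copy the child list, since we remove while iterating
        values = child.get(attr)
        # str.split() with no argument splits on any run of whitespace,
        # so no regexp is needed for a space-separated list.
        if values is not None and target not in values.split():
            elem.remove(child)
        else:
            filter_profile(child, target, attr)


root = ET.fromstring(SAMPLE)
filter_profile(root, "kindle")
kept = [p.text for p in root.iter("para")]
print(kept)
```

The whole test is a whitespace split plus a membership check, which is what keeps this option friendly to standard tools. The cost is the one listed above: every element has to name every output it belongs to.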
Therefore I am also looking at other approaches such as:
1. Using + and - modifiers, Apache htaccess style, to override the default cascading of profiling - by default all content goes everywhere, and if we want to exclude a bit we just say "-kindle". It does require parsing the whole tree, is not supported by editing tools, and one needs to regexp the attribute value a bit deeper...
2. Using an intermediate file to define groups of values such as "other" or "non-print" (there is an example of this in DITA). It allows concise XML as well as different groupings and values for each document, but it does add a level of abstraction which may make it a little less obvious for a 3rd party?
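
As a rough comparison of what a consumer would have to implement for each alternative, here is a sketch of both matching rules. Everything in it is an assumption: the function names, the `GROUPS` table (which stands in for the intermediate file) and, above all, the semantics chosen for the modifiers, which is only one possible reading. It also skips inheritance from ancestor elements, which is the part that needs the whole tree.

```python
def matches_modifiers(value, target):
    """One possible reading of +/- modifiers; no attribute means 'everywhere'."""
    tokens = (value or "").split()
    denied = {t[1:] for t in tokens if t.startswith("-")}
    allowed = {t.lstrip("+") for t in tokens if not t.startswith("-")}
    if target in denied:
        return False
    # An empty allow-list keeps the default: content goes everywhere.
    return not allowed or target in allowed


# Hypothetical group definitions; in practice these would be loaded
# from the separate, per-document intermediate file.
GROUPS = {
    "non-print": {"web", "kindle"},
    "other": {"web"},
}


def matches_groups(value, target, groups=GROUPS):
    """Expand group names to their member values, then test membership."""
    if value is None:
        return True  # unprofiled content goes everywhere
    expanded = set()
    for token in value.split():
        # A token that is not a group name stands for itself.
        expanded |= groups.get(token, {token})
    return target in expanded


print(matches_modifiers("-kindle", "print"))   # excluded from kindle only
print(matches_groups("non-print", "kindle"))   # kindle is in the group
```

In this sketch neither rule needs a regexp, but both need knowledge that is not in the XML itself: the modifier semantics in one case and the group file in the other. That is the part a 3rd party would have to be told about.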
Altogether, if you received such XML and were tasked with processing it, which option would you rather receive?
If you have any experience with something like this, even in unrelated areas such as builds, don't hesitate to comment!