Process xml-like log file queue
- by Zsolt Botykai
Hi all,
First of all: I'm not a programmer and never was, although I have learned a lot during my professional career as a support consultant.
Now my task is to process, and create some statistics about, a constantly written and rapidly growing XML-like log file. It's not valid (well-formed) XML, because it does not have a single enclosing <root> element. The log looks like this:
<log itemdate="somedate">
<field id="0" />
...
</log>
<log itemdate="somedate+1">
<field id="0" />
...
</log>
<log itemdate="somedate+n">
<field id="0" />
...
</log>
For example, I have to count all the items with field id=0. But most of the solutions I have found (e.g. using XPath) report an error about the garbage after the first closing </log>.
Most probably I can use Python (2.6, although I can compile 3.x as well) or some really old Perl version (5.6.x). I also have a recently compiled xmlstarlet, which looks really promising: I was able to create the statistics for a certain period after copying the file and adding the opening root element before the content and the closing one after it. But this is a huge file, and copying takes time as well. Isn't there a better solution?
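One possible way to avoid the copy is to stream the file through an incremental parser and supply the missing root element in memory. The following is only a minimal sketch, not a tested solution for the real log. It assumes that "items with field id=0" means counting every <field> element whose id attribute is "0", that the file has no XML declaration at the top, and that a half-written record at the end of the file can be ignored. The function name count_field_id and its parameters are made up for illustration. It uses the standard xml.parsers.expat module, which should be available in both Python 2.6 and 3.x.

```python
import xml.parsers.expat


def count_field_id(path, wanted_id="0", chunk_size=1 << 20):
    """Count <field> elements whose id attribute equals wanted_id.

    The log file is read in chunks and never copied; a synthetic
    <root> wrapper is fed to the parser before and after the data.
    """
    counts = {"n": 0}  # mutable holder so the handler can update it

    def start_element(name, attrs):
        # attrs is a dict of attribute name -> value
        if name == "field" and attrs.get("id") == wanted_id:
            counts["n"] += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element

    # Open the artificial root element (exists only in memory).
    parser.Parse(b"<root>", False)

    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            parser.Parse(chunk, False)

    try:
        # Close the artificial root and finish parsing.
        parser.Parse(b"</root>", True)
    except xml.parsers.expat.ExpatError:
        # Assumption: the writer was in the middle of a record, so the
        # tail of the file is incomplete. Everything before it has
        # already been counted.
        pass

    return counts["n"]
```

Calling count_field_id("/path/to/logfile") would return the count for id "0". Because the parser works on chunks, memory use stays roughly constant regardless of file size. If the count should instead be per <log> item (at most one per item), the handler would need to track the enclosing <log> element as well.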
Thanks in advance!