Process xml-like log file queue
Posted
by Zsolt Botykai
on Stack Overflow
Published on 2010-05-04T08:44:08Z
Indexed on
2010/05/04
8:48 UTC
Hi all,
First of all: I'm not a programmer and never was, although I have learned a lot during my professional career as a support consultant.
Now my task is to process, and create some statistics about, a constantly written and rapidly growing XML-like log file. It's not well-formed XML, because it does not have a proper <root> element. The log looks like this:
<log itemdate="somedate">
<field id="0" />
...
</log>
<log itemdate="somedate+1">
<field id="0" />
...
</log>
<log itemdate="somedate+n">
<field id="0" />
...
</log>
For example, I have to count all the items with field id=0. But most of the solutions I have found (e.g. using XPath) report an error about the garbage after the first closing </log>.
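To make the failure concrete, here is a minimal sketch that reproduces it with Python's standard xml.etree.ElementTree module. The two-item string is a made-up stand-in for the real log, and ET.ParseError assumes Python 2.7 or 3.x (on 2.6 the parser raises a different exception class).

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item sample: two top-level <log> elements, no root.
two_items = (
    '<log itemdate="a"><field id="0" /></log>\n'
    '<log itemdate="b"><field id="0" /></log>'
)

message = None
try:
    ET.fromstring(two_items)
except ET.ParseError as err:
    # The parser stops at the second <log>: a document may have only one root.
    message = str(err)

print(message)
```

The message names "junk after document element", which is the parser's way of saying that everything after the first closing </log> lies outside the single permitted root.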
Most probably I can use Python (2.6, although I can compile 3.x as well) or a really old Perl version (5.6.x). I also recently compiled xmlstarlet, which looks really promising: I was able to create the statistics for a certain period after copying the file and adding the opening and closing root elements around its contents. But this is a huge file, and copying takes time as well. Isn't there a better solution?
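One possible direction, sketched here under assumptions rather than offered as a definitive answer: feed a synthetic opening root tag to an incremental parser and then stream the file through it in chunks, so nothing is copied or rewritten on disk. The sketch assumes Python 3.4 or later (xml.etree.ElementTree.XMLPullParser does not exist in 2.6); the function name count_field_id, its parameters, and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def count_field_id(stream, wanted_id="0", chunk_size=64 * 1024):
    """Count <field id=wanted_id> elements in a root-less stream of <log> items.

    `stream` must be opened in binary mode.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    parser.feed(b"<root>")  # synthetic root, exists only in memory
    root = None
    count = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                if root is None:
                    root = elem  # the first start event is the synthetic root
                continue
            if elem.tag == "field" and elem.get("id") == wanted_id:
                count += 1
            elif elem.tag == "log":
                root.clear()  # drop finished items so memory stays bounded
    # The parser is never closed, so a missing </root> is not an error.
    return count


# Made-up sample data standing in for the real log file.
sample = (
    b'<log itemdate="a">\n<field id="0" />\n<field id="1" />\n</log>\n'
    b'<log itemdate="b">\n<field id="0" />\n</log>\n'
)
print(count_field_id(io.BytesIO(sample)))
```

For the real log, the same function would take a file object opened with mode "rb". Because the parser is never closed, an item that is only half written at the end of the file does not raise an error; its completed fields are counted, and the unfinished remainder is ignored.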
Thanks in advance!
© Stack Overflow or respective owner