Process xml-like log file queue

Posted by Zsolt Botykai on Stack Overflow
Published on 2010-05-04T08:44:08Z Indexed on 2010/05/04 8:48 UTC

Filed under: Xml | processing | large-files | python | xmlstarlet

Hi all,

First of all: I'm not a programmer and never was, although I have learned a lot during my professional career as a support consultant.

Now my task is to process a constantly written and rapidly growing XML-like log file and create some statistics about it. It's not valid XML because it does not have a proper <root> element; the log looks like this:

<log itemdate="somedate">
  <field id="0" />
  ...
</log>

<log itemdate="somedate+1">
  <field id="0" />
  ...
</log>

<log itemdate="somedate+n">
  <field id="0" />
  ...
</log>

For example, I have to count all the items with field id=0. But most of the solutions I have found (e.g. using XPath) report an error about the garbage after the first closing </log>.
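For illustration, the failure can be reproduced in a few lines. This is only a sketch: it assumes Python 3 and the standard xml.etree.ElementTree module (the tools that actually reported the error are not named above), and SAMPLE and parses_as_document are made-up names for this example.

```python
import xml.etree.ElementTree as ET

# Two top-level <log> records, shaped like the excerpt above (values are made up).
SAMPLE = (
    '<log itemdate="d1"><field id="0" /></log>\n'
    '<log itemdate="d2"><field id="0" /></log>'
)


def parses_as_document(text):
    """Return True if text is a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised here because content follows the first closing </log>.
        return False


print(parses_as_document(SAMPLE))                          # False
print(parses_as_document("<root>" + SAMPLE + "</root>"))   # True
```

A well-formed XML document has exactly one top-level element, so everything after the first </log> is rejected unless the records are wrapped in a single root.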

I can most likely use Python (2.6, although I could compile 3.x as well) or a really old Perl version (5.6.x), plus a recently compiled xmlstarlet, which looks really promising: I was able to create the statistics for a certain period after copying the file and adding an opening and closing root element around its contents. But this is a huge file, and copying it takes time as well. Isn't there a better solution?

Thanks in advance!
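One possible approach, sketched here under assumptions rather than offered as a definitive answer: feed a synthetic opening root tag to an incremental parser before the file's own bytes, so the root only ever exists in memory and the file is never copied. The sketch assumes Python 3.4 or newer (for xml.etree.ElementTree.XMLPullParser, which Python 2.6 does not have), a UTF-8 compatible file, and the tag and attribute names from the excerpt above. The function name count_fields, its parameters, and the path in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def count_fields(stream, field_id="0", chunk_size=64 * 1024):
    """Count <field id=field_id> elements in a rootless stream of <log> records.

    stream must be a binary file-like object (for example open(path, "rb")).
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    parser.feed(b"<root>")  # synthetic root; it is never written to disk
    root = None
    count = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                if root is None:
                    root = elem  # the first start event is the synthetic root
            elif elem.tag == "field" and elem.get("id") == field_id:
                count += 1
            elif elem.tag == "log":
                # A record is finished: detach parsed records from the root
                # so memory use does not grow with the size of the file.
                root.clear()
    # The parser is deliberately never closed: a record that is still being
    # written at the end of the file is left pending instead of raising.
    return count
```

Usage might look like `with open("app.log", "rb") as f: print(count_fields(f))`, where app.log stands in for the real log path. The file is read once in fixed-size chunks, so no copy of the log is made and no closing root tag is needed.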

© Stack Overflow or respective owner
