How to extract a block of XML from a log file on Linux

Posted by dragonmantank on Stack Overflow
Published on 2010-05-14T01:55:59Z

I have a log file that looks like the following:

2010-05-12 12:23:45 Some sort of log entry
2010-05-12 01:45:12 Request XML: <RootTag>
<Element>Value</Element>
<Element>Another Value</Element>
</RootTag>
2010-05-12 01:45:32 Response XML: <ResponseRoot>
<Element>Value</Element>
</ResponseRoot>
2010-05-12 01:45:49 Another log entry

What I want to do is extract the Request and Response XML (and ultimately dump each one into its own file). I had a similar parser that used egrep, but there the XML was all on one line, not spread over multiple lines as above.

The log files are also fairly large, at 500-600 MB each. Smaller logs I would read in via a PHP script and use regex matching, but the amount of memory required for a file this size would more than likely kill the script.

Is there an easy way, using the built-in tools on a Linux box (CentOS in this case), to extract multiple lines, or am I going to have to bite the bullet and use Perl or PHP to read in the entire file to extract it?
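
For illustration, the memory concern can be avoided by streaming the log line by line and tracking whether the current line falls inside an XML block, so only one block is held in memory at a time. The sketch below uses Python rather than the PHP or Perl mentioned above. It assumes that every log entry starts with a "YYYY-MM-DD HH:MM:SS " timestamp, that a block begins on a line containing "Request XML:" or "Response XML:", and that a block ends at the next timestamped line. The names extract_xml_blocks and dump_blocks and the output file naming are hypothetical.

```python
import os
import re
from typing import Iterable, Iterator, Tuple

# Assumption: every log entry begins with a timestamp in this exact shape.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")
# Assumption: XML blocks are introduced by "Request XML:" or "Response XML:".
START = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} (Request|Response) XML: ?(.*)$"
)


def extract_xml_blocks(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (kind, xml) pairs, reading one line at a time."""
    kind = None  # "Request" or "Response" while inside a block, else None
    buf = []     # lines of the block currently being collected
    for raw in lines:
        line = raw.rstrip("\r\n")
        if TIMESTAMP.match(line):
            # A new timestamped entry closes any block in progress.
            if kind is not None:
                yield kind, "\n".join(buf)
                kind, buf = None, []
            m = START.match(line)
            if m:
                # The XML starts on the same line as the timestamp.
                kind, buf = m.group(1), [m.group(2)]
        elif kind is not None:
            # Continuation line of the current XML block.
            buf.append(line)
    # Flush a block that runs to the end of the file.
    if kind is not None:
        yield kind, "\n".join(buf)


def dump_blocks(log_path: str, out_dir: str) -> int:
    """Write each block to its own file; return the number of blocks written."""
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    # Iterating over the file object reads it lazily, line by line.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for kind, xml in extract_xml_blocks(log):
            count += 1
            name = f"{kind.lower()}_{count:06d}.xml"  # hypothetical naming scheme
            with open(os.path.join(out_dir, name), "w", encoding="utf-8") as out:
                out.write(xml + "\n")
    return count
```

Calling dump_blocks("app.log", "xml_out") (both paths are placeholders) would write one file per block. The same state machine could be expressed in awk or Perl if staying with the tools already on the box is preferred. Any XML text that itself starts a line with a timestamp would break the end-of-block assumption.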

© Stack Overflow or respective owner