How to obtain the root of a tree without parsing the entire file?

Posted by Matt. on Stack Overflow
Published on 2011-03-01T22:05:16Z Indexed on 2011/03/01 23:25 UTC


I'm writing an XML parser to handle reports from different tools. Each tool generates its report with different tags.

For example:

Arachni generates an XML report with <arachni_report></arachni_report> as the root tag.

nmap generates an XML report with <nmaprun></nmaprun> as the root tag.

I don't want to parse the entire file unless it is a valid report from one of the tools I support.

My first thought was to use ElementTree: parse the entire XML file (assuming it contains valid XML), then check the root tag to see whether the report belongs to Arachni or nmap.

I'm currently using cElementTree, and as far as I know getroot() is not an option here, since it is only available once the whole file has been parsed. My goal is for the parser to work only on recognized files, without parsing unnecessary ones.
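To make the goal concrete, here is a rough sketch of what I have in mind, assuming iterparse() with "start" events yields the root element first and can be abandoned right after that. The KNOWN_ROOTS mapping and the detect_report_tool name are just placeholders of mine, and I use xml.etree.ElementTree here because cElementTree is no longer a separate module in recent Python versions.

```python
import io
import xml.etree.ElementTree as ET

# Placeholder mapping from root tag to tool name (my own naming, not from any library).
KNOWN_ROOTS = {"arachni_report": "Arachni", "nmaprun": "nmap"}


def detect_report_tool(source):
    """Return the tool name for a recognized root tag, or None.

    `source` is a binary file object. Only the first "start" event is
    consumed, so the parser reads just the first chunk of the input.
    """
    try:
        for _event, element in ET.iterparse(source, events=("start",)):
            # The first element to start is the document root.
            return KNOWN_ROOTS.get(element.tag)
    except ET.ParseError:
        # Not well-formed XML, at least in the part read so far.
        return None
    return None


# Small in-memory demo; a real report would be opened with open(path, "rb").
sample = io.BytesIO(b"<nmaprun><host/></nmaprun>")
print(detect_report_tool(sample))
```

With something like this, the full parse would only run when the function returns a tool name. Note that a root tag in an XML namespace would show up as "{namespace}tag", which this sketch does not handle.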

By the way, I'm still learning about XML parsing. Thanks in advance.
