How to parse large XML files on Google App Engine?

Posted by Alon Carmel on Stack Overflow See other posts from Stack Overflow or by Alon Carmel
Published on 2010-05-21T14:04:41Z Indexed on 2010/05/21 16:10 UTC

Hey, I have a fairly large XML file, about 1 MB in size, that I host on S3. I need to parse that entire XML file into my App Engine datastore.

I have written a simple DOM parser that works fine locally, but in production it hits the 30-second request deadline and stops.
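For reference, my DOM code is roughly shaped like this (a simplified sketch; the real element names, attributes, and datastore entity are different):

```python
# Simplified sketch of the DOM approach. The whole document is loaded
# into memory and every <item> turns into one datastore write, which is
# what pushes the request over App Engine's 30-second deadline.
from xml.dom.minidom import parseString

def parse_items(xml_text):
    dom = parseString(xml_text)
    names = []
    for node in dom.getElementsByTagName("item"):
        name = node.getAttribute("name")
        names.append(name)
        # on App Engine this is where the put happens, e.g.
        # ItemEntity(name=name).put()  -- one RPC per item
    return names
```

Locally, `parse_items('<root><item name="a"/><item name="b"/></root>')` returns `['a', 'b']` almost instantly; online, thousands of puts add up.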

I tried to reduce the parsing work by first downloading the XML file into a blob and then parsing the XML from the blob, but blobs are limited to 1 MB, so that fails too.

I make many inserts into the datastore, which is what causes the 30-second failure. I saw a recommendation somewhere to use the Mapper class and record where the process stopped when an exception hits, so it can resume later, but as I'm a Python n00b I can't figure out how to implement that on top of a DOM parser or a SAX one (please provide an example?).
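From reading around, I think a SAX handler that counts elements and can skip ahead to a saved resume point is roughly the shape the Mapper approach needs. Here's my attempt at a sketch (element and attribute names are made up, and the datastore put is just a comment; I don't know if this is how it's actually supposed to be wired up):

```python
# Sketch of an incremental SAX parse with a crude resume point.
# Instead of building the whole tree, each <item> is handled as it
# streams past; `start_at` lets a re-run skip items already processed.
import xml.sax

class ItemHandler(xml.sax.ContentHandler):
    def __init__(self, start_at=0):
        xml.sax.ContentHandler.__init__(self)
        self.start_at = start_at   # index of the first item still to process
        self.seen = 0              # total items encountered so far
        self.processed = []

    def startElement(self, name, attrs):
        if name == "item":
            if self.seen >= self.start_at:
                # on App Engine: put() the entity here, and persist
                # self.seen somewhere so the next run can resume from it
                self.processed.append(attrs.get("name"))
            self.seen += 1

def parse_from(xml_text, start_at=0):
    handler = ItemHandler(start_at)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.processed
```

So a run that died after item 1 would restart with `parse_from(xml_text, start_at=1)` and only touch the remaining items. Is something like this what the Mapper class does, and how do I hook the checkpointing into it properly?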

Right now I'm doing something pretty ugly: I parse the XML with PHP outside App Engine and push the data to App Engine via HTTP POST through a proprietary API. That works fine, but it's clumsy and forces me to maintain two codebases.

Can you please help me out?

© Stack Overflow or respective owner
