How to parse large XML files on Google App Engine?
Posted by Alon Carmel on Stack Overflow
Published on 2010-05-21T14:04:41Z
Hey, I have a fairly large XML file, about 1 MB in size, that I host on S3. I need to parse that XML file into my App Engine datastore in its entirety.
I have written a simple DOM parser that works fine locally, but online it hits the 30-second request deadline and stops.
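To show what I mean, here's a rough sketch of the kind of DOM approach I'm using (not my real code; the `<item>`/`<name>` element names and `save_item()` are placeholders). The problem is that it loads the whole document and does one datastore write per element:

```python
# Sketch of the DOM approach: whole file in memory, one write per element.
from xml.dom.minidom import parseString

SAMPLE = """<catalog>
  <item><name>a</name></item>
  <item><name>b</name></item>
</catalog>"""

saved = []

def save_item(data):
    # On App Engine this would be a single db.put() -- one RPC per item,
    # which is what blows the 30-second request deadline on a big file.
    saved.append(data)

doc = parseString(SAMPLE)
for node in doc.getElementsByTagName('item'):
    name = node.getElementsByTagName('name')[0].firstChild.data
    save_item({'name': name})
```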
I tried to reduce the parsing work by first downloading the XML file into a blob and then parsing it from the blob. The problem is that blob entities are limited to 1 MB, so that fails too.
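One workaround I considered was splitting the downloaded file into pieces that each fit under the 1 MB entity limit and storing them across several entities. A minimal sketch of the splitting (the `XmlChunk` entity mentioned in the comment is hypothetical):

```python
# Split data into pieces that stay safely under the 1 MB datastore limit.
CHUNK_SIZE = 900 * 1024

def split_into_chunks(data, size=CHUNK_SIZE):
    return [data[i:i + size] for i in range(0, len(data), size)]

# On App Engine each piece would become its own entity, e.g.
#   XmlChunk(index=i, data=db.Blob(chunk)).put()
chunks = split_into_chunks('x' * (2 * 1024 * 1024))  # pretend 2 MB file
```

But even with the file stored, I'd still have to parse and insert everything within one request, so this alone doesn't solve the deadline problem.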
I make multiple inserts to the datastore, which is what pushes the request past 30 seconds. I saw somewhere a recommendation to use the Mapper class and record where the process stopped so it can resume, but as I'm a Python n00b I can't figure out how to implement that on top of a DOM or SAX parser (please provide an example of how to use it?).
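Here's how far I've gotten in understanding the SAX route; maybe someone can confirm or correct it. The idea, as I read it, is to stream the XML instead of building a DOM, buffer parsed records, and flush them in batches rather than one `put()` per record. In this sketch `put_batch()` and the `<item>`/`<name>` element names are my own placeholders:

```python
# SAX sketch: stream the XML, buffer records, write them in batches.
import xml.sax

BATCH_SIZE = 2
flushed = []  # stands in for batches written to the datastore

def put_batch(records):
    # On App Engine: db.put([Item(**r) for r in records]) -- one RPC per
    # batch. Recording how many batches are done is (I think) what would
    # let a follow-up task resume after hitting the deadline.
    flushed.append(list(records))

class ItemHandler(xml.sax.ContentHandler):
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.buffer = []    # text collected inside the current element
        self.record = None  # record currently being built
        self.pending = []   # parsed records not yet written

    def startElement(self, name, attrs):
        self.buffer = []
        if name == 'item':
            self.record = {}

    def characters(self, content):
        self.buffer.append(content)

    def endElement(self, name):
        if name == 'name' and self.record is not None:
            self.record['name'] = ''.join(self.buffer)
        elif name == 'item':
            self.pending.append(self.record)
            self.record = None
            if len(self.pending) >= BATCH_SIZE:
                put_batch(self.pending)
                self.pending = []

    def endDocument(self):
        if self.pending:  # flush whatever is left over
            put_batch(self.pending)
            self.pending = []

SAMPLE = ('<catalog>'
          '<item><name>a</name></item>'
          '<item><name>b</name></item>'
          '<item><name>c</name></item>'
          '</catalog>')
xml.sax.parseString(SAMPLE.encode('utf-8'), ItemHandler())
```

What I still don't get is the resuming part: how do I save where the parse stopped and continue from there in the next request?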
What I'm doing right now is admittedly a bad hack: I parse the XML with PHP outside App Engine and push the data to App Engine via HTTP POST through a proprietary API. It works fine, but it's clumsy and forces me to maintain two codebases.
Can you please help me out?