writing a fast parser in python
- by panzi
I've written a hands-on recursive pure python parser for a some file format (ARFF) we use in one lecture. Now running my exercise submission is awfully slow. Turns out by far the most time is spent in my parser. It's consuming a lot of CPU time, the HD is not the bottleneck.
I wonder what performant ways are there to write a parser in python? I'd rather not rewrite it in C. I tried to use jython, but that decreased performance a lot! The files I parse are partially huge ( 150 MB) with very long lines.
My current parser only needs a look-ahead of one character. I'd post the source here but I don't know if that's such a good idea. After all the submission deadline has not jet ended. But then, the focus in this exercise is not the parser. You can choose whatever language you want to use and there already is a parser for Java.