Python implementation of avro slow?

Posted by lazy1 on Stack Overflow
Published on 2011-05-05T21:15:53Z Indexed on 2012/11/16 23:00 UTC

I'm reading some data from an Avro file using the avro library. It takes about a minute to load 33K objects from the file. This seems very slow to me, especially since the Java version reads the same file in about 1 second.

Here is the code. Am I doing something wrong?

import avro.datafile
import avro.io
from time import time

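def load(filename):
    # Count the records in an Avro container file.
    count = 0
    # The with block closes the file even if reading fails.
    with open(filename, "rb") as fo:
        reader = avro.datafile.DataFileReader(fo, avro.io.DatumReader())
        for record in reader:
            count += 1

    # Returns 0 for an empty file instead of failing on an unbound variable.
    return count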

def main(argv=None):
    import sys
    from argparse import ArgumentParser

    argv = argv or sys.argv

    parser = ArgumentParser(description="Read avro file")
    # Actually parse the command line so that options such as --help work.
    parser.parse_args(argv[1:])

    start = time()
    num_records = load("events.avro")
    end = time()

    print("{0} records in {1} seconds".format(num_records, end - start))

if __name__ == "__main__":
    main()
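
One way to narrow down where the minute goes would be to run the loader under a profiler. The following is only a minimal sketch using the standard library's cProfile and pstats modules; `profile_call` is a hypothetical helper name chosen for illustration and is not part of the avro library.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report text).

    profile_call is a hypothetical helper for illustration only.
    """
    profiler = cProfile.Profile()
    # runcall profiles exactly one call and passes its return value through.
    result = profiler.runcall(func, *args, **kwargs)

    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time so the most expensive call chains come first,
    # and limit the report to the first `top` entries.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()
```

With the script above, this could be used as `count, report = profile_call(load, "events.avro")` followed by `print(report)`, which should show whether the time is spent in decoding, in file I/O, or somewhere else.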

© Stack Overflow or respective owner
