Why is my Django bulk database population so slow and frequently failing?

Posted by bryn on Stack Overflow, 2011-01-14

I decided I'd like to use Django's model system rather than writing raw SQL to talk to my database, but I'm running into a problem that surely is avoidable.

My models.py contains:

from django.db import models

class Student(models.Model):
    student_id = models.IntegerField(unique=True)
    form = models.CharField(max_length=10)
    preferred = models.CharField(max_length=70)
    surname = models.CharField(max_length=70)

and I'm populating it by looping through a list as follows:

from models import Student

for sid, frm, pref, sname in large_list_of_data:
    s = Student(student_id=sid, form=frm, preferred=pref, surname=sname)
    s.save()

I don't really want to hit the database on every iteration, but I don't know how else to get Django to keep the data (I'd rather add all the rows and then do a single commit).

There are two problems with the code as it stands.

  1. It's slow: only about 20 students get inserted each second.

  2. It doesn't even make it through large_list_of_data, instead throwing a DatabaseError saying "unable to open database file". (Possibly because I'm using sqlite3.)

My question is: how can I avoid these two problems? I'm guessing the root cause of both is calling s.save() once per row, but I don't see an easy way to batch the students up and save them to the database in one commit.
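
A sketch of the kind of single-commit batching I'm after (untested; import_students is just a made-up helper name, and commit_on_success is the transaction decorator in Django versions before 1.6):

from django.db import transaction
from models import Student

@transaction.commit_on_success   # everything inside commits once, not once per save()
def import_students(rows):
    for sid, frm, pref, sname in rows:
        Student(student_id=sid, form=frm, preferred=pref, surname=sname).save()

import_students(large_list_of_data)

# On Django 1.4 or newer, bulk_create would presumably do the same job in a single query:
# Student.objects.bulk_create([
#     Student(student_id=sid, form=frm, preferred=pref, surname=sname)
#     for sid, frm, pref, sname in large_list_of_data
# ])

Is wrapping the loop in a single transaction like this the right approach, or is there a better way?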
