How to lazy load a data structure (python)

Posted by Anton Geraschenko on Stack Overflow See other posts from Stack Overflow or by Anton Geraschenko
Published on 2010-12-29T23:26:03Z Indexed on 2010/12/29 23:54 UTC
Read the original article Hit count: 199

Filed under:
|

I have some way of building a data structure (out of some file contents, say):

def loadfile(FILE):
    return # some data structure created from the contents of FILE

So I can do things like

puppies = loadfile("puppies.csv") # wait for loadfile to work
kitties = loadfile("kitties.csv") # wait some more
print len(puppies)
print puppies[32]

In the above example, I wasted a bunch of time actually reading kitties.csv and creating a data structure that I never used. I'd like to avoid that waste without constantly checking if not kitties whenever I want to do something. I'd like to be able to do

puppies = lazyload("puppies.csv") # instant
kitties = lazyload("kitties.csv") # instant
print len(puppies)                # wait for loadfile
print puppies[32]

So if I don't ever try to do anything with kitties, loadfile("kitties.csv") never gets called.

Is there some standard way to do this?

After playing around with it for a bit, I produced the following solution, which appears to work correctly and is quite brief. Are there some alternatives? Are there drawbacks to using this approach that I should keep in mind?

class lazyload:
    def __init__(self,FILE):
        self.FILE = FILE
        self.F = None
    def __getattr__(self,name):
        if not self.F: 
            print "loading %s" % self.FILE
            self.F = loadfile(self.FILE)
        return object.__getattribute__(self.F, name)

What might be even better is if something like this worked:

class lazyload:
    def __init__(self,FILE):
        self.FILE = FILE
    def __getattr__(self,name):
        self = loadfile(self.FILE) # this never gets called again
                                   # since self is no longer a
                                   # lazyload instance
        return object.__getattribute__(self, name)

But this doesn't work because self is local. It actually ends up calling loadfile every time you do anything.

© Stack Overflow or respective owner

Related posts about python

Related posts about lazy-loading