How to lazy load a data structure (python)
- by Anton Geraschenko
I have some way of building a data structure (out of some file contents, say):
def loadfile(FILE):
return # some data structure created from the contents of FILE
So I can do things like
puppies = loadfile("puppies.csv") # wait for loadfile to work
kitties = loadfile("kitties.csv") # wait some more
print len(puppies)
print puppies[32]
In the above example, I wasted a bunch of time actually reading kitties.csv and creating a data structure that I never used. I'd like to avoid that waste without constantly checking if not kitties whenever I want to do something. I'd like to be able to do
puppies = lazyload("puppies.csv") # instant
kitties = lazyload("kitties.csv") # instant
print len(puppies) # wait for loadfile
print puppies[32]
So if I don't ever try to do anything with kitties, loadfile("kitties.csv") never gets called.
Is there some standard way to do this?
After playing around with it for a bit, I produced the following solution, which appears to work correctly and is quite brief. Are there some alternatives? Are there drawbacks to using this approach that I should keep in mind?
class lazyload:
def __init__(self,FILE):
self.FILE = FILE
self.F = None
def __getattr__(self,name):
if not self.F:
print "loading %s" % self.FILE
self.F = loadfile(self.FILE)
return object.__getattribute__(self.F, name)
What might be even better is if something like this worked:
class lazyload:
def __init__(self,FILE):
self.FILE = FILE
def __getattr__(self,name):
self = loadfile(self.FILE) # this never gets called again
# since self is no longer a
# lazyload instance
return object.__getattribute__(self, name)
But this doesn't work because self is local. It actually ends up calling loadfile every time you do anything.