Speeding up templates in GAE-Py by aggregating RPC calls
- by Sudhir Jonathan
Here's my problem:
class City(Model):
name = StringProperty()
class Author(Model):
name = StringProperty()
city = ReferenceProperty(City)
class Post(Model):
author = ReferenceProperty(Author)
content = StringProperty()
The code isn't important... its this django template:
{% for post in posts %}
<div>{{post.content}}</div>
<div>by {{post.author.name}} from {{post.author.city.name}}</div>
{% endfor %}
Now lets say I get the first 100 posts using Post.all().fetch(limit=100), and pass this list to the template - what happens?
It makes 200 more datastore gets - 100 to get each author, 100 to get each author's city.
This is perfectly understandable, actually, since the post only has a reference to the author, and the author only has a reference to the city. The __get__ accessor on the post.author and author.city objects transparently do a get and pull the data back (See this question).
Some ways around this are
Use Post.author.get_value_for_datastore(post) to collect the author keys (see the link above), and then do a batch get to get them all - the trouble here is that we need to re-construct a template data object... something which needs extra code and maintenance for each model and handler.
Write an accessor, say cached_author, that checks memcache for the author first and returns that - the problem here is that post.cached_author is going to be called 100 times, which could probably mean 100 memcache calls.
Hold a static key to object map (and refresh it maybe once in five minutes) if the data doesn't have to be very up to date. The cached_author accessor can then just refer to this map.
All these ideas need extra code and maintenance, and they're not very transparent. What if we could do
@prefetch
def render_template(path, data)
template.render(path, data)
Turns out we can... hooks and Guido's instrumentation module both prove it. If the @prefetch method wraps a template render by capturing which keys are requested we can (atleast to one level of depth) capture which keys are being requested, return mock objects, and do a batch get on them. This could be repeated for all depth levels, till no new keys are being requested. The final render could intercept the gets and return the objects from a map.
This would change a total of 200 gets into 3, transparently and without any extra code. Not to mention greatly cut down the need for memcache and help in situations where memcache can't be used.
Trouble is I don't know how to do it (yet). Before I start trying, has anyone else done this? Or does anyone want to help? Or do you see a massive flaw in the plan?