This question concerns two implementations of counters which are intended to scale without sharding (with a tradeoff that they might under-count in some situations):
http://appengine-cookbook.appspot.com/recipe/high-concurrency-counters-without-sharding/ (the code in the comments)
http://blog.notdot.net/2010/04/High-concurrency-counters-without-sharding
My questions:
With respect to #1: Running memcache.decr() in a deferred, transactional task seems like overkill. If memcache.decr() is done outside the transaction, I think the worst-case is the transaction fails and we miss counting whatever we decremented. Am I overlooking some other problem that could occur by doing this?
What are the significiant tradeoffs between the two implementations?
Here are the tradeoffs I see:
#2 does not require datastore transactions.
To get the counter's value, #2 requires a datastore fetch while with #1 typically only needs to do a memcache.get() and memcache.add().
When incrementing a counter, both call memcache.incr(). Periodically, #2 adds a task to the task queue while #1 transactionally performs a datastore get and put. #1 also always performs memcache.add() (to test whether it is time to persist the counter to the datastore).
Conclusions
(without actually running any performance tests):
#1 should typically be faster at retrieving a counter (#1 memcache vs #2 datastore). Though #1 has to perform an extra memcache.add() too.
However, #2 should be faster when updating counters (#1 datastore get+put vs #2 enqueue a task).
On the other hand, with #1 you have to be a bit more careful with the update interval since the task queue quota is almost 100x smaller than either the datastore or memcahce APIs.