Search Results

Search found 3 results on 1 pages for 'sotangochips'.

Page 1/1 | 1

Simulating Google Appengine's Task Queue with Gearman

- by sotangochips

One of the characteristics I love most about Google's Task Queue is its simplicity. More specifically, I love that it takes a URL and some parameters and then posts to that URL when the task queue is ready to execute the task. This structure means that the tasks are always executing the most current version of the code. Conversely, my gearman workers all run code within my django project -- so when I push a new version live, I have to kill off the old worker and run a new one so that it uses the current version of the code. My goal is to have the task queue be independent from the code base so that I can push a new live version without restarting any workers. So, I got to thinking: why not make tasks executable by url just like the google app engine task queue? The process would work like this: User request comes in and triggers a few tasks that shouldn't be blocking. Each task has a unique URL, so I enqueue a gearman task to POST to the specified URL. The gearman server finds a worker, passes the url and post data to a worker The worker simply posts to the url with the data, thus executing the task. Assume the following: Each request from a gearman worker is signed somehow so that we know it's coming from a gearman server and not a malicious request. Tasks are limited to run in less than 10 seconds (There would be no long tasks that could timeout) What are the potential pitfalls of such an approach? Here's one that worries me: The server can potentially get hammered with many requests all at once that are triggered by a previous request. So one user request might entail 10 concurrent http requests. I suppose I could have a single worker with a sleep before every request to rate-limit. Any thoughts?

Read the article
Right way to return proxy model instance from a base model instance in Django ?

- by sotangochips

Say I have models: class Animal(models.Model): type = models.CharField(max_length=255) class Dog(Animal): def make_sound(self): print "Woof!" class Meta: proxy = True class Cat(Animal): def make_sound(self): print "Meow!" class Meta: proxy = True Let's say I want to do: animals = Animal.objects.all() for animal in animals: animal.make_sound() I want to get back a series of Woofs and Meows. Clearly, I could just define a make_sound in the original model that forks based on animal_type, but then every time I add a new animal type (imagine they're in different apps), I'd have to go in and edit that make_sound function. I'd rather just define proxy models and have them define the behavior themselves. From what I can tell, there's no way of returning mixed Cat or Dog instances, but I figured maybe I could define a "get_proxy_model" method on the main class that returns a cat or a dog model. Surely you could do this, and pass something like the primary key and then just do Cat.objects.get(pk = passed_in_primary_key). But that'd mean doing an extra query for data you already have which seems redundant. Is there any way to turn an animal into a cat or a dog instance in an efficient way? What's the right way to do what I want to achieve?

Read the article
Delivering activity feed items in a moderately scalable way

- by sotangochips

The application I'm working on has an activity feed where each user can see their friends' activity (much like Facebook). I'm looking for a moderately scalable way to show a given users' activity stream on the fly. I say 'moderately' because I'm looking to do this with just a database (Postgresql) and maybe memcached. For instance, I want this solution to scale to 200k users each with 100 friends. Currently, there is a master activity table that stores the rendered html for the given activity (Jim added a friend, George installed an application, etc.). This master activity table keeps the source user, the html, and a timestamp. Then, there's a separate ('join') table that simply keeps a pointer to the person who should see this activity in their friend feed, and a pointer to the object in the main activity table. So, if I have 100 friends, and I do 3 activities, then the join table will then grow to 300 items. Clearly this table will grow very quickly. It has the nice property, though, that fetching activity to show to a user takes a single (relatively) inexpensive query. The other option is to just keep the main activity table and query it by saying something like: select * from activity where source_user in (1, 2, 44, 2423, ... my friend list) This has the disadvantage that you're querying for users who may never be active, and as your friend list grows, this query can get slower and slower. I see the pros and the cons of both sides, but I'm wondering if some SO folks might help me weigh the options and suggest one way or they other. I'm also open to other solutions, though I'd like to keep it simple and not install something like CouchDB, etc. Many thanks!

Read the article

1