Efficient job progress update in web application
- by Endru6
Hi,
Creating a web application (Django in my case, but I think the question is more general) that is administrating a cluster of workers doing queued jobs, there is a need to track each jobs progress.
When I've done it using database UPDATE (PostgreSQL in this case), it severely hits the database performance, because each UPDATE creates a new row in a table, and in my case only vacuuming DB removes obsolete rows. Having 30 jobs running and reporting progress every 1 minute DB may require vacuuming (and it means huge slow downs on a front end side for all the employees working with the system) every 10 days.
Because the progress information isn't critical, ie. it doesn't have to be persistent, how would you do the progress updates from jobs without using an overhead database implies? There are 30 worker servers, each doing 1 or 2 jobs simultaneously, 1 front end server which serves a web application to users, and 1 database server.