How to scale out OpenStreetMap data efficiently
- by Pierre
For over a year now, I'm running an in-house PostGIS server filled with OSM data, used for both Mapnik-based tile generation and Nominatim-based geocoding, updated with day replicates. This works pretty well.
However, as usage is growing exponentially, I would like to achieve better reliability and performance by adding additional PostgreSQL servers. And I'm kind of lost.
Since PostgreSQL doesn't seem to handle replication by itself, I would think about using a piede of middleware like PgPool-II to keep the servers in sync. But I'm afraid it would be nothing but necessary for this usage : very high read-to-write ratio, where all writes are done at the same exact time every day.
My questions are simple : What would you do to keep these servers in sync? And, what is done for this at the OpenStreetMap Foundation, MapQuest, Mapbox or CloudMade?
Thanks.