How to optimize a PostgreSQL server for a "write once, read many" infrastructure?
Posted by mhu on Server Fault
Published on 2010-04-01T12:59:44Z
Tags: postgresql | optimization
Greetings,
I am working on a piece of software that logs entries (and related tagging) in a PostgreSQL database for storage and retrieval. We never update any data once it has been inserted; we might remove it when the entry gets too old, but this is done at most once a day. Stored entries can be retrieved by users.
New entries can be inserted rather fast and regularly, so the database will commonly hold several million entries.
The tables used are pretty simple: one table for ids, raw content, and insertion date; and one table storing tags and their values associated with an id. User searches mostly concern tag values, so SELECTs usually consist of JOIN queries between the two tables on id.
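As a sketch of what such a schema and query might look like (the table and column names here are hypothetical, since the actual definitions aren't shown in the question):

```sql
-- Hypothetical schema matching the description above
CREATE TABLE entries (
    id          bigserial PRIMARY KEY,
    content     text NOT NULL,
    inserted_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE tags (
    entry_id bigint NOT NULL REFERENCES entries (id),
    name     text   NOT NULL,
    value    text   NOT NULL
);

-- Since user searches filter on tag values, indexing those columns
-- is usually the first optimization for this kind of workload
CREATE INDEX tags_name_value_idx ON tags (name, value);
CREATE INDEX tags_entry_id_idx   ON tags (entry_id);

-- Typical user search: filter on a tag value, join back to entries on id
SELECT e.id, e.content
FROM entries e
JOIN tags t ON t.entry_id = e.id
WHERE t.name = 'category' AND t.value = 'foo';
```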
To sum it up:
- 2 tables
- Lots of INSERT
- no UPDATE
- some DELETE, once a day at most
- some user-generated SELECT with JOIN
- huge data set
What would an optimal server configuration (software and hardware; I assume, for example, that RAID10 could help) be for my PostgreSQL server, given these requirements? By optimal, I mean one that allows SELECT queries to complete in a reasonably short amount of time.
I can provide more information about the current setup (tables, indexes, etc.) if needed.
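For context, the settings usually tuned for an insert-heavy, read-many workload look something like this (a postgresql.conf sketch with purely illustrative values, assuming roughly 8 GB of RAM; not a recommendation for any specific machine):

```
# postgresql.conf fragment -- illustrative values only
shared_buffers = 2GB            # ~25% of RAM is a common starting point
effective_cache_size = 6GB      # planner hint about OS-level caching
work_mem = 32MB                 # per-sort/hash memory for the JOIN-heavy SELECTs
maintenance_work_mem = 512MB    # speeds up index builds and the daily DELETE cleanup
checkpoint_segments = 32        # spread out checkpoint I/O under heavy INSERT load
wal_buffers = 16MB
synchronous_commit = off        # only if losing the last few commits on a crash is acceptable
```

On the hardware side, a common companion to RAID10 for the data files is placing the write-ahead log (pg_xlog) on separate spindles, so sequential WAL writes don't compete with random read I/O from the SELECTs.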
© Server Fault or respective owner