How to optimize a postgreSQL server for a "write once, read many"-type infrastructure ?

Posted by mhu on Stack Overflow See other posts from Stack Overflow or by mhu
Published on 2010-04-01T12:59:44Z Indexed on 2010/04/01 13:03 UTC
Read the original article Hit count: 221

Filed under:
|

Greetings,

I am working on a piece of software that logs entries (and related tagging) in a PostgreSQL database for storage and retrieval. We never update any data once it has been inserted; we might remove it when the entry gets too old, but this is done at most once a day. Stored entries can be retrieved by users.

The insertion of new entries can happen rather fast and regularly, thus the database will commonly hold several millions elements.

The tables used are pretty simple : one table for ids, raw content and insertion date; and one table storing tags and their values associated to an id. User search mostly concern tags values, so SELECTs usually consist of JOIN queries on ids on the two tables.

To sum it up :

  • 2 tables
  • Lots of INSERT
  • no UPDATE
  • some DELETE, once a day at most
  • some user-generated SELECT with JOIN
  • huge data set

What would an optimal server configuration (software and hardware, I assume for example that RAID10 could help) be for my PostgreSQL server, given these requirements ? By optimal, I mean one that allows SELECT queries taking a reasonably little amount of time.

I can provide more information about the current setup (like tables, indexes ...) if needed.

© Stack Overflow or respective owner

Related posts about postgresql

Related posts about optimization