Is there a canonical source supporting "all-surrogates"?

Posted by user61852 on Programmers See other posts from Programmers or by user61852
Published on 2013-07-11T16:25:22Z Indexed on 2013/08/02 16:03 UTC
Read the original article Hit count: 280

Filed under:

Background

The "all-PK-must-be-surrogates" approach is not present in Codd's Relational Model or any SQL Standard (ANSI, ISO or other).

Canonical books seems to elude this restrictions too.

Oracle's own data dictionary scheme uses natural keys in some tables and surrogate keys in other tables. I mention this because these people must know a thing or two about RDBMS design.

PPDM (Professional Petroleum Data Management Association) recommend the same canonical books do:

Use surrogate keys as primary keys when:

There are no natural or business keys
Natural or business keys are bad ( change often )
The value of natural or business key is not known at the time of inserting record
Multicolumn natural keys ( usually several FK ) exceed three columns, which makes joins too verbose.

Also I have not found canonical source that says natural keys need to be immutable. All I find is that they need to be very estable, i.e need to be changed only in very rare ocassions, if ever.

I mention PPDM because these people must know a thing or two about RDBMS design too.

The origins of the "all-surrogates" approach seems to come from recommendations from some ORM frameworks.

It's true that the approach allows for rapid database modeling by not having to do much business analysis, but at the expense of maintainability and readability of the SQL code. Much prevision is made for something that may or may not happen in the future ( the natural PK changed so we will have to use the RDBMS cascade update funtionality ) at the expense of day-to-day task like having to join more tables in every query and having to write code for importing data between databases, an otherwise very strightfoward procedure (due to the need to avoid PK colisions and having to create stage/equivalence tables beforehand ).

Other argument is that indexes based on integers are faster, but that has to be supported with benchmarks. Obviously, long, varying varchars are not good for PK. But indexes based on short, fix-length varchar are almost as fast as integers.

The questions

- Is there any canonical source that supports the "all-PK-must-be-surrogates" approach ?

- Has Codd's relational model been superceded by a newer relational model ?

Developer IT