How would you structure your entity model for storing arbitrary key/value data with different data t
- by Nathan Ridley
I keep coming across scenarios where it will be useful to store a set of arbitrary data in a table using a per-row key/value model, rather than a rigid column/field model. The problem is, I want to store the values with their correct data type rather than converting everything to a string. This means I have to choose either a single table with multiple nullable columns, one for each data type, or a set of value tables, one for each data type. I'm also unsure as to whether I should use full third normal form and separate the keys into a separate table, referencing them via a foreign key from the value table(s), or if it would be better to keep things simple and store the string keys in the value table(s) and accept the duplication of strings.
Old/bad:
This solution makes adding additional values a pain in a fluid environment because the table needs to be modified regularly.
MyTable
============================
ID Key1 Key2 Key3
int int string date
----------------------------
1 Value1 Value2 Value3
2 Value4 Value5 Value6
Single Table Solution
This solution allows simplicity via a single table. The querying code still needs to check for nulls to determine which data type the field is storing. A check constraint is probably also required to ensure only one of the value fields contains non-nulll data.
DataValues
=============================================================
ID RecordID Key IntValue StringValue DateValue
int int string int string date
-------------------------------------------------------------
1 1 Key1 Value1 NULL NULL
2 1 Key2 NULL Value2 NULL
3 1 Key3 NULL NULL Value3
4 2 Key1 Value4 NULL NULL
5 2 Key2 NULL Value5 NULL
6 2 Key3 NULL NULL Value6
Multiple-Table Solution
This solution allows for more concise purposing of each table, though the code needs to know the data type in advance as it needs to query a different table for each data type. Indexing is probably simpler and more efficient because there are less columns that need indexing.
IntegerValues
===============================
ID RecordID Key Value
int int string int
-------------------------------
1 1 Key1 Value1
2 2 Key1 Value4
StringValues
===============================
ID RecordID Key Value
int int string string
-------------------------------
1 1 Key2 Value2
2 2 Key2 Value5
DateValues
===============================
ID RecordID Key Value
int int string date
-------------------------------
1 1 Key3 Value3
2 2 Key3 Value6
How do you approach this problem? Which solution is better? Also, should the key column be separated into a separate table and referenced via a foreign key or be should it be kept in the value table and bulk updated if for some reason the key name changes?