Marko Parkkola
This article has been submitted by Marko Parkkola, Data systems designer at Saarionen Oy, Finland. Marko is excellent developer and always thinking at next level. You can read his earlier comment which created very interesting discussion here: SQL SERVER- IF EXISTS(Select null from table) vs IF EXISTS(Select 1 from table). I must express my special thanks to Marko for sending this best practice for Enumerations in Relational Database. He has really wrote excellent piece here and welcome comments here.
Enumerations in Relational Database
This is a subject which is very basic thing in relational databases but often not very well understood and sometimes badly implemented. There are of course many ways to do this but I concentrate only two cases, one which is “the right way” and one which is definitely wrong way.
The concept
Let’s say we have table Person in our database. Person has properties/fields like Firstname, Lastname, Birthday and so on. Then there’s a field that tells person’s marital status and let’s name it the same way; MaritalStatus.
Now MaritalStatus is an enumeration. In C# I would definitely make it an enumeration with values likes Single, InRelationship, Married, Divorced. Now here comes the problem, SQL doesn’t have enumerations.
The wrong way
This is, in my opinion, absolutely the wrong way to do this. It has one upside though; you’ll see the enumeration’s description instantly when you do simple SELECT query and you don’t have to deal with mysterious values. There’s plenty of downsides too and one would be database fragmentation.
Consider this (I’ve left all indexes and constraints out of the query on purpose).
CREATE TABLE [dbo].[Person]
(
[Firstname] NVARCHAR(100),
[Lastname] NVARCHAR(100),
[Birthday] datetime,
[MaritalStatus] NVARCHAR(10)
)
You have nvarchar(20) field in the table that tells the marital status. Obvious problem with this is that what if you create a new value which doesn’t fit into 20 characters? You’ll have to come and alter the table. There are other problems also but I’ll leave those for the reader to think about.
The correct way
Here’s how I’ve done this in many projects. This model still has one problem but it can be alleviated in the application layer or with CHECK constraints if you like.
First I will create a namespace table which tells the name of the enumeration. I will add one row to it too. I’ll write all the indexes and constraints here too.
CREATE TABLE [CodeNamespace]
(
[Id] INT IDENTITY(1, 1),
[Name] NVARCHAR(100) NOT NULL,
CONSTRAINT [PK_CodeNamespace] PRIMARY KEY ([Id]),
CONSTRAINT [IXQ_CodeNamespace_Name] UNIQUE NONCLUSTERED ([Name])
)
GO
INSERT INTO [CodeNamespace] SELECT 'MaritalStatus'
GO
Then I create a table that holds the actual values and which reference to namespace table in order to group the values under different namespaces. I’ll add couple of rows here too.
CREATE TABLE [CodeValue]
(
[CodeNamespaceId] INT NOT NULL,
[Value] INT NOT NULL,
[Description] NVARCHAR(100) NOT NULL,
[OrderBy] INT,
CONSTRAINT [PK_CodeValue] PRIMARY KEY CLUSTERED ([CodeNamespaceId], [Value]),
CONSTRAINT [FK_CodeValue_CodeNamespace] FOREIGN KEY ([CodeNamespaceId]) REFERENCES [CodeNamespace] ([Id])
)
GO
-- 1 is the 'MaritalStatus' namespace
INSERT INTO [CodeValue] SELECT 1, 1, 'Single', 1
INSERT INTO [CodeValue] SELECT 1, 2, 'In relationship', 2
INSERT INTO [CodeValue] SELECT 1, 3, 'Married', 3
INSERT INTO [CodeValue] SELECT 1, 4, 'Divorced', 4
GO
Now there’s four columns in CodeValue table. CodeNamespaceId tells under which namespace values belongs to. Value tells the enumeration value which is used in Person table (I’ll show how this is done below). Description tells what the value means. You can use this, for example, column in UI’s combo box. OrderBy tells if the values needs to be ordered in some way when displayed in the UI.
And here’s the Person table again now with correct columns. I’ll add one row here to show how enumerations are to be used.
CREATE TABLE [dbo].[Person]
(
[Firstname] NVARCHAR(100),
[Lastname] NVARCHAR(100),
[Birthday] datetime,
[MaritalStatus] INT
)
GO
INSERT INTO [Person] SELECT 'Marko', 'Parkkola', '1977-03-04', 3
GO
Now I said earlier that there is one problem with this. MaritalStatus column doesn’t have any database enforced relationship to the CodeValue table so you can enter any value you like into this field. I’ve solved this problem in the application layer by selecting all the values from the CodeValue table and put them into a combobox / dropdownlist (with Value field as value and Description as text) so the end user can’t enter any illegal values; and of course I’ll check the entered value in data access layer also.
I said in the “The wrong way” section that there is one benefit to it. In fact, you can have the same benefit here by using a simple view, which I schema bound so you can even index it if you like.
CREATE VIEW [dbo].[Person_v]
WITH SCHEMABINDING
AS
SELECT p.[Firstname], p.[Lastname], p.[BirthDay], c.[Description] MaritalStatus
FROM [dbo].[Person] p
JOIN [dbo].[CodeValue] c ON p.[MaritalStatus] = c.[Value]
JOIN [dbo].[CodeNamespace] n ON n.[Id] = c.[CodeNamespaceId] AND n.[Name] = 'MaritalStatus'
GO
-- Select from View
SELECT *
FROM [dbo].[Person_v]
GO
This is excellent write up byMarko Parkkola. Do you have this kind of design setup at your organization? Let us know your opinion.
Reference: Pinal Dave (http://blog.SQLAuthority.com)
Filed under: Best Practices, Database, DBA, Readers Contribution, Software Development, SQL, SQL Authority, SQL Documentation, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology