Ways to implement tags - pros and cons of each
- by bobobobo
Related
Using SO as an example, what is the most sensible way to manage tags if you anticipate they will change often?
Way 1: Seriously denormalized (comma delimited)
table posts
+--------+-----------------+
| postId | tags |
+--------+-----------------+
| 1 | c++,search,code |
Here tags are comma delimited.
Pros: Tags are retrieved at once with a single select query. Updating tags is simple. Easy and cheap to update.
Cons: Extra parsing on tag retrieval, difficult to count how many posts use which tags.
(alternatively, if limited to something like 5 tags)
table posts
+--------+-------+-------+-------+-------+-------+
| postId | tag_1 | tag_2 | tag_3 | tag_4 | tag_5 |
+--------+-------+-------+-------+-------+-------+
| 1 | c++ |search | code | | |
Way 2: "Slightly normalized" (separate table, no intersection)
table posts
+--------+-------------------+
| postId | title |
+--------+-------------------+
| 1 | How do u tag? |
table taggings
+--------+---------+
| postId | tagName |
+--------+---------+
| 1 | C++ |
| 1 | search |
Pros: Easy to see tag counts (count(*) from taggings where tagName='C++').
Cons: tagName will likely be repeated many, many times.
Way 3: The cool kid's (normalized with intersection table)
table posts
+--------+---------------------------------------+
| postId | title |
+--------+---------------------------------------+
| 1 | Why is a raven like a writing desk? |
table tags
+--------+---------+
| tagId | tagName |
+--------+---------+
| 1 | C++ |
| 2 | search |
| 3 | foofle |
table taggings
+--------+---------+
| postId | tagId |
+--------+---------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
Pros:
No repeating tag names.
More girls will like you.
Cons: More expensive to change tags than way #1.