Relational vs. Dimensional Databases, what's the difference?

Posted by grautur on Stack Overflow See other posts from Stack Overflow or by grautur
Published on 2010-05-09T18:04:43Z Indexed on 2010/05/09 18:08 UTC
Read the original article Hit count: 444

I'm trying to learn about OLAP and data warehousing, and I'm confused about the difference between relational and dimensional modeling. Is dimensional modeling basically relational modeling, but allowing for redundant/un-normalized data?

For example, let's say I have historical sales data on (product, city, # sales). I understand that the following would be a relational point-of-view:

Product | City | # Sales
Apples, San Francisco, 400
Apples, Boston, 700
Apples, Seattle, 600
Oranges, San Francisco, 550
Oranges, Boston, 500
Oranges, Seattle, 600

While the following is a more dimensional point-of-view:

Product | San Francisco | Boston | Seattle
Apples, 400, 700, 600
Oranges, 550, 500, 600

But it seems like both points of view would nonetheless be implemented in an identical star schema:

Fact table: Product ID, Region ID, # Sales
Product dimension: Product ID, Product Name
City dimension: City ID, City Name

And it's not until you start adding some additional details to each dimension that the differences start popping up. For instance, if you wanted to track regions as well, a relational database would tend to have a separate region table, in order to keep everything normalized:

City dimension: City ID, City Name, Region ID
Region dimension: Region ID, Region Name, Region Manager, # Regional Stores

While a dimensional database would allow for denormalization to keep the region data inside the city dimension, in order to make it easier to slice the data:

City dimension: City ID, City Name, Region Name, Region Manager, # Regional Stores

Is this correct?

© Stack Overflow or respective owner

Related posts about databases

Related posts about database-design