Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21

Posted by Pinal Dave on SQL Authority See other posts from SQL Authority or by Pinal Dave
Published on Thu, 17 Oct 2013 01:30:47 +0000 Indexed on 2013/10/17 16:10 UTC
Read the original article Hit count: 453

In yesterday’s blog post we learned the importance of the Relational Database and NoSQL database in the Big Data Story. In this article we will understand the role of Key-Value Pair Databases and Document Databases Supporting Big Data Story.

Now we will see a few of the examples of the operational databases.

  • Relational Databases (Yesterday’s post)
  • NoSQL Databases (Yesterday’s post)
  • Key-Value Pair Databases (This post)
  • Document Databases (This post)
  • Columnar Databases (Tomorrow’s post)
  • Graph Databases (Tomorrow’s post)
  • Spatial Databases (Tomorrow’s post)

Key Value Pair Databases

Key Value Pair Databases are also known as KVP databases. A key is a field name and attribute, an identifier. The content of that field is its value, the data that is being identified and stored.

They have a very simple implementation of NoSQL database concepts. They do not have schema hence they are very flexible as well as scalable. The disadvantages of Key Value Pair (KVP) database are that they do not follow ACID (Atomicity, Consistency, Isolation, Durability) properties. Additionally, it will require data architects to plan for data placement, replication as well as high availability. In KVP databases the data is stored as strings.

Here is a simple example of how Key Value Database will look like:

Key Value
Name Pinal Dave
Color Blue
Twitter @pinaldave
Name Nupur Dave
Movie The Hero

As the number of users grow in Key Value Pair databases it starts getting difficult to manage the entire database. As there is no specific schema or rules associated with the database, there are chances that database grows exponentially as well. It is very crucial to select the right Key Value Pair Database which offers an additional set of tools to manage the data and provides finer control over various business aspects of the same.

Riak

Rick is one of the most popular Key Value Database. It is known for its scalability and performance in high volume and velocity database. Additionally, it implements a mechanism for collection key and values which further helps to build manageable system. We will further discuss Riak in future blog posts.

Key Value Databases are a good choice for social media, communities, caching layers for connecting other databases. In simpler words, whenever we required flexibility of the data storage keeping scalability in mind – KVP databases are good options to consider.

Document Database

There are two different kinds of document databases. 1) Full document Content (web pages, word docs etc) and 2) Storing Document Components for storage. The second types of the document database we are talking about over here. They use Javascript Object Notation (JSON) and Binary JSON for the structure of the documents. JSON is very easy to understand language and it is very easy to write for applications. There are two major structures of JSON used for Document Database – 1) Name Value Pairs and 2) Ordered List.

MongoDB and CouchDB are two of the most popular Open Source NonRelational Document Database.

MongoDB

MongoDB databases are called collections. Each collection is build of documents and each document is composed of fields. MongoDB collections can be indexed for optimal performance. MongoDB ecosystem is highly available, supports query services as well as MapReduce. It is often used in high volume content management system.

CouchDB

CouchDB databases are composed of documents which consists fields and attachments (known as description). It supports ACID properties. The main attraction points of CouchDB are that it will continue to operate even though network connectivity is sketchy. Due to this nature CouchDB prefers local data storage.

Document Database is a good choice of the database when users have to generate dynamic reports from elements which are changing very frequently. A good example of document usages is in real time analytics in social networking or content management system.

Tomorrow

In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data.

Reference: Pinal Dave (http://blog.sqlauthority.com)


Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

© SQL Authority or respective owner

Related posts about Big Data

Related posts about PostADay