How best to merge/sort/page through tons of JSON arrays?

Posted by Joshiatto on Programmers
Published on 2013-02-05T18:55:19Z Indexed on 2013/11/03 4:12 UTC

Here's the scenario: Say you have millions of JSON documents stored as text files. Each JSON document is an array of "activity" objects, each of which contains a "created_datetime" attribute. What is the best way to merge/sort/filter/page through these activities via a web UI? For example, say we want to take a few thousand of the documents, merge them into one gigantic array, sort the array by the "created_datetime" attribute descending, and then page through it 10 activities at a time.
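To make the example concrete, here is a minimal sketch of the naive in-memory version of that merge/sort/page step. It assumes "created_datetime" is an ISO 8601 string in a single consistent format and timezone, so that plain string comparison matches chronological order; the function names load_sorted and page are made up for illustration.

```python
import heapq
import json
from itertools import islice


def load_sorted(doc_text):
    """Parse one JSON document (an array of activities) and sort it newest first."""
    activities = json.loads(doc_text)
    return sorted(activities, key=lambda a: a["created_datetime"], reverse=True)


def page(doc_texts, page_number, page_size=10):
    """Return one page (0-based) of the merged, newest-first activity stream."""
    # Each per-document list is already sorted descending, which is what
    # heapq.merge requires when reverse=True.
    streams = [load_sorted(text) for text in doc_texts]
    merged = heapq.merge(
        *streams, key=lambda a: a["created_datetime"], reverse=True
    )
    start = page_number * page_size
    # islice stops pulling from the merge as soon as the page is filled.
    return list(islice(merged, start, start + page_size))
```

Even with the lazy merge, every document still has to be parsed on every request, which is the part that worries me at this scale.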

Also keep in mind that roughly 25% of these JSON documents are updated every day, and updates have to make it into the view within 5 minutes.

My first thought is to parse all of the documents into an RDBMS table, after which paging would be a simple query such as "select top 10 name, created_datetime from Activity where user_id=12345 order by created_datetime desc".
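Roughly, the RDBMS idea would look like the sketch below. It uses Python's built-in sqlite3 module purely for illustration, so the query uses LIMIT/OFFSET instead of "top 10". The assumptions here are mine: each document belongs to a single user_id, each activity has a "name" field, and the index name is arbitrary.

```python
import json
import sqlite3


def build_db(docs):
    """Load (user_id, json_text) pairs into an in-memory Activity table.

    The (user_id, json_text) pairing is an assumption for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Activity (user_id INTEGER, name TEXT, created_datetime TEXT)"
    )
    # Composite index so the filter and the ORDER BY can both be served from it.
    conn.execute(
        "CREATE INDEX idx_activity_user_time "
        "ON Activity (user_id, created_datetime DESC)"
    )
    for user_id, text in docs:
        rows = [
            (user_id, a.get("name"), a["created_datetime"])
            for a in json.loads(text)
        ]
        conn.executemany("INSERT INTO Activity VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def page(conn, user_id, page_number, page_size=10):
    """Return one page (0-based) of a user's activities, newest first."""
    cur = conn.execute(
        "SELECT name, created_datetime FROM Activity "
        "WHERE user_id = ? ORDER BY created_datetime DESC LIMIT ? OFFSET ?",
        (user_id, page_size, page_number * page_size),
    )
    return cur.fetchall()
```

When a document changes, its rows would have to be deleted and re-inserted, which is where the 5-minute update requirement comes in.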

Some have suggested I use NoSQL techniques such as Hadoop or MapReduce instead. How exactly would this work?
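For reference, this is how I picture the map/reduce version, written as plain Python rather than against any real Hadoop API. It is only a guess at the shape of the job: the map step emits (user_id, activity) pairs, and the reduce step keeps the newest N activities per user. All function names are hypothetical, and it again assumes one user_id per document and ISO 8601 timestamps that sort correctly as strings.

```python
import heapq
import json
from collections import defaultdict


def map_phase(user_id, doc_text):
    """Emit one (key, value) pair per activity in a document."""
    for activity in json.loads(doc_text):
        yield user_id, activity


def reduce_phase(activities, top_n=10):
    """Keep only the newest top_n activities for one key, newest first."""
    return heapq.nlargest(
        top_n, activities, key=lambda a: a["created_datetime"]
    )


def run(docs, top_n=10):
    """Simulate map, shuffle (group by key), and reduce in a single process."""
    grouped = defaultdict(list)
    for user_id, text in docs:
        for key, activity in map_phase(user_id, text):
            grouped[key].append(activity)
    return {key: reduce_phase(values, top_n) for key, values in grouped.items()}
```

What I don't see is how a batch job like this would serve arbitrary pages interactively, or pick up updates within 5 minutes.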

For more background, see: Why is NoSQL better for this scenario?
