Storing large JSON strings in a database, deduplicated by hash
- by Guy
I need to store quite large JSON data strings in the database. I am using gzip to compress each string and therefore the MySQL BLOB data type to store it. However, only 5% of all requests contain unique data, and only unique data ought to be stored in the database.
My approach is as follows.
1. array_multisort the data (the array [a, b, c] is effectively the same as [a, c, b]).
2. json_encode the data (json_encode is faster than serialize; we need a string representation of the array for step 3).
3. sha1 the data (slower than md5, but collisions are less likely).
4. Check if the hash exists in the database.
4.1. yes: do not insert the data.
4.2. no: gzip the data and store it along with the hash.
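For illustration, the steps above could be sketched roughly as follows. This is Python rather than PHP, with an in-memory SQLite table standing in for MySQL, and it assumes the payload is a flat list of strings. The `payloads` table and the `canonical_json` and `store_if_new` helpers are hypothetical names made up for this sketch.

```python
import gzip
import hashlib
import json
import sqlite3


def canonical_json(items):
    # Sort first so that [a, b, c] and [a, c, b] encode to the same string,
    # then encode compactly (assumes a flat list of mutually comparable values).
    return json.dumps(sorted(items), separators=(",", ":"))


def store_if_new(conn, items):
    """Return True if the payload was inserted, False if its hash already existed."""
    text = canonical_json(items).encode("utf-8")
    digest = hashlib.sha1(text).digest()  # 20 raw bytes

    # Look the hash up before doing any compression work.
    row = conn.execute(
        "SELECT 1 FROM payloads WHERE hash = ?", (digest,)
    ).fetchone()
    if row is not None:
        return False

    # Unknown hash: gzip the canonical string and store it next to the hash.
    conn.execute(
        "INSERT INTO payloads (hash, body) VALUES (?, ?)",
        (digest, gzip.compress(text)),
    )
    return True


# Stand-in schema (assumption): the hash is the primary key, the body a BLOB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payloads (hash BLOB PRIMARY KEY, body BLOB NOT NULL)")
```

Calling `store_if_new(conn, ["a", "b", "c"])` and then `store_if_new(conn, ["a", "c", "b"])` stores only one row, since both payloads sort to the same string. With concurrent requests there is a window between the lookup and the insert; in this sketch the primary key on the hash makes a second insert of the same hash fail rather than create a duplicate row.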
Is there anything about this (apart from storing JSON data in the database in the first place) that sounds fishy or should be done differently?
P.S. We are talking about a database with roughly 1 million (1kk) unique records being created every month.