Using hashing to group similar records
Posted by Neil Dobson on Stack Overflow
Published on 2010-05-22T01:30:41Z
Indexed on 2010/05/22 1:40 UTC
database-design | hash
I work for a fulfillment company, and we pack and ship many orders from our warehouse to customers. To improve efficiency, we would like to group identical orders and pack them in the most efficient way. By identical I mean orders that have the same number of order lines, containing the same SKUs with the same order quantities.
To achieve this I was thinking about hashing each order. We can then group by hash to quickly see which orders are the same.
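To illustrate the idea, here is a minimal Python sketch (our real code would live in .NET or in the database). The names `order_hash` and `group_identical_orders` and the sample orders are made up for illustration, and SHA-256 is just one possible choice of hash. It assumes each order line is a `(sku, quantity)` pair and that the sequence of lines within an order does not matter:

```python
import hashlib
import json
from collections import defaultdict


def order_hash(lines):
    """Return a hex digest identifying an order's content.

    `lines` is an iterable of (sku, quantity) pairs. The lines are sorted
    first so that two orders with the same lines in a different sequence
    produce the same hash. Duplicate SKU lines are NOT merged, because
    "identical" here also means the same number of order lines.
    """
    canonical = json.dumps(sorted(lines))  # unambiguous text form
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_identical_orders(orders):
    """Map each hash to the list of order ids that share it.

    `orders` is a dict of order id -> list of (sku, quantity) pairs.
    """
    groups = defaultdict(list)
    for order_id, lines in orders.items():
        groups[order_hash(lines)].append(order_id)
    return dict(groups)


# Hypothetical sample data: orders 1 and 2 contain the same lines.
sample_orders = {
    1: [("SKU-A", 2), ("SKU-B", 1)],
    2: [("SKU-B", 1), ("SKU-A", 2)],
    3: [("SKU-A", 2)],
}
groups = group_identical_orders(sample_orders)
```

In this sketch, orders 1 and 2 end up under the same hash and order 3 is in a group of its own. Two different orders could in theory collide on the same hash, so a final comparison of the actual lines within each group may still be worthwhile.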
We are moving from an Access database to PostgreSQL, and we have .NET-based systems for data loading and general order processing, so we can either do the hashing during data loading or hand the task over to the database.
My first question: should the hash be maintained by the database, possibly using triggers, or should it be computed on the fly, for example in a view?
Secondly, would it be better to calculate a hash for each order line and then combine these into an order-level hash for grouping, or should I use a trigger on all insert, update, and delete operations on the order lines table that recalculates a single hash for the entire order and stores the value in the orders table?
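The first option (per-line hashes combined into an order-level hash) could look roughly like the Python sketch below. The function names are hypothetical, SHA-256 is only an example, and the sketch assumes SKUs never contain the `\x1f` separator character. The line hashes are sorted before combining, so the result does not depend on the sequence of the lines:

```python
import hashlib


def line_hash(sku, quantity):
    """Hash a single order line (SKU plus quantity)."""
    # "\x1f" (unit separator) keeps "AB" + "1" distinct from "A" + "B1";
    # this assumes the character never appears in a SKU.
    text = f"{sku}\x1f{quantity}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def combine_line_hashes(line_hashes):
    """Combine per-line hashes into one order-level hash.

    Sorting makes the result independent of line sequence. The hex
    digests all have the same length, so plain concatenation is
    unambiguous.
    """
    digest = hashlib.sha256()
    for h in sorted(line_hashes):
        digest.update(h.encode("ascii"))
    return digest.hexdigest()


# Hypothetical orders: the same two lines in a different sequence,
# and a third order with a different quantity on one line.
order_a = combine_line_hashes([line_hash("SKU-A", 2), line_hash("SKU-B", 1)])
order_b = combine_line_hashes([line_hash("SKU-B", 1), line_hash("SKU-A", 2)])
order_c = combine_line_hashes([line_hash("SKU-A", 2), line_hash("SKU-B", 5)])
```

Here `order_a` and `order_b` are equal and `order_c` differs. With this approach, a change to one line only requires rehashing that line and then recombining the stored line hashes, whereas the single-hash option rereads every line of the order on each change.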
TIA
© Stack Overflow or respective owner