many-to-many-to-many, incl alignment of data from diff sources

Posted by JefeCoon on Stack Overflow See other posts from Stack Overflow or by JefeCoon
Published on 2010-05-28T23:06:26Z Indexed on 2010/05/28 23:12 UTC
Read the original article Hit count: 220

Re-factoring dbase to support many:many:many. At the second and third levels we need to preserve end-user 'mapping' or aligning of data from different sources, e.g.

Order 17
 FirstpartyOrderID => aha
  LineItem_for_BigShinyThingy => AA-1  # maps to 77-a
  LineItem_for_BigShinyThingy => AA-2  # maps to 77-b, 77-c
  LineItem_for_LittleWidget =>   AA-x  # maps to 77-zulu, 77-alpha, 99-foxtrot
  LineItem_for_LittleWidget =>   AA-y  # maps to 77-zulu, 99-foxtrot
  LineItem_for_LittleWidget =>   AA-z  # maps to 77-alpha

 ThirdpartyOrderID => foo
  LineItem_for_BigShinyThingy => 77-a
  LineItem_for_BigShinyThingy => 77-b
  LineItem_for_BigShinyThingy => 77-c
  LineItem_for_LittleWidget =>   77-zulu
  LineItem_for_LittleWidget =>   77-alpha

 ThirdpartyOrderID => bar
  LineItem_for_LittleWidget =>   99-foxtrot

Each LineItem has daily datapoints reported from its own source (Firstparty|Thirdparty).

In our UI & app we provide tools to align these, then we'd like to save them into the cleanest possible schema for querying, enabling us to diff the reported daily datapoints, and perform other daily calculations (which we'll store in the dbase also, fortunately that should be cake once we've nailed this).

We need to map related [firstparty|thirdparty]line_items which have their own respective datapoints. We'll be using the association to pull each line_items collection of datapoints for summary and discrepancy calculations.

I'm considering two options, std has_many,through x2 --or-- possibly (scary) ubermasterjoin table

OptionA:

order<<-->>
       order_join_table[id,order_id,firstparty_order_id,thirdparty_order_id]
     <<-->>line_item
 order_join_table[firstparty_order_id]-->raw_order[id]
 order_join_table[thirdparty_order_id]-->raw_order[id]
 raw_order-->raw_line_items[raw_order_id]

line_item<<-->>
           line_item_join[id,LI_join_id,firstparty_LI,thirdparty_LI
         <<-->>raw_line_items
 line_item_join[firstparty_LI]-->raw_line_item[id]
 line_item_join[thirdparty_LI]-->raw_line_item[id]

raw_line_item<<-->>datapoints

=> we rely upon join to store all mappings of first|third orders & line_items
=> keys to raw_* enable lookup of these order & line_item details
=> concerns about circular references and/or lack of correct mapping logic, e.g
order-->line_item-->raw_line_items
vs.
order-->raw_order-->raw_line_items

OptionB:

order<<-->>
       join_master[id,order_id,FP_order_id,TP_order_id,FP_line_item_id,TP_line_item_id]
 join_master[FP_order_id & TP_order_id]-->raw_order[id]
 join_master[FP_line_item_id & TP_line_item_id]-->raw_line_item[id]

=> every combo of FP_line_item + TP_line_item writes a record into the join_master table
=> "theoretically" queries easy/fast/flexible/sexy

At long last, my questions:
a) any learnings from painful firsthand experience about how best to implement/tune/optimize many-to-many-to-many relationships
b) in rails?
c) any painful gotchas (circular references, slow queries, spaghetti-monsters) to watch out for?
d) any joy & goodness in Rails3 that makes this magically easy & joyful?
e) anyone written the "how to do many-to-many-to-many schema in Rails and make it fast & sexy?" tutorial that I somehow haven't found? If not, I'll follow up with our learnings in the hope it's helpful..

Thanks in advance-
--Jeff

© Stack Overflow or respective owner

Related posts about ruby-on-rails

Related posts about database-design