How to do this Python / MySQL manipulation (match) more efficiently?

Posted by NJTechie on Stack Overflow See other posts from Stack Overflow or by NJTechie
Published on 2010-04-15T19:39:51Z Indexed on 2010/04/15 19:43 UTC
Read the original article Hit count: 315

Filed under:
|
|
|

Following is my data :

Company Table : 

ID Company Address    City    State Zip    Phone
1   ABC    123 Oak St Philly  PA    17542  7329878901
2   CDE    111 Joe St Newark  NJ    08654 
3   GHI    211 Foe St Brick   NJ    07740  7321178901
4   JAK    777 Wall   Ocean   NJ    07764  7322278901
5   KLE    87 Ilk St  Plains  NY    07654  7376578901
6   AB     1 W.House  SField  PA    87656  7329878901

Branch Office Table : 

ID  Address    City    State Zip    Phone
1   323 Alk St Philly  PA    17542  7329832221
1   171 Joe St Newark  NJ    08654 
3   287 Foe St Brick   NJ    07740  7321178901
3   700 Wall   Ocean   NJ    07764  7322278901
1   89 Blk St  Surrey  NY    07154  7376222901

File to be Matched (In MySQL): 

ID Company Address    City    State Zip    Phone
1   ABC    123 Oak St Philly  PA    17542  7329878901
2   AB     171 Joe St Newark  NJ    08654 
3   GHI    211 Foe St Brick   NJ    07740  7321178901
4   JAK    777 Wall   Ocean   NJ    07764  7322278901
5   K      87 Ilk St  Plains  NY    07654  7376578901

Resulting File : 

ID Company Address    City    State Zip    Phone       appendedID
1   ABC    123 Oak St Philly  PA    17542  7329878901       [Original record, field always empty]
1   ABC    171 Joe St Newark  NJ    08654              1    [Company Table] 
1   ABC    323 Alk St Philly  PA    17542  7329832221  1    [Branch Office Table]
1   AB     1 W.House  SField  PA    87656  7329878901  6    [Partial firm and State, Zip match]
2   CDE    111 Joe St Newark  NJ    08654 
3   GHI    211 Foe St Brick   NJ    07740  7321178901       
3   GHI    700 Wall   Ocean   NJ    07764  7322278901  3
3   GHI    287 Foe St Brick   NJ    07740  7321178901  3
4   JAK    777 Wall   Ocean   NJ    07764  7322278901  
5   KLE    87 Ilk St  Surrey  NY    07654  7376578901  
5   KLE    89 Blk St  Surrey  NY    07154  7376222901  5

Requirement :

1) I have to match each firm on the 'File to be Matched' to that of Company and Branch Office tables (MySQL). 2) If there are multiple exact/partial matches, then the ID from Company, Branch Office table is inserted as a new row in the resulting file. 3) Not all the firms will be matched perfectly, in that case I have to match on partial Company names (like 5/8th of the company name) and any of the address fields and insert them in the resulting file.

Please help me out in the most efficient solution for this problem.

© Stack Overflow or respective owner

Related posts about python

Related posts about mysql