Using Python to traverse a parent-child data set

Posted by user132748 on Programmers
Published on 2014-05-28T01:15:20Z Indexed on 2014/05/28 3:59 UTC

I have a dataset of two columns in a CSV file. The purpose of this dataset is to provide a link between two different IDs if they belong to the same person (e.g. 2, 3 and 5 belong to 1). For example:

    COLA COLB
    1    2
    1    3
    1    5
    2    6
    3    7
    9    10

In the above example, 1 is linked to 2, 3 and 5, 2 is linked to 6, and 3 is linked to 7. What I am trying to achieve is to identify all records that are linked to 1 directly (2, 3, 5) or indirectly (6, 7), so that I can say these IDs in column B belong to the same person in column A. I then want to either dedupe or add a new column to the output file, populated with 1 for all rows that link to 1.
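One way to read the goal above is as a graph traversal: treat each row as a parent-to-child link and walk outward from a starting ID. The following is only a minimal sketch under that assumption; the function name `linked_ids` and the in-memory `pairs` list are hypothetical and stand in for rows read from the CSV file.

```python
from collections import defaultdict, deque


def linked_ids(pairs, root):
    """Return every id reachable from root, directly or indirectly.

    Hypothetical helper: pairs is a list of (COLA, COLB) tuples.
    """
    # Map each parent id to the list of ids it links to.
    children = defaultdict(list)
    for parent, child in pairs:
        children[parent].append(child)

    seen = set()
    queue = deque([root])
    # Breadth-first walk; the seen set also protects against cycles.
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child != root and child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Sample rows copied from the example above.
pairs = [(1, 2), (1, 3), (1, 5), (2, 6), (3, 7), (9, 10)]
print(sorted(linked_ids(pairs, 1)))  # [2, 3, 5, 6, 7]
```

An iterative queue is used here instead of recursion so that long chains of links cannot hit Python's recursion limit.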

Example of expected output:

    colA colB GroupField
    1    2    1
    1    3    1
    1    5    1
    2    6    1
    3    7    1
    9    10   9
    10   11   9
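A GroupField column like the one above could be produced with a union-find (disjoint-set) structure, sketched below. Several things here are assumptions rather than facts from the post: the file is taken to be comma-delimited with a header row, each group is labelled with its smallest ID (which matches 1 and 9 in the expected output), and the names `find`, `union` and `add_group_field` are hypothetical. The sample includes the `10 11` row that appears in the expected output, and `io.StringIO` stands in for the real input and output files.

```python
import csv
import io


def find(parent, x):
    """Return the representative id of the group containing x."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps chains short
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the groups of a and b, keeping the smaller id as representative."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a == root_b:
        return
    if root_a < root_b:
        parent[root_b] = root_a
    else:
        parent[root_a] = root_b


def add_group_field(rows):
    """Return (a, b, group) for each (a, b) row; group is the smallest linked id."""
    parent = {}
    for a, b in rows:
        union(parent, a, b)
    return [(a, b, find(parent, a)) for a, b in rows]


# Assumed layout: comma-delimited with a header row.
sample = "COLA,COLB\n1,2\n1,3\n1,5\n2,6\n3,7\n9,10\n10,11\n"
reader = csv.reader(io.StringIO(sample))
header = next(reader)
rows = [(int(a), int(b)) for a, b in reader]

grouped = add_group_field(rows)

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header + ["GroupField"])
writer.writerows(grouped)
print(out.getvalue())
```

Union-find handles every group in a single pass over the rows, so there is no need to choose a starting ID for each person in advance.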

I am a newbie, so I am not sure how to approach this problem. I would appreciate any input you can provide.

© Programmers or respective owner
