Finding a list of indices from master array using secondary array with non-unique entries
- by fideli
I have a master array of length n of id numbers that apply to other analogous arrays with corresponding data for elements in my simulation that belong to those id numbers (e.g. data[id]). Were I to generate a list of id numbers of length m separately and need the information in the data array for those ids, what is the best method of getting a list of indices idx of the original array of ids in order to extract data[idx]? That is, given:
a=numpy.array([1,3,4,5,6]) # master array
b=numpy.array([3,4,3,6,4,1,5]) # secondary array
I would like to generate
idx=numpy.array([1,2,1,4,2,0,3])
The array a is typically in sequential order but it's not a requirement. Also, array b will most definitely have repeats and will not be in any order.
My current method of doing this is:
idx=numpy.array([numpy.where(a==bi)[0][0] for bi in b])
I timed it using the following test:
a=(numpy.random.uniform(100,size=100)).astype('int')
b=numpy.repeat(a,100)
timeit method1(a,b)
10 loops, best of 3: 53.1 ms per loop
Is there a better way of doing this?