A faster alternative to Pandas `isin` function
- by user3576212
I have a very large data frame df that looks like:
ID Value1 Value2
1345 3.2 332
1355 2.2 32
2346 1.0 11
3456 8.9 322
I also have a list, ID_list, that contains a subset of the IDs. I need the subset of df whose ID is contained in ID_list.
Currently, I am using df_sub = df[df.ID.isin(ID_list)] to do it, but it takes a long time. The IDs in ID_list don't follow any pattern, so they don't fall within a particular range. (I also need to apply the same operation to many similar data frames.) Is there a faster way to do this? Would it help much to make ID the index?
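To make the question concrete, here is a minimal sketch of the current approach next to the index-based variant I am asking about. The sample rows are the ones shown above; the contents of ID_list and the names df_indexed and df_sub_idx are made up for illustration. I have not established that the indexed version is faster, so it would need to be timed on the real data.

```python
import pandas as pd

# Small frame mirroring the sample rows in the question
df = pd.DataFrame({
    "ID": [1345, 1355, 2346, 3456],
    "Value1": [3.2, 2.2, 1.0, 8.9],
    "Value2": [332, 32, 11, 322],
})

# Hypothetical list of IDs to keep (illustrative values only)
ID_list = [1355, 3456]

# Current approach: boolean mask built with isin on the ID column
df_sub = df[df.ID.isin(ID_list)]

# Variant in question: make ID the index, then select rows by label.
# intersection() drops any IDs that are not present in the index,
# which avoids a KeyError from .loc on missing labels.
df_indexed = df.set_index("ID")
df_sub_idx = df_indexed.loc[df_indexed.index.intersection(ID_list)]

print(df_sub)
print(df_sub_idx)
```

Both selections return the same rows here. The set_index call has a cost of its own, so it would presumably only pay off if the indexed frame is reused across several lookups.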
Thanks!