Searching a large list of words in another large list

Posted by Christian on Stack Overflow
Published on 2010-03-31T23:31:39Z

I have a list of 1,000,000 strings containing protein names, each at most 256 characters long. Every string has an associated ID. I have another list of 4,000,000,000 strings, also at most 256 characters long, containing words taken from articles, and every word has an ID as well.

I want to find all matches between the list of protein names and the list of words from the articles. Which algorithm should I use? Should I use some prebuilt API?

It would be good if the algorithm runs on a normal PC without special hardware.
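To make the task concrete, here is a minimal sketch of one possible approach, not a definitive answer. It assumes that a "match" means an exact, whole-string match, that the 1,000,000 protein names fit in memory (at most roughly 256 MB of raw text at one byte per character), and that the 4,000,000,000 words can be streamed from disk one at a time. The function names `build_index` and `find_matches` and the `(id, string)` tuple layout are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def build_index(proteins: Iterable[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Map each protein name to the list of protein IDs that carry it.

    Assumption: the smaller list (protein names) fits in memory.
    """
    index: Dict[str, List[int]] = {}
    for protein_id, name in proteins:
        index.setdefault(name, []).append(protein_id)
    return index


def find_matches(
    index: Dict[str, List[int]], words: Iterable[Tuple[int, str]]
) -> Iterator[Tuple[int, int]]:
    """Stream the large word list once and yield (protein_id, word_id) pairs.

    Each lookup is an average-case O(1) hash probe, so the total work is
    roughly linear in the number of words.
    """
    for word_id, word in words:
        for protein_id in index.get(word, ()):
            yield protein_id, word_id


# Tiny illustrative data set; the real lists would be read lazily from files.
proteins = [(1, "insulin"), (2, "p53"), (3, "actin")]
words = [(10, "the"), (11, "p53"), (12, "gene"), (13, "insulin"), (14, "p53")]

index = build_index(proteins)
matches = list(find_matches(index, words))
print(matches)
```

Because `find_matches` is a generator, the word list never has to be held in memory; only the protein index does. Case-insensitive or multi-word matching would need a normalization step or a different structure, which this sketch does not cover.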

© Stack Overflow or respective owner
