Searching through a large data set

Posted by calccrypto on Stack Overflow
Published on 2010-05-17T21:23:01Z Indexed on 2010/05/17 22:50 UTC

How would I quickly search through a list of ~5 million 128-bit (or 256-bit, depending on how you look at it) strings and find the duplicates (in Python)? I can turn the strings into numbers, but I don't think that's going to help much. I haven't learned much information theory, so is there anything in information theory that covers this?

And since these are hashes already, there's no point in hashing them again.
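
For illustration, here is a minimal sketch of one common way to approach this in Python, assuming all ~5 million strings fit in memory at once. The names `find_duplicates`, `count_duplicates`, and `sample` are made up for this example. Note that a Python `set` or `dict` does hash its keys internally, but that cost is small, and a single pass over the list is roughly linear in its length.

```python
from collections import Counter


def find_duplicates(hashes):
    """Return the set of values that occur more than once in `hashes`."""
    seen = set()    # every value encountered so far
    dupes = set()   # values encountered at least twice
    for h in hashes:
        if h in seen:
            dupes.add(h)
        else:
            seen.add(h)
    return dupes


def count_duplicates(hashes):
    """Return a dict mapping each repeated value to its occurrence count."""
    counts = Counter(hashes)
    return {h: n for h, n in counts.items() if n > 1}


# Tiny stand-in for the real list of hash strings (hypothetical data).
sample = ["a3f1", "09bc", "a3f1", "ffee", "09bc", "a3f1"]

print(sorted(find_duplicates(sample)))
print(count_duplicates(sample))
```

If memory turns out to be tight, sorting the list in place and comparing neighbouring entries is another option, at the cost of O(n log n) time.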

