Determining whether a file is a duplicate
- by Todd R
Is there a reliable way to determine whether or not two files are the same? For example, two files with the same size and type may or may not be the same binarily (yeah, I know it's not really a word), that is, identical byte for byte. I assume that comparing one or two checksums of the files will help, but I wonder:
How reliable are checksums at determining whether two files are different; what are the chances of two different files having the same checksum?
Would reliability increase by applying additional checksum comparisons?
Which checksum algorithm(s) would be the most efficient and/or reliable?
Any ideas, suggestions or thoughts are appreciated!
P.S. The code for this is being written in Java running on a *nix system, but generic or platform-agnostic input is most helpful.
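
To make the checksum idea concrete, here is a minimal Java sketch of the kind of check being asked about. It is only an illustration under stated assumptions: SHA-256 is an assumed algorithm choice (the question itself leaves the algorithm open), and the names DuplicateCheck, sha256Hex and sameContent are hypothetical. It compares sizes first, then digests, and only then confirms with a byte-for-byte read, since a matching digest makes equality extremely likely but does not strictly prove it.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DuplicateCheck {

    // Hex-encoded SHA-256 digest of a file. The file is read in chunks,
    // so large files are never loaded into memory all at once.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns true only if both files have exactly the same bytes.
    static boolean sameContent(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        // Cheapest test first: files of different sizes cannot be identical.
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        // Different digests prove the contents differ.
        if (!sha256Hex(a).equals(sha256Hex(b))) {
            return false;
        }
        // Matching digests: confirm with a direct comparison to rule out a collision.
        try (InputStream inA = new BufferedInputStream(Files.newInputStream(a));
             InputStream inB = new BufferedInputStream(Files.newInputStream(b))) {
            int byteA;
            while ((byteA = inA.read()) != -1) {
                if (byteA != inB.read()) {
                    return false;
                }
            }
            return inB.read() == -1;
        }
    }
}
```

For a single pair of files the direct comparison alone would be enough; the digest step is shown because it is what the question is about, and it pays off mainly when many files are compared, since each file's digest can be computed once and reused.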