What is the fastest way to check if files are identical?
Posted
by ojblass
on Stack Overflow
Published on 2009-04-24T05:02:57Z
Indexed on
2010/05/07
10:28 UTC
If you have 1,000,000 source files, you suspect they are all the same, and you want to compare them, what is the current fastest method to compare those files? Assume they are Java files and that the platform where the comparison is done is not important. cksum is making me cry. When I say identical I mean ALL identical.
Update: I know about generating checksums. diff is laughable ... I want speed.
Update: Don't get stuck on the fact they are source files. Pretend for example you took a million runs of a program with very regulated output. You want to prove all 1,000,000 versions of the output are the same.
Update: What about reading the number of blocks in each file rather than the number of bytes, and immediately throwing out any file whose count differs? Is that faster than finding the number of bytes?
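To make the size pre-check concrete, here is a minimal sketch in Python (the platform is not important, so the language is an arbitrary choice). The function name `sizes_all_equal` is made up for illustration. On Unix-like systems a single `stat()` call returns both the byte count (`st_size`) and the block count (`st_blocks`), so asking for blocks instead of bytes should not save a system call. `st_blocks` is also not available on every platform, and equal block counts do not imply equal byte counts.

```python
import os


def sizes_all_equal(paths):
    """Return True if every file in paths has the same size in bytes.

    Hypothetical helper: a cheap pre-check, not a proof of identical content.
    """
    paths = iter(paths)
    try:
        first = next(paths)
    except StopIteration:
        return True  # no files at all: vacuously equal

    expected = os.stat(first).st_size
    # One stat() per file; all() stops at the first size that differs.
    return all(os.stat(p).st_size == expected for p in paths)
```

If this returns False the files cannot all be identical. If it returns True the contents still have to be compared.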
Update: Is this ANY different than the fastest way to compare two files?
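One way to frame the many-file case is as N-1 two-file comparisons against a single reference file, stopping at the first mismatch. The sketch below is only an illustration under stated assumptions: `all_identical` is a hypothetical name, and the reference file is assumed to fit comfortably in memory, which seems reasonable for source files but may not hold for large outputs.

```python
import os


def all_identical(paths):
    """Return True if every file in paths has exactly the same bytes.

    Hypothetical helper: compares each file against the first one and
    returns False as soon as any file differs.
    """
    paths = list(paths)
    if not paths:
        return True

    # Assumption: the reference file is small enough to hold in memory.
    with open(paths[0], "rb") as f:
        reference = f.read()

    for p in paths[1:]:
        # Cheap rejection first: a different size means different content.
        if os.stat(p).st_size != len(reference):
            return False
        with open(p, "rb") as f:
            if f.read() != reference:
                return False
    return True
```

Unlike a checksum pass, this compares raw bytes directly, so there is no hashing cost and no collision question. In the case where all files really are the same, every byte of every file still has to be read once.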
© Stack Overflow or respective owner