Find all duplicate files by md5 hash

Posted by Jamie Curran on Super User
Published on 2012-10-14T21:31:33Z Indexed on 2012/10/15 3:42 UTC

I'm trying to find all duplicate files based on their MD5 hash, with the results ordered by file size. So far I have this:

 find . -type f -print0 | xargs -0 -I "{}" sh -c 'md5sum "{}" |  cut -f1 -d " " | tr "\n" " "; du -h "{}"' | sort -h -k2 -r | uniq -w32 --all-repeated=separate

The output of this is:

1832348bb0c3b0b8a637a3eaf13d9f22 4.0K   ./picture.sh
1832348bb0c3b0b8a637a3eaf13d9f22 4.0K   ./picture2.sh
1832348bb0c3b0b8a637a3eaf13d9f22 4.0K   ./picture2.s

d41d8cd98f00b204e9800998ecf8427e 0      ./test(1).log

Is this the most efficient way?
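
One possible direction, offered as a sketch rather than a definitive answer: files of different sizes cannot be identical, so only files that share a size with at least one other file need to be hashed. The function name find_dupes is hypothetical, and the sketch assumes GNU find, sort, uniq and md5sum, plus file names that contain no newlines or tabs. Sizes are printed in bytes instead of the human-readable form that du -h gives.

```shell
# Hypothetical helper: list duplicate files under a directory (default: .),
# largest first, hashing only files whose size occurs more than once.
# Assumes GNU tools and file names without newlines or tabs.
find_dupes() {
    tab=$(printf '\t')

    # 1. Emit "size<TAB>path" for every regular file, largest first.
    find "${1:-.}" -type f -printf '%s\t%p\n' |
    sort -rn |
    # 2. Keep only lines whose size appears at least twice (equal sizes are adjacent).
    awk -F'\t' '
        function flush(   i) {
            if (n > 1) for (i = 0; i < n; i++) print buf[i]
            n = 0
        }
        $1 != prev { flush(); prev = $1 }
        { buf[n++] = $0 }
        END { flush() }
    ' |
    # 3. Hash the remaining candidates and print "hash size path".
    while IFS=$tab read -r size path; do
        hash=$(md5sum < "$path" | cut -d' ' -f1)
        printf '%s %s %s\n' "$hash" "$size" "$path"
    done |
    # 4. Sort by size (descending), then by hash, so equal hashes are adjacent
    #    before uniq compares the first 32 characters.
    sort -k2,2nr -k1,1 |
    uniq -w32 --all-repeated=separate
}
```

Calling find_dupes . would scan the current directory. Sorting on the size and then on the hash keeps identical files next to each other, which uniq needs in order to group them; sorting on the size alone does not guarantee that when several different files happen to have the same size.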

© Super User or respective owner
