Good way to find duplicate files?
Posted by OverTheRainbow on Stack Overflow
Published on 2010-04-01T07:42:22Z
Indexed on 2010/04/01 7:53 UTC
vb.net | duplicates
Hello
I don't know enough about VB.Net (2008, Express Edition) yet, so I wanted to ask whether there is a better way to find files with different names but the same contents, i.e. duplicates.
In the following code, I use GetFiles() to retrieve all the files in a given directory. For each file, I hash its contents with MD5 and check whether that hash is already stored in a dictionary. If it is, the file is a duplicate and I'll delete it; if not, I add the filename/hash pair to the dictionary for later:
'Needs Imports System.IO; StoreItem (a dictionary) and ReadFileMD5 are defined elsewhere
'Get all files from directory
For Each currfile As String In Directory.GetFiles("C:\MyFiles\", "File.*")
    'Hash the file once and reuse the result
    Dim currhash = ReadFileMD5(currfile)
    If StoreItem.ContainsValue(currhash) Then
        'Hash already stored as a value, i.e. duplicate -> delete it
    Else
        'Hash not yet found in dictionary -> add it
        StoreItem.Add(currfile, currhash)
    End If
Next
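Since ReadFileMD5 and StoreItem are not shown above, here is a minimal sketch of how the same idea could look as a self-contained module. Everything in it is an assumption for illustration: the ReadFileMD5 body is a hypothetical stand-in for the real helper, and FindDuplicates, seen and duplicates are made-up names. It differs from the loop above in one respect: the dictionary is keyed by the hash rather than by the filename, so the check becomes a ContainsKey lookup instead of a ContainsValue scan over every stored entry.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Security.Cryptography

Module DuplicateFinder
    'Hypothetical stand-in for the ReadFileMD5 helper used above:
    'returns the MD5 hash of a file's contents as a hex string
    Function ReadFileMD5(ByVal filePath As String) As String
        Using md5Hasher As MD5 = MD5.Create()
            Using fs As FileStream = File.OpenRead(filePath)
                Dim hashBytes As Byte() = md5Hasher.ComputeHash(fs)
                Return BitConverter.ToString(hashBytes).Replace("-", "")
            End Using
        End Using
    End Function

    'Returns the files whose hash matches a file seen earlier in the listing
    Function FindDuplicates(ByVal folder As String, ByVal pattern As String) As List(Of String)
        'Key = hash, value = first file seen with that hash
        Dim seen As New Dictionary(Of String, String)
        Dim duplicates As New List(Of String)
        For Each currfile As String In Directory.GetFiles(folder, pattern)
            'Hash each file exactly once
            Dim currhash As String = ReadFileMD5(currfile)
            If seen.ContainsKey(currhash) Then
                duplicates.Add(currfile)
            Else
                seen.Add(currhash, currfile)
            End If
        Next
        Return duplicates
    End Function

    Sub Main()
        'Only lists the candidates; nothing is deleted here
        For Each dup As String In FindDuplicates("C:\MyFiles\", "File.*")
            Console.WriteLine("Duplicate: " & dup)
        Next
    End Sub
End Module
```

The sketch only reports candidates. Two different files can in principle share an MD5 hash, so comparing the actual bytes of a candidate against the first file with that hash before calling File.Delete would be the cautious choice.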
Is this a good way to solve the issue of finding duplicates, or is there a better way I should know about?
Thank you.
© Stack Overflow or respective owner