C# HashSet<T>

Posted by Ben Griswold on Johnny Coder
Published on Tue, 22 Dec 2009 23:19:26 +0000

I hadn’t done much (read: anything) with the C# generic HashSet until I recently needed to produce a distinct collection.  As it turns out, HashSet<T> was the perfect tool.

As the following snippet demonstrates, this collection type offers a lot:

    // Using HashSet<T>:
    // http://www.albahari.com/nutshell/ch07.aspx
    // Requires: using System; using System.Collections.Generic;
    // Note: HashSet<T> does not guarantee enumeration order; the output shown is what you typically see.
    var letters = new HashSet<char>("the quick brown fox");

    Console.WriteLine(letters.Contains('t')); // True
    Console.WriteLine(letters.Contains('j')); // False

    foreach (char c in letters) Console.Write(c); // the quickbrownfx
    Console.WriteLine();

    letters = new HashSet<char>("the quick brown fox");
    letters.IntersectWith("aeiou");
    foreach (char c in letters) Console.Write(c); // euio
    Console.WriteLine();

    letters = new HashSet<char>("the quick brown fox");
    letters.ExceptWith("aeiou");
    foreach (char c in letters) Console.Write(c); // th qckbrwnfx
    Console.WriteLine();

    letters = new HashSet<char>("the quick brown fox");
    letters.SymmetricExceptWith("the lazy brown fox");
    foreach (char c in letters) Console.Write(c); // quicklazy
    Console.WriteLine();
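The snippet rebuilds letters before each operation because IntersectWith, ExceptWith and SymmetricExceptWith all modify the set in place. If you want to keep the original set, one option is to copy it first. Here is a minimal sketch of that idea; SetOpsDemo and Intersection are hypothetical names, not part of the framework.

```csharp
using System;
using System.Collections.Generic;

public static class SetOpsDemo
{
    // Hypothetical helper: returns the intersection without modifying either input.
    public static HashSet<char> Intersection(IEnumerable<char> first, IEnumerable<char> second)
    {
        var result = new HashSet<char>(first); // copy, so the caller's collection is untouched
        result.IntersectWith(second);          // mutates only the copy
        return result;
    }

    public static void Main()
    {
        var letters = new HashSet<char>("the quick brown fox");
        HashSet<char> vowels = Intersection(letters, "aeiou");

        Console.WriteLine(letters.Count);            // 16 (original set is unchanged)
        Console.WriteLine(vowels.Count);             // 4
        Console.WriteLine(vowels.SetEquals("euio")); // True
    }
}
```

SetEquals is used for the check because it ignores order, which HashSet<T> does not promise anyway.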

The MSDN documentation on HashSet<T> is a bit light, but if you search hard enough you can find some interesting information and benchmarks.

But back to that distinct list I needed…

    // MSDN Add
    // http://msdn.microsoft.com/en-us/library/bb353005.aspx
    // Requires: using System; using System.Collections.Generic;
    var employeeA = new Employee {Id = 1, Name = "Employee A"};
    var employeeB = new Employee {Id = 2, Name = "Employee B"};
    var employeeC = new Employee {Id = 3, Name = "Employee C"};
    var employeeD = new Employee {Id = 4, Name = "Employee D"};

    var naughty = new List<Employee> {employeeA};
    var nice = new List<Employee> {employeeB, employeeC};

    var employees = new HashSet<Employee>();
    naughty.ForEach(x => employees.Add(x));
    nice.ForEach(x => employees.Add(x));

    foreach (Employee e in employees) Console.WriteLine(e);
    // Prints Employee A, Employee B, Employee C (one per line)

The Add method returns true if the item was added and, you guessed it, false if an equal item is already in the collection.  I’m using List<T>.ForEach (a method on List<T>, not part of LINQ) to add every item to the employees HashSet, and any duplicates are quietly skipped.  It works really well.
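To make the return value of Add concrete, here is a small sketch that uses it to pick out repeated items. AddDemo and FindDuplicates are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class AddDemo
{
    // Hypothetical helper: returns the items that Add rejected because an equal item was already present.
    public static List<string> FindDuplicates(IEnumerable<string> items)
    {
        var seen = new HashSet<string>();
        var duplicates = new List<string>();
        foreach (string item in items)
        {
            // Add returns false when the set already contains an equal item.
            if (!seen.Add(item)) duplicates.Add(item);
        }
        return duplicates;
    }

    public static void Main()
    {
        var names = new[] { "Employee A", "Employee B", "Employee A" };
        foreach (string name in FindDuplicates(names)) Console.WriteLine(name); // Employee A
    }
}
```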

This is just a rough sample, but you may have noticed I’m using Employee, a reference type.  Most samples demonstrate the power of the HashSet with a collection of integers, which is kind of cheating.  Value types such as int already compare by value, so you don’t have to worry about defining your own equality members.  A class compares by reference unless you override Equals and GetHashCode, so with reference types, you do.
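A quick way to see the difference is to add two "equal" items to a set and check the count. This is only a sketch: PlainEmployee is a hypothetical class that deliberately has no equality members.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type with no Equals/GetHashCode overrides:
// HashSet<T> falls back to reference equality for it.
public class PlainEmployee
{
    public int Id { get; set; }
}

public static class EqualityDemo
{
    public static int CountDistinctPlain()
    {
        var set = new HashSet<PlainEmployee>();
        set.Add(new PlainEmployee { Id = 1 });
        set.Add(new PlainEmployee { Id = 1 }); // a different instance, so it is added as well
        return set.Count;
    }

    public static int CountDistinctInts()
    {
        var set = new HashSet<int>();
        set.Add(1);
        set.Add(1); // equal value, so Add returns false and nothing changes
        return set.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctPlain()); // 2
        Console.WriteLine(CountDistinctInts());  // 1
    }
}
```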

    using System;

    // Two employees are considered equal when their Ids match.
    internal class Employee : IEquatable<Employee>
    {
        public int Id { get; set; }
        public string Name { get; set; }

        public override string ToString()
        {
            return Name;
        }

        // Strongly typed equality; HashSet<T> picks this up through IEquatable<Employee>.
        public bool Equals(Employee other)
        {
            if (ReferenceEquals(null, other)) return false;
            if (ReferenceEquals(this, other)) return true;
            return other.Id == Id;
        }

        public override bool Equals(object obj)
        {
            if (ReferenceEquals(null, obj)) return false;
            if (ReferenceEquals(this, obj)) return true;
            if (obj.GetType() != typeof(Employee)) return false;
            return Equals((Employee) obj);
        }

        // Must agree with Equals. Don't change Id while the object is in a HashSet.
        public override int GetHashCode()
        {
            return Id;
        }

        public static bool operator ==(Employee left, Employee right)
        {
            return Equals(left, right);
        }

        public static bool operator !=(Employee left, Employee right)
        {
            return !Equals(left, right);
        }
    }

Fortunately, with ReSharper, it’s a snap. Click on the class name, press Alt+Insert and then follow the handy dialogs.
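If you can't (or would rather not) put equality members on the class itself, another option is to hand the HashSet an IEqualityComparer<T> through its constructor. A rough sketch follows; Worker, WorkerIdComparer and ComparerDemo are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a class you cannot modify.
public class Worker
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Treats two workers as equal when their Ids match.
public sealed class WorkerIdComparer : IEqualityComparer<Worker>
{
    public bool Equals(Worker x, Worker y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Worker obj)
    {
        return ReferenceEquals(obj, null) ? 0 : obj.Id;
    }
}

public static class ComparerDemo
{
    // Builds a set that uses the comparer instead of the type's own equality.
    public static int CountDistinct(IEnumerable<Worker> workers)
    {
        var set = new HashSet<Worker>(workers, new WorkerIdComparer());
        return set.Count;
    }

    public static void Main()
    {
        var workers = new[]
        {
            new Worker { Id = 1, Name = "Employee A" },
            new Worker { Id = 1, Name = "Employee A (copy)" },
            new Worker { Id = 2, Name = "Employee B" }
        };
        Console.WriteLine(CountDistinct(workers)); // 2
    }
}
```

The trade-off is that the equality rule lives with the collection rather than the type, so every set that needs it has to be given the comparer.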

That’s it. Try out the HashSet<T>. It’s good stuff.

© Johnny Coder or respective owner
