This article is meant to serve as a reference for a particular set of functions that may be present in code snippets found in subsequent articles.

Every so often we find ourselves tasked with writing a hash code generation algorithm for a particular type of object. This requirement typically arises when we’re authoring either a value type or an implementation of an interface that requires such functionality (e.g. IEqualityComparer<T>).

While the way hash codes end up being calculated tends to differ between object types, hash code generation mechanisms can only be considered proper if the following requisites are met:

  1. If two objects are deemed equal, then the hash code generation mechanism should yield an identical value for each object.
  2. Given an instance of an object, its hash code should never change (thus using mutable objects as hash keys and actually mutating them is generally a bad idea).
  3. Although it is acceptable for the same hash code to be generated for object instances which are not equal, the hash code should be calculated in a way that keeps these kinds of collisions as infrequent as possible.
  4. Calculation of the hash code should not be an expensive endeavor.
  5. The generation mechanism should never throw an exception.
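
To make requirements #1 and #2 concrete, here is a minimal sketch built around a hypothetical immutable Point type (the type and its members are assumptions made for illustration, not code from this site). Equals and GetHashCode draw on the same read-only fields, so equal instances always hash alike and a given instance's hash code cannot change after construction.

```csharp
using System;

// Hypothetical example type; its name and members are assumptions for illustration.
public sealed class Point : IEquatable<Point>
{
    // Read-only fields: the hash code cannot change once the object is built (#2).
    private readonly int _x;
    private readonly int _y;

    public Point(int x, int y)
    {
        _x = x;
        _y = y;
    }

    public int X { get { return _x; } }
    public int Y { get { return _y; } }

    public bool Equals(Point other)
    {
        return !ReferenceEquals(other, null) && _x == other._x && _y == other._y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Point);
    }

    public override int GetHashCode()
    {
        // Uses exactly the fields that Equals compares, so equal objects
        // always produce the same hash code (#1).
        unchecked
        {
            return (17 * 23 + _x) * 23 + _y;
        }
    }
}
```

Two Point instances built from the same coordinates compare equal and return the same hash code, which is what hash-based collections such as Dictionary<TKey, TValue> rely on.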

Naturally, it makes sense to abstract the steps involved in satisfying such requirements into an independent function which can be used by any kind of object. Unfortunately, for the most part, it simply isn’t possible to guarantee the satisfaction of points #1 and #2 with common code. We can, however, build something that contributes towards the success of #3.

To do this, we can create a set of functions that calculate a hash code based on the property values provided to them, with each function differing in the number of properties that it accepts. A static helper class could serve as a home for these functions, but since I like to avoid static helper classes whenever possible, I thought it prudent to craft them as extension methods instead, such as the one shown below:

GetHashCode<TFirstProperty,TSecondProperty>
/// <summary>
/// Calculates the hash code for an object using two of the object's properties.
/// </summary>
/// <param name="value">The object we're creating the hash code for.</param>
/// <typeparam name="TFirstProperty">The type of the first property the hash is based on.</typeparam>
/// <typeparam name="TSecondProperty">The type of the second property the hash is based on.</typeparam>
/// <param name="firstProperty">The first property of the object to use in calculating the hash.</param>
/// <param name="secondProperty">The second property of the object to use in calculating the hash.</param>
/// <returns>
/// A hash code for <c>value</c> based on the values of the provided properties.
/// </returns>
// ReSharper disable UnusedParameter.Global
public static int GetHashCode<TFirstProperty,TSecondProperty>([NotNull]this object value,
                                                              TFirstProperty firstProperty,
                                                              TSecondProperty secondProperty)
{
    unchecked
    {
        int hash = 17;

        if (!firstProperty.IsNull())
            hash = hash * 23 + firstProperty.GetHashCode();
        if (!secondProperty.IsNull())
            hash = hash * 23 + secondProperty.GetHashCode();

        return hash;
    }
}

Note: IsNull is another extension method that properly deals with checking if an instance is null when we don’t know whether we’re dealing with a value or reference type.
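
The implementation of IsNull is not shown in this article, so the following is only a guess at what such a helper might look like. The class name NullCheckExtensions is an assumption, and the real method may well differ.

```csharp
using System;

// Assumed container class; the author's actual helper is not shown in the article.
public static class NullCheckExtensions
{
    // Sketch of an IsNull helper for a value whose type may be a value type
    // or a reference type.
    public static bool IsNull<T>(this T value)
    {
        // Comparing an unconstrained T against null is legal in C#. The result is
        // always false for non-nullable value types, and true for null references
        // and for Nullable<T> instances that have no value.
        return value == null;
    }
}
```

With this in place, 5.IsNull() is false, while a null string or an empty int? reports true.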

We’re using two prime numbers to aid in the effort of reducing collisions: 17 as the initial seed and 23 as the multiplier applied before each property’s hash is added. It can be argued that they are not optimal; however, that would most likely end up being a weak argument, and such a discussion is not even in the scope of this article. The values originate from other sources out there that address the issue of collisions (e.g. stackoverflow.com). We have the code inside an unchecked block so that overflows wrap around instead of resulting in exceptions.
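
As a small illustration of why the unchecked block matters, the sketch below runs the same multiply-and-add step in both contexts. The OverflowDemo class is purely illustrative. C# compiles arithmetic as unchecked by default, but a project can switch that default with the checked compiler option, so stating the context explicitly keeps the method exception-free either way.

```csharp
using System;

// Illustrative demo class; not part of the hashing helpers themselves.
public static class OverflowDemo
{
    public static void Main()
    {
        // Start from the largest int so the next step is guaranteed to overflow.
        int hash = int.MaxValue;

        // In an unchecked context the result silently wraps around.
        int wrapped = unchecked(hash * 23 + 1);
        Console.WriteLine(wrapped);

        try
        {
            // The same arithmetic in a checked context throws instead.
            int result = checked(hash * 23 + 1);
            Console.WriteLine(result);
        }
        catch (OverflowException)
        {
            Console.WriteLine("Overflow detected in checked context.");
        }
    }
}
```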

Again, this could easily be placed into a helper class instead; regardless, I wanted to tap into the portability offered by extension methods. Also, given that all objects have a GetHashCode method, I don’t view the extension of the object type in this manner as harmful, even if we aren’t actually using the source object itself.
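
For a sense of how a call site might look, here is a hedged sketch that uses the two-property method from a GetHashCode override. The Customer type and the HashExtensions class name are hypothetical, the IsNull body is an assumed stand-in for the real helper, and the [NotNull] annotation is left out so the sample has no external dependencies.

```csharp
using System;

public static class HashExtensions
{
    // Stand-in for the author's IsNull helper; this implementation is an assumption.
    public static bool IsNull<T>(this T value)
    {
        return value == null;
    }

    // The two-property method from the article, reproduced without [NotNull].
    public static int GetHashCode<TFirstProperty, TSecondProperty>(this object value,
                                                                   TFirstProperty firstProperty,
                                                                   TSecondProperty secondProperty)
    {
        unchecked
        {
            int hash = 17;

            if (!firstProperty.IsNull())
                hash = hash * 23 + firstProperty.GetHashCode();
            if (!secondProperty.IsNull())
                hash = hash * 23 + secondProperty.GetHashCode();

            return hash;
        }
    }
}

// Hypothetical type used only to show the call site.
public sealed class Customer
{
    private readonly string _name;
    private readonly int _id;

    public Customer(string name, int id)
    {
        _name = name;
        _id = id;
    }

    public override bool Equals(object obj)
    {
        Customer other = obj as Customer;
        return !ReferenceEquals(other, null)
            && _id == other._id
            && string.Equals(_name, other._name);
    }

    public override int GetHashCode()
    {
        // No instance overload of GetHashCode takes two arguments, so the
        // compiler binds this call to the extension method above.
        return this.GetHashCode(_name, _id);
    }
}
```

The explicit this. prefix is required, since extension methods are only considered when the call has a receiver expression. Note that when the receiver is a struct, calling an extension method on object boxes the value.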

I have six of these methods, with the number of properties accepted ranging from one to six. All of this is neither earth-shattering nor rocket science. As I stated at the beginning, I’m mainly sharing this with you, the reader, so I can refer to this if questions arise from their usage in articles subsequent to this one.

Matt Weber

I'm the Senior Software Architect at Emergingsoft, where I lead the software development team. I am also the owner of this website. I enjoy well-designed code, independent thought, and the application of rationality in general. You can reach me at matt@badecho.com.

© 2012-2013 Matt Weber. All Rights Reserved. Terms of Use.