In a previous article I wrote, I introduced a way to compare expressions on the basis of their makeup as opposed to simple reference equality checks. This functionality was provided in the form of an IEqualityComparer<T> implementation, which itself was supported by several components.
Like all equality comparers, our expression equality comparer must be able to perform two functions:
- Determine the equality between two given expression instances
- Generate a hash code for a given expression instance
The previous article covered the first of these jobs; in this article, we’ll be covering the hash code generation aspect of the comparer.
Revisiting Our Equality Comparer
Just to make sure we’re all on board in regards to how and where our hash code generator will be used, let’s take a second to revisit the equality comparer presented in the previous article.
ExpressionEqualityComparer.cs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | /// <summary> /// Provides methods to compare <see cref="Expression"/> objects for equality. /// </summary> public sealed class ExpressionEqualityComparer : IEqualityComparer< Expression > { private static readonly ExpressionEqualityComparer _Instance = new ExpressionEqualityComparer(); /// <summary> /// Gets the default <see cref="ExpressionEqualityComparer"/> instance. /// </summary> public static ExpressionEqualityComparer Instance { get { return _Instance; } } /// <inheritdoc/> public bool Equals( Expression x, Expression y) { return new ExpressionComparison(x, y).ExpressionsAreEqual; } /// <inheritdoc/> public int GetHashCode( Expression obj) { return new ExpressionHashCodeCalculator(obj).Output; } } |
As we can see above, the brunt of the equality comparison work is performed by the ExpressionComparison class, which is a type that we covered in the first article in this series.
If you look at the code for the ExpressionComparison type, you’ll see that it is a derivative of the .NET provided ExpressionVisitor class. The reason why ExpressionComparison was subclassed from ExpressionVisitor is because hooking into that infrastructure is logically congruent with the structure of the expressions it would be comparing.
In order to take into account all the various types of expressions and their properties, we needed to override the majority of the virtual (Visit[x]) methods exposed by ExpressionVisitor. We did not have to override all of them, only the ones targeting expression types whose makeup was unique among all other types.
Just like we did with ExpressionComparison, our ExpressionHashCodeCalculator will also subclass ExpressionVisitor, and it will behave in much the same way, except it will be calculate a running total of the hash codes of all the various properties significant to the given type of expression.
Expression Hash Code Calculator
Now, let’s get into the meat of the matter. As usual, I need to state a disclaimer regarding the usage of the code before we go over it. Although I’ve personally tested all part of the code, I would encourage you to do so yourself before just plopping into what you’re doing.
This code has received a fair amount of scrutiny and is used in some important parts of a larger project I’ve been authoring. In fact, the hash code generation aspect of my comparer rests at the heart of the reasons why I started out on this endeavor (it was my wish to be able to use expressions as keys in a dictionary).
We’ll first take a look at the code for the ExpressionHashCodeCalculator type, and then discuss its merits afterwards.
ExpressionHashCodeCalculator.cs
Update: Thanks to Denis in the comments for pointing out that there was a lack of support for constant collections; code has now been updated to support constant collections.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 | /// <summary> /// Provides a visitor that calculates a hash code for an entire expression tree. /// This class cannot be inherited. /// </summary> public sealed class ExpressionHashCodeCalculator : ExpressionVisitor { /// <summary> /// Initializes a new instance of the <see cref="ExpressionHashCodeCalculator"/> class. /// </summary> /// <param name="expression">The expression tree to walk when calculating the has code.</param> public ExpressionHashCodeCalculator( Expression expression) { Visit(expression); } /// <summary> /// Gets the calculated hash code for the expression tree. /// </summary> public int Output { get ; private set ; } /// <summary> /// Calculates the hash code for the common <see cref="Expression"/> properties offered by the provided /// node before dispatching it to more specialized visit methods for further calculations. /// </summary> /// <inheritdoc/> public override Expression Visit( Expression node) { if ( null == node) return null ; Output += node.GetHashCode(node.NodeType, node. Type ); return base .Visit(node); } /// <inheritdoc/> protected override Expression VisitBinary(BinaryExpression node) { Output += node.GetHashCode(node.Method, node.IsLifted, node.IsLiftedToNull); return base .VisitBinary(node); } /// <inheritdoc/> protected override CatchBlock VisitCatchBlock(CatchBlock node) { Output += node.GetHashCode(node.Test); return base .VisitCatchBlock(node); } /// <inheritdoc/> protected override Expression VisitConstant(ConstantExpression node) { IEnumerable nodeSequence = node.Value as IEnumerable ; if ( null == nodeSequence) Output += node.GetHashCode(node.Value); else { foreach ( object item in nodeSequence) { Output += node.GetHashCode(item); } } return base .VisitConstant(node); } /// <inheritdoc/> protected override Expression VisitDebugInfo(DebugInfoExpression node) { Output += node.GetHashCode(node.Document, node.EndColumn, node.EndLine, node.IsClear, node.StartColumn, node.StartLine); return base .VisitDebugInfo(node); } /// <inheritdoc/> protected override Expression VisitDynamic(DynamicExpression node) { Output += node.GetHashCode(node.Binder, node.DelegateType); return base .VisitDynamic(node); } /// <inheritdoc/> protected override ElementInit VisitElementInit(ElementInit node) { Output += node.GetHashCode(node.AddMethod); return base .VisitElementInit(node); } /// <inheritdoc/> protected override Expression VisitGoto(GotoExpression node) { Output += node.GetHashCode(node.Kind); return base .VisitGoto(node); } /// <inheritdoc/> protected override Expression VisitIndex(IndexExpression node) { Output += node.GetHashCode(node.Indexer); return base .VisitIndex(node); } /// <inheritdoc/> protected override LabelTarget VisitLabelTarget(LabelTarget node) { Output += node.GetHashCode(node.Name, node. Type ); return base .VisitLabelTarget(node); } /// <inheritdoc/> protected override Expression VisitLambda<T>( Expression <T> node) { Output += node.GetHashCode(node.Name, node.ReturnType, node.TailCall); return base .VisitLambda(node); } /// <inheritdoc/> protected override Expression VisitMember( MemberExpression node) { Output += node.GetHashCode(node.Member); return base .VisitMember(node); } /// <inheritdoc/> protected override MemberBinding VisitMemberBinding(MemberBinding node) { Output += node.GetHashCode(node.BindingType, node.Member); return base .VisitMemberBinding(node); } /// <inheritdoc/> protected override Expression VisitMethodCall(MethodCallExpression node) { Output += node.GetHashCode(node.Method); return base .VisitMethodCall(node); } /// <inheritdoc/> protected override Expression VisitNew(NewExpression node) { Output += node.GetHashCode(node.Constructor); return base .VisitNew(node); } /// <inheritdoc/> protected override Expression VisitParameter(ParameterExpression node) { Output += node.GetHashCode(node.IsByRef); return base .VisitParameter(node); } /// <inheritdoc/> protected override Expression VisitSwitch(SwitchExpression node) { Output += node.GetHashCode(node.Comparison); return base .VisitSwitch(node); } /// <inheritdoc/> protected override Expression VisitTypeBinary(TypeBinaryExpression node) { Output += node.GetHashCode(node.TypeOperand); return base .VisitTypeBinary(node); } /// <inheritdoc/> protected override Expression VisitUnary(UnaryExpression node) { Output += node.GetHashCode(node.IsLifted, node.IsLiftedToNull, node.Method); return base .VisitUnary(node); } } |
As you can see from the above code, the general method for calculating the hash code of a given expression is by visiting all its manifestations and parts and then generating hash codes from the properties significant to those parts.
The GetHashCode methods being invoked in the code are not native to the types they are being invoked on. Rather, they are extension methods, and are something I talked about in another article I wrote.
Each part of the calculate tacks on the total value of the various parts to an Output property, which holds the running total for our hash code. Calculation will end upon our Visit override being executed with a null node being passed.
There are many virtual Visit[x] methods offered by the ExpressionVisitor type. As I stated in the first article of this series, there were in general a large number of new types of expressions added to the .NET framework with 4.0.
When creating our ExpressionComparison class, we overrode many of these methods, but only the ones that held some bearing on the actual shape of the expression as far as an equality check was concerned. The same applies to our hash code calculator; it visits many of the same parts as was visited by the equality comparer, with some differences, namely, some overrides found in ExpressionComparison are not found in ExpressionHashCodeCalculator, and vice versa.
The reasons for not overriding a particular virtual method typically boil down to a lack of properties offered from which to grab hash codes that wouldn’t be offered in another, more centrally invoked override.
For example, one virtual method not overridden is the VisitBlock, method. This method accepts a single BlockExpression typed parameter. If we look at the BlockExpression type, we’ll notice that it offers no additional properties unique to what’s offered by more base Expression types. Well…at least except for the Result property. But even the presence of this property is not cause enough for us to override the method, the reason being that the Result property itself (which is an Expression) is visited by the base VisitBlock implementation, and therefore would be end up being visited by another block of our code anyways.
There are a few more methods not included, but I’ll leave them as an exercise for the reader.
If anyone finds any types of expressions that the above code does not account for, I’d appreciate your input. When constructing this code, however, I tried to be fairly exhaustive in my efforts.