How to Override Equals and GetHashCode Methods in Base and Derived Classes

by Zoran Horvat

Introduction

The Equals method is intended to return true when it is supplied with another object that is semantically equal to the current instance. The GetHashCode method is intended to return an integer value which can be used as a hash code, i.e. a key that accompanies the object when it is stored in a hashed data structure. These two methods are connected: whenever Equals returns true for two objects, their GetHashCode methods must return the same value. The opposite does not have to be true.

To understand this concept, consider the way in which hash tables operate. Each object is first "hashed", i.e. accompanied by a number calculated from its contents, so when the contents change, the hash code is expected to change as well. The general idea behind algorithms that calculate hash codes is that similar objects should produce quite dissimilar hash codes, but that theory is beyond the scope of this text and will not be covered further. What matters here is that the number of possible hash codes is quite limited compared to the number of possible objects. Therefore, hash codes of some objects will occasionally collide, and it is the hash table's job to resolve the collisions: when two objects return the same hash code, the Equals method is called on one of them, with the other one supplied as the argument. If Equals returns true, the two objects are considered equal. Otherwise, the fact that they have the same hash code is considered an unfortunate coincidence and is ignored.
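The lookup procedure described above can be sketched with a deliberately tiny hashed set. This is only an illustration under stated assumptions: TinyHashSet is a hypothetical class, it uses a fixed number of buckets, and it expects non-null items. Real collections such as HashSet<T> are far more elaborate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, minimal hashed set used only to illustrate the idea.
public class TinyHashSet
{
    private readonly List<object>[] _buckets = new List<object>[8];

    public bool Contains(object item)
    {
        List<object> bucket = _buckets[BucketIndex(item)];
        if (bucket == null)
            return false;

        // Objects in the same bucket may have landed there by coincidence;
        // only Equals decides whether two of them are really equal.
        foreach (object candidate in bucket)
            if (candidate.Equals(item))
                return true;

        return false;
    }

    // Returns false when an equal item is already stored.
    public bool Add(object item)
    {
        if (Contains(item))
            return false;

        int index = BucketIndex(item);
        if (_buckets[index] == null)
            _buckets[index] = new List<object>();
        _buckets[index].Add(item);
        return true;
    }

    private int BucketIndex(object item)
    {
        // Clear the sign bit so that the remainder is never negative.
        return (item.GetHashCode() & 0x7FFFFFFF) % _buckets.Length;
    }
}
```

With eight buckets, the integers 1 and 9 end up in the same bucket, yet both can be added because Equals tells them apart. This also shows why equal objects must return equal hash codes: otherwise the search would start in the wrong bucket and never reach the Equals call.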

The second important concept to understand is which data from an object take part in the Equals and GetHashCode implementations. Cases in which all members of the object matter when testing equality are rare. Geometric points are one such example: all coordinates of a 3D point are important when testing whether two points are equal. The opposite cases, where some members of the object play no role in equality, are more frequent. For example, information about a registered user may be quite an impressive collection of addresses, phone numbers, favorite colors and movies, etc. But the only information that really matters is the username. Two objects that refer to the same username can be considered equal, no matter what other data were stored in one or the other. We are free to design a Merge method on such a class so that the resulting object contains all data from both of the source objects, but that still doesn't change the simple fact that two objects with the same username refer to the same actual registered user.
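The registered user example might look like the following sketch. The RegisteredUser class and its members are hypothetical names introduced only for illustration. The class is sealed on purpose, so that the questions raised by derived classes do not apply to it.

```csharp
using System;

// Hypothetical class: only Username identifies the user.
public sealed class RegisteredUser
{
    private readonly string _username;

    public RegisteredUser(string username)
    {
        if (username == null)
            throw new ArgumentNullException("username");
        _username = username;
    }

    public string Username { get { return _username; } }

    // Non-identifying data; ignored by Equals and GetHashCode.
    public string FavoriteColor { get; set; }

    public override bool Equals(object obj)
    {
        RegisteredUser other = obj as RegisteredUser;
        return !object.ReferenceEquals(other, null) &&
               other._username == _username;
    }

    public override int GetHashCode()
    {
        // Built from the same data that Equals compares.
        return _username.GetHashCode();
    }
}
```

Two instances created with the same username compare as equal and return the same hash code, even when their FavoriteColor values differ.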

Default Implementation

The default implementation of the Equals method for reference types simply compares object references, rather than their contents. Similarly, the default implementation of GetHashCode is based on object identity, not on contents, so two distinct instances holding identical data will normally return different hash codes. Consequently, once Equals is overridden to compare contents, the default GetHashCode makes no guarantee that two instances which compare as equal will return the same hash code. This means that the default implementations of Equals and GetHashCode are not appropriate for hashing scenarios in which objects should be matched by their contents (e.g. storing instances in a hash table or dictionary).
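A short sketch makes the default behavior visible. PlainPoint and DefaultEqualityDemo are hypothetical names; the class overrides neither Equals nor GetHashCode.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class that relies entirely on the default implementations.
public class PlainPoint
{
    public int X;
    public int Y;

    public PlainPoint(int x, int y) { X = x; Y = y; }
}

public static class DefaultEqualityDemo
{
    // Two distinct instances with identical contents are not equal by default.
    public static bool SameContentsAreEqual()
    {
        return new PlainPoint(1, 2).Equals(new PlainPoint(1, 2));
    }

    // An instance is equal to itself because the references match.
    public static bool SameReferenceIsEqual()
    {
        PlainPoint p = new PlainPoint(1, 2);
        PlainPoint alias = p;
        return p.Equals(alias);
    }

    // A hashed collection therefore keeps both "duplicates".
    public static int CountAfterAddingTwins()
    {
        HashSet<PlainPoint> set = new HashSet<PlainPoint>();
        set.Add(new PlainPoint(1, 2));
        set.Add(new PlainPoint(1, 2));
        return set.Count;
    }
}
```

SameContentsAreEqual returns false, SameReferenceIsEqual returns true, and CountAfterAddingTwins returns 2, because the set has no way to see that the two points carry the same coordinates.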

When there is a need to collect objects of a class into a hash table, the class is required to provide custom implementations of both the Equals and GetHashCode methods. The issue grows even larger when derived classes come into play. In this article we are going to discuss techniques that can be applied to implement equality and hash code functionality in base and derived classes, so that both can safely be used in equality tests and in collections that rely on hash codes.

Custom Implementation Goals

When reference types carry semantic meaning, their implementation should include overrides of the GetHashCode and Equals methods. Moreover, the == and != operators should be overloaded whenever a custom Equals method is supplied, simply because the default implementations of these operators for reference types work the same as the default Equals implementation: they compare references, not contents. A custom implementation provides the following benefits:

  • Correct semantic comparison between objects that are the same even if some of their non-identifying aspects differ.
  • Guarantee that equal objects return same hash codes and different objects will most likely return different hash codes.
  • Guarantee that hash code for an object is the same when code is run on different versions of .NET Framework – this guarantee is not made by .NET Framework itself. However, to really have this guarantee, custom code must not rely on GetHashCode implementations of contained objects (e.g. custom hash code must not include result returned by String.GetHashCode method called on a contained string object).
  • Consistent comparison behavior on Equals method and == and != operators.
  • Correct hash code and equality functionality inherited and extended in derived types.
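The third point above, a hash code that does not depend on String.GetHashCode, could be approached with a hand-written hash function. The sketch below is one possible way to do it, not a recommendation made by the original text: StableHash is a hypothetical helper, and the algorithm is a variant of 32-bit FNV-1a applied to UTF-16 characters instead of bytes.

```csharp
using System;

// Hypothetical helper; computes the same value on every runtime version
// because it depends only on the characters of the string.
public static class StableHash
{
    public static int OfString(string text)
    {
        if (text == null)
            throw new ArgumentNullException("text");

        unchecked
        {
            uint hash = 2166136261;      // FNV offset basis
            foreach (char c in text)
            {
                hash ^= c;               // mix in the next character
                hash *= 16777619;        // FNV prime; overflow wraps around
            }
            return (int)hash;
        }
    }
}
```

A class holding a string field could then combine StableHash.OfString(field) with the hash codes of its other significant fields instead of calling GetHashCode on the string.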

Custom Implementation Details

In this section we will consider practices to apply when overriding the Equals and GetHashCode methods in derivable classes. These are the general rules for the Equals method; the first three are the defining properties of an equivalence relation in mathematics:

  • Reflexivity: x.Equals(x) returns true.
  • Symmetry: x.Equals(y) returns the same result as y.Equals(x).
  • Transitivity: If x.Equals(y) returns true and y.Equals(z) returns true, then x.Equals(z) must return true.
  • Inequality to null: x.Equals(null) returns false.
  • Equality and inequality operators: Operator x == y returns the same value as x.Equals(y); operator x != y returns opposite value from x.Equals(y).
  • Derived classes: x.Equals(y) takes into account whether y is an instance of a class derived from the class of x, and whether the derived class adds fields that affect equality comparison.

The last rule often causes headaches. To understand its implications, we must dig deeper into the problem. Consider one simple class hierarchy:

[Class diagram: PriceTag and its derived class CurrencyPriceTag]

The base class PriceTag exposes the Price property, which is used to decide on object equality: two instances of the PriceTag class are considered equal if and only if their Price properties return the same value. The derived class CurrencyPriceTag adds another significant property: Currency. Two CurrencyPriceTag instances are equal if and only if both the inherited Price and the added Currency properties return the same values. Now, testing the Price property for equality is not something that the derived class should bother with, because that is what the PriceTag.Equals method does. So equality comparison between two CurrencyPriceTag objects boils down to ensuring that the base class's Equals method returns true (meaning that the inherited parts of the objects are equal, whatever that means) and that the Currency properties are equal.

But now comes the hard part. What should be the result of an equality comparison between PriceTag and CurrencyPriceTag instances? The PriceTag.Equals method accepts System.Object, so we are free to pass it an instance of the derived CurrencyPriceTag class. Consider the following piece of code:

PriceTag pt = new PriceTag();
CurrencyPriceTag cpt = new CurrencyPriceTag();

pt.Price = 19.2M;

cpt.Price = 19.2M;
cpt.Currency = "GBP";

bool equal1 = pt.Equals(cpt);
bool equal2 = cpt.Equals(pt);

The two objects really do have the same Price property values, so a base class implementation of Equals that compares only prices will return true. The first equality test would thus return true. However, the second equality test, in which the derived class tests the equality, is in trouble: the object passed as the argument doesn't even have the Currency property! Consequently, the second equality test fails and returns false. In the blink of an eye, we have broken the symmetry principle, which must be obeyed by any implementation of Equals that embodies the mathematical equality relation. The way out of this problem is to make both methods return false. PriceTag and CurrencyPriceTag objects cannot be equal because PriceTag has no notion of currency and consequently cannot possibly be equal to an object which does track currency.
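The broken symmetry can be reproduced with a naive pair of classes. The sketch below is only an illustration of the problem, not part of the solution: NaivePriceTag and NaiveCurrencyPriceTag are hypothetical names, and each Equals method simply casts its argument and compares the fields it knows about.

```csharp
using System;

// Hypothetical base class: accepts any object that is a NaivePriceTag.
public class NaivePriceTag
{
    public decimal Price { get; set; }

    public override bool Equals(object obj)
    {
        NaivePriceTag other = obj as NaivePriceTag;
        return other != null && other.Price == Price;
    }

    public override int GetHashCode()
    {
        return Price.GetHashCode();
    }
}

// Hypothetical derived class: requires the argument to carry a currency.
public class NaiveCurrencyPriceTag : NaivePriceTag
{
    public string Currency { get; set; }

    public override bool Equals(object obj)
    {
        NaiveCurrencyPriceTag other = obj as NaiveCurrencyPriceTag;
        return other != null && base.Equals(other) && other.Currency == Currency;
    }

    public override int GetHashCode()
    {
        return base.GetHashCode();
    }
}
```

With a NaivePriceTag priced at 19.2M and a NaiveCurrencyPriceTag priced at 19.2M in GBP, the base object's Equals returns true while the derived object's Equals returns false, so x.Equals(y) and y.Equals(x) disagree.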

But let's stir the water a little more by adding another derived class: call it DiscountedPriceTag. This class will add a property PrevPrice, which gets or sets price before the discount was applied (this price will hopefully be larger than Price returned by the base class implementation). Anyway, PrevPrice plays no role in equality comparison (neither does it take part in hash code calculation) – we will still consider two discounted prices equal as long as their final prices, which are maintained by the base class, are equal. Here is the new class hierarchy:

[Class diagram: PriceTag with derived classes CurrencyPriceTag and DiscountedPriceTag]

Now consider a comparison between PriceTag and DiscountedPriceTag instances. Since the derived class adds nothing to the list of significant members, these objects could be considered equal as long as their Price properties are equal. This conclusion is opposite to the one made when CurrencyPriceTag objects were compared against PriceTag objects. The obvious cause for the change in comparison logic is the fact that the DiscountedPriceTag derived class has added nothing of importance to the base class. But there is a catch in this logic: PriceTag is in no position to say that nothing was added to the equality test. It is the derived type's job to know whether it has extended the equality test or kept it the same as it was in the base class!

This leads to the following problem: what should be the answer when a CurrencyPriceTag instance is compared against a DiscountedPriceTag instance? Well, the straight answer is: we don't know. The result is unknown because these two classes do not derive from each other and consequently know nothing about each other. When comparing a base class instance with a derived class instance, we have at least one party which is completely informed about the fields to compare, and that is the derived class. Siblings, on the other hand, have no communication between themselves. Whether one or both of them have added significant fields that must be tested as part of the equality test is not known to either of them. The net result is that the Equals method should return false when the compared objects do not derive from each other. Even if neither of them has changed the comparison logic, there is no simple way for them to arrive at such a conclusion.

Final Solution

The main problem identified in the previous section is related to derived classes. Each derived class may or may not add significant properties that it takes into account when comparing with other objects for equality. These are the typical calling scenarios for the Equals method:

  • Base object's Equals called with base object passed as parameter – Since both the called object and the argument are of the same class, there can be no misunderstanding. Equals method can be executed and equality result returned.
  • Base object's Equals called with derived object passed as parameter – Base object does not know whether derived object has added significant properties to its definition. Comparison cannot be performed and method should return false, indicating that objects are different by belonging to different classes.
  • Derived object's Equals called with base object passed as parameter – This is opposite to the previous case. Derived class actually knows whether it has added significant properties that were not part of the base class definition and, technically, it could perform the comparison in cases when all significant properties are part of both classes. However, since symmetry principle must be preserved (x.Equals(y) must return the same value as y.Equals(x)), and base class would simply return false, derived class must return false as well.
  • Derived object's Equals called with sibling object passed as parameter – The two classes have a common ancestor, but do not have any indication whether one or both have added significant properties that affect equality testing. The only way out of the problem is for both of them to return false when their Equals methods are called.

After this comprehensive analysis, we come to a very simple conclusion. The only safe equality comparison is the one between objects of the same class. No other comparison is acceptable for the Equals method. Attempting to discover the contents of a derived object from a base object is technically possible, but too complex to be useful. Instead, in class hierarchies where derived classes may affect equality tests, it might be better to implement the IComparable interface and its CompareTo method, and let that method line up the compared objects as well as it can before reading their contents.
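The same-class rule can also be sketched in a compact form. Note that this is an alternative variant and not the implementation used in this article: Tag, CurrencyTag and DiscountedTag are hypothetical names, and the type test compares obj.GetType() with this.GetType(), which lets a derived class that adds no significant fields inherit Equals unchanged.

```csharp
using System;

public class Tag
{
    public decimal Price { get; set; }

    public override bool Equals(object obj)
    {
        // Reject null and any object whose run-time type differs from ours,
        // whether it is a base, derived, or sibling class.
        if (object.ReferenceEquals(obj, null) || obj.GetType() != this.GetType())
            return false;

        return ((Tag)obj).Price == Price;
    }

    public override int GetHashCode()
    {
        return Price.GetHashCode();
    }
}

public class CurrencyTag : Tag
{
    public string Currency { get; set; }

    public override bool Equals(object obj)
    {
        // base.Equals has already verified that obj has our exact type,
        // so the cast below cannot fail.
        return base.Equals(obj) && ((CurrencyTag)obj).Currency == Currency;
    }

    public override int GetHashCode()
    {
        return base.GetHashCode() ^ (Currency == null ? 0 : Currency.GetHashCode());
    }
}

// Adds nothing significant, so it inherits Equals and GetHashCode as they are.
public class DiscountedTag : Tag
{
    public decimal PrevPrice { get; set; }
}
```

In this sketch two DiscountedTag objects with the same Price are equal regardless of PrevPrice, while any comparison across classes returns false in both directions, which preserves symmetry.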

Below is the final implementation for our three-class hierarchy.

[Class diagram: final PriceTag, CurrencyPriceTag and DiscountedPriceTag classes]

In this solution, the Equals methods require that both the called object and the argument be of exactly the same type. However, testing equality in derived classes requires a call to the base implementation. To keep the base implementation from simply returning false due to the type mismatch, we are adding a somewhat less stringent equality test embodied in the protected virtual EqualsDerived method. This method allows comparison between base and derived class instances. The source code says more than words, so here is the full implementation of all three classes:

using System;

namespace EqualsDemo
{

    public class PriceTag
    {

        public PriceTag() { }

        public PriceTag(decimal price) { Price = price; }

        public decimal Price
        {
            get
            {
                return _price;
            }
            set
            {
                if (value >= 0)
                    _price = value;
            }
        }

        public override int GetHashCode()
        {
            return _price.GetHashCode();
        }

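        public override bool Equals(object obj)
        {
            // Compare significant fields first (this also rejects null),
            // then insist that the argument is exactly a PriceTag.
            // Every derived class must override Equals with its own type test.
            return EqualsDerived(obj) && obj.GetType() == typeof(PriceTag);
        }

        // Relaxed comparison for use by derived classes: accepts any object
        // that is a PriceTag or derives from it, and compares prices only.
        protected virtual bool EqualsDerived(object obj)
        {
            return !object.ReferenceEquals(obj, null) &&
                        obj is PriceTag &&
                        ((PriceTag)obj)._price == this._price;
        }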

        public static bool operator ==(PriceTag pt1, PriceTag pt2)
        {

            if (object.ReferenceEquals(pt1, pt2))
                return true;

            if (!object.ReferenceEquals(pt1, null) &&
                !object.ReferenceEquals(pt2, null) &&
                pt1.Equals(pt2))
                return true;

            return false;

        }

        public static bool operator !=(PriceTag pt1, PriceTag pt2)
        {
            return !(pt1 == pt2);
        }

        public override string ToString()
        {
            return string.Format("PriceTag {0}", _price);
        }

        private decimal _price;

    }

    public class CurrencyPriceTag: PriceTag
    {

        public CurrencyPriceTag() { }

        public CurrencyPriceTag(decimal price, string currency)
            : base(price)
        {
            Currency = currency;
        }

        public CurrencyPriceTag(PriceTag pt)
            : base(pt.Price)
        {
        }

        public string Currency
        {
            get
            {
                return _currency;
            }
            set
            {
                if (!string.IsNullOrEmpty(value))
                    _currency = value;
            }
        }

        public override int GetHashCode()
        {
            return base.GetHashCode() ^ _currency.GetHashCode();
        }

        public override bool Equals(object obj)
        {
            return EqualsDerived(obj) && obj.GetType() == typeof(CurrencyPriceTag);
        }

        // Extends the base comparison with the field added by this class.
        protected override bool EqualsDerived(object obj)
        {
            return base.EqualsDerived(obj) &&
                        !object.ReferenceEquals(obj, null) &&
                        obj is CurrencyPriceTag &&
                        ((CurrencyPriceTag)obj)._currency == this._currency;
        }

        public override string ToString()
        {
            return string.Format("CurrencyPriceTag {0} {1}", Price, Currency);
        }

        public static bool operator ==(CurrencyPriceTag cp1, CurrencyPriceTag cp2)
        {

            if (object.ReferenceEquals(cp1, cp2))
                return true;

            if (!object.ReferenceEquals(cp1, null) &&
                !object.ReferenceEquals(cp2, null) &&
                cp1.Equals(cp2))
                return true;

            return false;

        }

        public static bool operator !=(CurrencyPriceTag cp1, CurrencyPriceTag cp2)
        {
            return !(cp1 == cp2);
        }

        private string _currency = "GBP";

    }

    public class DiscountedPriceTag : PriceTag
    {

        public DiscountedPriceTag() { }

        public DiscountedPriceTag(decimal price, decimal prevPrice)
            : base(price)
        {
            PrevPrice = prevPrice;
        }

        public decimal PrevPrice
        {
            get
            {
                return _prevPrice;
            }
            set
            {
                if (value >= 0)
                    _prevPrice = value;
            }
        }

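        // PrevPrice is not significant, so EqualsDerived is inherited as is;
        // only the exact type test differs from the base class.
        public override bool Equals(object obj)
        {
            return EqualsDerived(obj) && obj.GetType() == typeof(DiscountedPriceTag);
        }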

        public override int GetHashCode()
        {
            return base.GetHashCode();
        }

        public static bool operator ==(DiscountedPriceTag dp1, DiscountedPriceTag dp2)
        {

            if (object.ReferenceEquals(dp1, dp2))
                return true;

            if (!object.ReferenceEquals(dp1, null) &&
                !object.ReferenceEquals(dp2, null) &&
                dp1.Equals(dp2))
                return true;

            return false;

        }

        public static bool operator !=(DiscountedPriceTag dp1, DiscountedPriceTag dp2)
        {
            return !(dp1 == dp2);
        }

        public override string ToString()
        {
            return string.Format("DiscountedPriceTag {0}", Price);
        }

        private decimal _prevPrice;

    }

    public class Program
    {

        static void Main(string[] args)
        {

            PriceTag[] pt = new PriceTag[]
            {
                new PriceTag(17.6M),
                new PriceTag(17.7M),
                new PriceTag(17.6M),
                new CurrencyPriceTag(17.6M, "USD"),
                new CurrencyPriceTag(17.7M, "USD"),
                new CurrencyPriceTag(17.6M, "GBP"),
                new CurrencyPriceTag(17.6M, "USD"),
                new DiscountedPriceTag(17.6M, 9.2M),
                new DiscountedPriceTag(17.7M, 9.2M),
                new DiscountedPriceTag(17.6M, 9.2M)
            };

            System.Collections.Generic.HashSet<PriceTag> hash =
                new System.Collections.Generic.HashSet<PriceTag>();

            for (int i = 0; i < pt.Length; i++)
            {
                Console.Write("{0,30}", pt[i]);
                if (hash.Contains(pt[i]))
                {
                    Console.WriteLine(" DUPLICATE");
                }
                else
                {
                    Console.WriteLine();
                    hash.Add(pt[i]);
                }
            }

            Console.WriteLine();
            Console.Write("Press ENTER to continue... ");
            Console.ReadLine();

        }
    }
}

Observe closely the way in which the Equals and EqualsDerived methods have been implemented. With this solution, only objects of the same class can ever compare as equal. Here is the output produced by this code:

            
                 PriceTag 17.6
                 PriceTag 17.7
                 PriceTag 17.6 DUPLICATE
     CurrencyPriceTag 17.6 USD
     CurrencyPriceTag 17.7 USD
     CurrencyPriceTag 17.6 GBP
     CurrencyPriceTag 17.6 USD DUPLICATE
       DiscountedPriceTag 17.6
       DiscountedPriceTag 17.7
       DiscountedPriceTag 17.6 DUPLICATE

Press ENTER to continue...

Conclusion

In this article we have discussed issues that arise when a class with custom implementations of the GetHashCode and Equals methods is extended in derived classes. The basic idea behind the solution is to keep it simple and stick to the mathematical principles that define the equality relation. Since derived classes introduce uncertainty about their actual content, base objects should be strictly forbidden from making claims regarding their equality to derived objects. And due to the symmetry principle of the equality relation, this must work the same in the opposite direction: derived objects must not make claims regarding equality to base objects, even when they are able to perform the comparison.



About

Zoran Horvat

Zoran Horvat is the Principal Consultant at Coding Helmet, speaker and author of 100+ articles, and independent trainer on .NET technology stack. He can often be found speaking at conferences and user groups, promoting object-oriented and functional development style and clean coding practices and techniques that improve longevity of complex business applications.
