Here is Why Calling a Virtual Function from the Constructor is a Bad Idea

by Zoran Horvat

In this article, we will discuss a common mistake: calling a virtual function from inside a constructor. You will see how that call can turn into a real bug, and then you will learn how to design a class so that it never needs to call a virtual function from its constructor.

Reading Uninitialized State

Why is it so bad to call a virtual function from a constructor? How can that turn into a bug, and what can we do to avoid making that mistake?

We will start from the base class, which has some state on it, initialized through the constructor. There is nothing special so far. From that class, we are deriving another class, and initializing the inherited state through its own constructor by delegating base state initialization to the base class’s constructor.

class Base
{
  public int State { get; }

  public Base(int state)
  {
    this.State = state;
  }
}

class Derived : Base
{
  public Derived(int state)
    : base(state) { }
}

Base a = new Base(3);
Base b = new Derived(4);

After this point, the example will become more complex. We might add more state to the base class, but state that cannot simply be assigned from an argument. It must be calculated.

class Base
{
  public int State { get; }
  public int Calculated { get; }

  public Base(int state)
  {
    this.Calculated = this.Calculate();
    this.State = state;
  }

  protected virtual int Calculate() => 7;
}

class Derived : Base { ... }

The base class allows the derived class to calculate this state differently, so the function that performs the calculation is a protected virtual method. The base class calls this method from its constructor.

That leaves it to the derived class to override the calculation with whatever it wants. And here the troubles begin, because the derived class might want to use the state of the object to calculate the value.

class Base
{
  public int State { get; }
  public int Calculated { get; }

  public Base(int state)
  {
    this.Calculated = this.Calculate();
    this.State = state;
  }

  protected virtual int Calculate() => 7;
}

class Derived : Base
{
  public Derived(int state)
    : base(state) { }

  protected override int Calculate() =>
    base.State * 2;
}

Base a = new Base(3);    // a.Calculated = 7
Base b = new Derived(4); // b.Calculated = 0

You see already where this is heading - the derived class is using the uninitialized state. It is using the state before it was initialized in the base class.
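To make the order of events visible, here is a minimal sketch that records each step of the construction. The names TracedBase, TracedDerived and Log are hypothetical and exist only for this illustration; the logic is the same as in the classes above.

```csharp
using System;
using System.Collections.Generic;

// Construct the derived object, then print the steps in the order they ran.
var b = new TracedDerived(4);
foreach (string step in TracedBase.Log)
    Console.WriteLine(step);
Console.WriteLine($"Calculated = {b.Calculated}");

class TracedBase
{
    // Hypothetical shared log, used only to record the order of execution.
    public static readonly List<string> Log = new();

    public int State { get; }
    public int Calculated { get; }

    public TracedBase(int state)
    {
        Log.Add("Base constructor: calling Calculate()");
        this.Calculated = this.Calculate();   // dispatches to the override
        this.State = state;
        Log.Add($"Base constructor: State set to {this.State}");
    }

    protected virtual int Calculate() => 7;
}

class TracedDerived : TracedBase
{
    public TracedDerived(int state)
        : base(state)
    {
        Log.Add("Derived constructor body runs");
    }

    protected override int Calculate()
    {
        // State has not been assigned yet, so it still holds default(int).
        Log.Add($"Derived.Calculate(): State is {base.State}");
        return base.State * 2;
    }
}
```

The log shows the override running in the middle of the base constructor, where it reads State as 0, and only then does the base constructor set State to 4. The derived constructor body runs last, and Calculated ends up as 0.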

Causing Unintended Null Dereferencing

This situation can get even worse. We could add a dependency, an object of a reference type, and initialize it through the constructor in both classes.

Then the derived class wants to use the dependency in its calculation as well. It dereferences the dependency, but the reference has not been initialized yet, so it is still null at the point where the derived class uses it.

class Base
{
  protected object Dependency { get; }  // protected, so that Derived can read it
  public int State { get; }
  public int Calculated { get; }

  public Base(object dependency, int state)
  {
    this.Calculated = this.Calculate();
    this.Dependency = dependency;
    this.State = state;
  }

  protected virtual int Calculate() => 7;
}

class Derived : Base
{
  public Derived(object dependency, int state)
    : base(dependency, state) { }

  protected override int Calculate() =>
    base.Dependency.GetHashCode() % base.State;
}

Base a = new Base(new object(), 3);
Base b = new Derived(new object(), 4); // NullReferenceException

Even though we are passing an object, which is obviously not a null reference, a NullReferenceException will still be thrown, because the derived class accesses the dependency before it has been initialized in the base. And even if we survived dereferencing null, a division by zero was waiting just around the corner, because State is still zero at that point.
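The second failure can be observed in isolation with a small sketch. It assumes a hypothetical reordering in which the base constructor assigns the dependency first, but still makes the virtual call before assigning State. The names ReorderedBase and ReorderedDerived are made up for this illustration.

```csharp
using System;

try
{
    _ = new ReorderedDerived(new object(), 4);
}
catch (DivideByZeroException)
{
    Console.WriteLine("DivideByZeroException: State was still 0");
}

class ReorderedBase
{
    protected object Dependency { get; }
    public int State { get; }
    public int Calculated { get; }

    public ReorderedBase(object dependency, int state)
    {
        this.Dependency = dependency;        // hypothetical: assigned first
        this.Calculated = this.Calculate();  // virtual call still precedes State
        this.State = state;
    }

    protected virtual int Calculate() => 7;
}

class ReorderedDerived : ReorderedBase
{
    public ReorderedDerived(object dependency, int state)
        : base(dependency, state) { }

    // Dependency is set by now, but State is still 0, so % throws.
    protected override int Calculate() =>
        base.Dependency.GetHashCode() % base.State;
}
```

Fixing one assignment only moves the failure to the next piece of state that the override happens to read.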

This approach is generally flawed, and we must do something to prevent this erroneous behavior.

Trying to Work Around the Problem

One attempt to mitigate the issue is to pass arguments to the virtual function.

class Base
{
  private object Dependency { get; }
  public int State { get; }
  public int Calculated { get; }

  public Base(object dependency, int state)
  {
    this.Calculated = this.Calculate(dependency, state);
    this.Dependency = dependency;
    this.State = state;
  }

  protected virtual int Calculate(
    object dependency, int state) => 7;
}

class Derived : Base
{
  public Derived(object dependency, int state)
    : base(dependency, state) { }

  protected override int Calculate(
    object dependency, int state) =>
    dependency.GetHashCode() % state;
}

Base a = new Base(new object(), 3);
Base b = new Derived(new object(), 4);  // OK

Even though this code avoids the null reference pitfall, it defeats the purpose of the object state. Why hold values both in the state and in the function arguments? That doesn't make much sense. We will abandon this design.

The second attempt is to just call the virtual function last. In previous variants, I have intentionally made a virtual function call before setting the state. I did that with the purpose of demonstrating the dangers of mixing virtual calls with state initialization. Nevertheless, when the virtual function is called last, C# code will not fail.

class Base
{
  protected object Dependency { get; }  // protected, so that Derived can read it
  public int State { get; }
  public int Calculated { get; }

  public Base(object dependency, int state)
  {
    this.Dependency = dependency;
    this.State = state;
    this.Calculated = this.Calculate();  // Call last
  }

  protected virtual int Calculate() => 7;
}

class Derived : Base
{
  public Derived(object dependency, int state)
    : base(dependency, state) { }

  protected override int Calculate() =>
    base.Dependency.GetHashCode() % base.State;
}

Base a = new Base(new object(), 3);
Base b = new Derived(new object(), 4);  // OK

However, this approach is not applicable in the general case, because some other part of the state might depend on the result of the virtual function call, making it impossible to move the call to the end of the constructor’s body. There could also be two virtual functions. Which comes first if you want to call both? How do you decide the order when the order depends on the implementation in the derived class, which is not known to the base class?
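The ordering problem with two virtual functions can be sketched as follows. All names here (TwoStepBase, NeedsBFirst, NeedsAFirst, CalculateA, CalculateB) are hypothetical; the point is only that the base class must commit to one order, while each derived class may need a different one.

```csharp
using System;

var first = new NeedsBFirst();
var second = new NeedsAFirst();
Console.WriteLine($"NeedsBFirst: A = {first.A}, B = {first.B}");
Console.WriteLine($"NeedsAFirst: A = {second.A}, B = {second.B}");

class TwoStepBase
{
    public int A { get; }
    public int B { get; }

    public TwoStepBase()
    {
        // The base class has to pick some order; here A is calculated first.
        this.A = this.CalculateA();
        this.B = this.CalculateB();
    }

    protected virtual int CalculateA() => 1;
    protected virtual int CalculateB() => 2;
}

// Would need B to be set before CalculateA runs, which is not the case.
class NeedsBFirst : TwoStepBase
{
    protected override int CalculateA() => this.B + 10;  // B is still 0 here
}

// Happens to work with the order the base class chose.
class NeedsAFirst : TwoStepBase
{
    protected override int CalculateB() => this.A + 10;  // A is already 1 here
}
```

With this order, NeedsAFirst gets A = 1 and B = 11 as intended, while NeedsBFirst silently gets A = 10 instead of 12. Swapping the two calls in the base constructor would fix one class and break the other.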

Placing the virtual function call last in the constructor is a cheap workaround that will sometimes save you, but it does not work in every situation. Let's stop working around the problem and solve it once and for all.

Avoiding the Need for Virtual Functions in the Constructor

From this point on, I will show you how to remove the call to the virtual function entirely. There will be several modifications to both the base and the derived classes, so please read the following code carefully.

class Base
{
  private object Dependency { get; }
  public int State { get; }
  public int Calculated { get; }

  public static Base Create(object dependency, int state) =>
    new(dependency, state, state + 12);

  protected Base(object dependency, int state, int calculated) =>
    (Dependency, State, Calculated) = (dependency, state, calculated);
}

class Derived : Base
{
  public Derived(object dependency, int state)
    : base(dependency, state, 
           dependency.GetHashCode() % state) { }
}

Base a = Base.Create(new object(), 3);
Base b = new Derived(new object(), 4);  // OK

We have introduced a protected constructor that receives even the calculated field – everything. The purpose of the constructor is to initialize the object, to set the entire state. And this protected constructor is the one which is doing precisely that – just setting the state.

The public constructor is now gone, replaced by the static factory function. With no public constructor left, a derived class has no way to construct the base without supplying the mandatory calculated state. By the same token, the static factory function is the place where the base class performs the calculation the way it sees fit.

The final touch to the base class was to remove the virtual function. We don't need that function anymore.

The last modification is in the derived class. It performs its own calculation using the constructor arguments, so that it can pass the entire state to the base class’s constructor. Again, there is no virtual function. Whatever the derived class wants to calculate, the call to the base constructor is the place to do it.

Observe that the call to the base class’s constructor is now equipped with all the arguments: the dependency, the state, and the calculated value. If the calculation became too complex to write inline, the derived class could even call a static function to do the work. The calculation would still be completed before the call to the protected constructor in the base class is made.
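Here is a minimal sketch of that variation. The helper name CalculateFrom and its argument checks are assumptions added for illustration; the base class is the same as above. The helper has to be static, because C# does not allow access to the instance (this) inside a constructor initializer.

```csharp
using System;

Base a = Base.Create(new object(), 3);
Base b = new Derived(new object(), 4);
Console.WriteLine(a.Calculated);  // 15, i.e. state + 12
Console.WriteLine(b.Calculated);  // depends on the object's hash code

class Base
{
    private object Dependency { get; }
    public int State { get; }
    public int Calculated { get; }

    public static Base Create(object dependency, int state) =>
        new(dependency, state, state + 12);

    protected Base(object dependency, int state, int calculated) =>
        (Dependency, State, Calculated) = (dependency, state, calculated);
}

class Derived : Base
{
    public Derived(object dependency, int state)
        : base(dependency, state, CalculateFrom(dependency, state)) { }

    // Hypothetical helper. Being static, it can only see its arguments,
    // never the partially initialized object.
    private static int CalculateFrom(object dependency, int state)
    {
        if (dependency is null)
            throw new ArgumentNullException(nameof(dependency));
        if (state == 0)
            throw new ArgumentException("State must be non-zero.", nameof(state));

        return dependency.GetHashCode() % state;
    }
}
```

A side benefit of this shape is that invalid arguments are rejected with a meaningful exception before any part of the object is initialized.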

Summary

In this article, you have learned a technique that helps avoid defects caused by a virtual function, called from inside the constructor, accessing the uninitialized state of the object. We have solved the problem by design, so that there is no virtual function to call.

If you follow this alternative design, you will avoid the need to have a virtual function for the sake of varying base class’s state initialization in the constructor. That will also help you avoid any pitfalls associated with accessing the uninitialized or partially initialized base class’s state in the derived class’s function.



About

Zoran Horvat

Zoran Horvat is the Principal Consultant at Coding Helmet, speaker and author of 100+ articles, and independent trainer on .NET technology stack. He can often be found speaking at conferences and user groups, promoting object-oriented and functional development style and clean coding practices and techniques that improve longevity of complex business applications.
