Understanding Closures in C#

by Zoran Horvat

Understanding the Need for Closures

In the previous article, titled Understanding Delegates and Higher-Order Functions in C#, we discussed the concept of delegates and the implementation of higher-order functions in C#. That article opened the question of closures, and this is the opportunity to explain how closures work.

We can start with a function which multiplies its argument by two.

Func<int, int> scale = x => 2 * x;

But that might not be enough. Maybe we want a function which scales its argument by some factor.

Func<int, int, int> scale = (factor, k) => factor * k;

But then, look, we might be reluctant to include the factor as an argument. Because, why have the function then? It’s just a multiplication. This lambda is not of much use. Let’s remove the factor from its signature:

int factor = 2;
Func<int, int> scale = (k) => factor * k;

Without the parameter, factor would be an unknown name inside the lambda, and the lambda needs a variable named factor to compile. In C#, we can let the lambda capture a variable which is accessible at the place where the lambda is defined. Having the factor variable declared right above the lambda does the job.

The factor variable in this lambda is called a free variable. It is free in the sense that it is not fixed, or determined, by the lambda itself. It is not present in the argument list, nor is it defined as a local variable inside the lambda. Hence, it must come from somewhere else. To be more precise, it must come from the environment. C# is a statically scoped language, which somewhat reduces the realm from which free variables can come.
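The difference between a free and a bound variable can be seen in a minimal sketch like the one below. The names FreeVariableDemo and ScaleFive are made up for this illustration and do not come from any library.

```csharp
using System;

// Hypothetical demo class, introduced only for this illustration.
static class FreeVariableDemo
{
    public static int ScaleFive()
    {
        int factor = 2;                          // declared in the enclosing scope

        // k is bound: it is the lambda's own parameter.
        // factor is free: the lambda neither declares it nor receives it.
        Func<int, int> scale = k => factor * k;

        return scale(5);
    }

    public static void Main()
    {
        Console.WriteLine(ScaleFive());          // prints 10
    }
}
```

If the declaration of factor were removed, the lambda would not compile, because no enclosing scope would supply that name.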

Static scoping, also known as lexical scoping, means that scopes nest statically, and the nesting can be seen at compile time. Take a look at the entire source code in which the lambda is defined:

using System;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            int factor = 2;
            Func<int, int> scale = x => factor * x;
        }
    }
}

In that respect, the lambda function is one scope. The Main function which contains its definition is the outer scope. The Program class which contains the Main function is the next outer scope. The Demo namespace, inside of which the Program class is defined, comes after that, and the global namespace is the outermost level of scope nesting.

Therefore, the compiler will look for the factor variable inside the lambda first, which includes its arguments. If it doesn’t find it there, it will search the containing scope, the Main function’s body, where it will visit local variables and arguments. This time it will be lucky, because the factor variable is declared there as a local variable. But if it weren’t there, the compiler would visit the next outer scope, the Program class, and look into its static members, since Main is a static method and it can only access static class data. And so on. If none of the scopes defined a variable named factor, we would face a compile-time error.
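The lookup order described above can be tried out with a small sketch. The ScopeLookupDemo class, its two methods, and the static field with value 10 are assumptions made only for this illustration.

```csharp
using System;

// Hypothetical demo class, introduced only for this illustration.
static class ScopeLookupDemo
{
    static int factor = 10;                      // found in the class scope

    public static int UsesLocal()
    {
        int factor = 2;                          // the local hides the static field
        Func<int, int> scale = x => factor * x;  // resolves to the local variable
        return scale(5);
    }

    public static int UsesField()
    {
        // No local named factor here, so the search continues outwards
        // and stops at the static field of the class.
        Func<int, int> scale = x => factor * x;
        return scale(5);
    }

    public static void Main()
    {
        Console.WriteLine(UsesLocal());          // prints 10
        Console.WriteLine(UsesField());          // prints 50
    }
}
```

Both decisions are made at compile time, based only on where each lambda is written in the source code.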

Quite contrary to this, dynamic scoping would do none of these steps at compile time. A variable would only be resolved at run time, when the lambda is invoked. If the variable is not defined in the lambda itself, then the next outer scope is searched. But that is not going to be the outer block of code, but rather the function from inside of which the lambda was invoked. As you can see, there is no way for the compiler to know all the places from which a function is going to be invoked at run time, and therefore the compiler would not even attempt to resolve its free variables.

For better or worse, it’s all static in C#. With static, or lexical, scoping, the C# compiler will conclude that the factor variable is the one declared right above, within the enclosing scope which contains the lambda definition. And then, the adventure can begin. The factor has value two, hence the scale function will return twice the argument’s value.

Invoking the Closure

We can then introduce another function, which is eager to scale some values it knows about, but, alas, it doesn’t care to know the scaling factor.

static void Work(Func<int, int> scale)
{
    int y = scale(5);
    Console.WriteLine(y);
}

static void Main(string[] args)
{
    int factor = 2;
    Func<int, int> scale = x => factor * x;

    Work(scale);
    Console.ReadLine();
}

The Work function asks for a Func delegate which scales integer values, without knowing the factor. The environment within which the scale function executes, and from which the function picks the concrete factor value, is called the closure. The closure has been constructed at the place where the lambda was constructed, down in the Main method. That means that the factor variable has been captured in that very line where the entire lambda was defined. That is very important to know.

When the Work function is invoked, and the scale lambda is passed to it as the argument, Work will print out value 10, because the factor variable, holding value 2, was captured before. But problems are lurking on the horizon. What if I changed the value of the factor variable just before calling the Work function? Look at the modified code:

static void Work(Func<int, int> scale)
{
    int y = scale(5);
    Console.WriteLine(y);
}

static void Main(string[] args)
{
    int factor = 2;
    Func<int, int> scale = x => factor * x;

    factor = 3; // Added instruction
    Work(scale);
    Console.ReadLine();
}

What do you think the Work function will print this time? Will it remain 10, because factor value two was captured when the lambda was constructed? Or will the scale function somehow refer to the variable itself, rather than copy its value and keep it fixed? In the former case, the Work function would still produce output 10. In the latter, the new scaling factor value would take effect and the output would read 15.

When you run this piece of code, it will print value 15. Now it will be interesting to see why that is so. While trying to figure out why the output has changed, we will reach a better understanding of how closures work in C#.
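The same effect can be observed more than once in a row. Here is a small sketch, with the LateReadDemo class and its Run method being names invented for this illustration, in which the lambda is invoked after each change of the variable.

```csharp
using System;

// Hypothetical demo class, introduced only for this illustration.
static class LateReadDemo
{
    public static int[] Run()
    {
        int factor = 2;
        Func<int, int> scale = x => factor * x;

        int first = scale(5);    // factor is 2 at this moment

        factor = 3;
        int second = scale(5);   // same delegate, but factor is now 3

        factor = 4;
        int third = scale(5);    // and now it is 4

        return new[] { first, second, third };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));   // prints 10, 15, 20
    }
}
```

The delegate never changes between the calls. Only the variable does, and the lambda reads its current value every time it is invoked.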

Understanding How Closures Operate

What we have at this moment is an augmented delegate: a delegate with its enclosing environment, all together known as the closure. When I passed the scale function to the Work function, I didn’t just pass a delegate. I passed a closure. And in that closure, we find the factor variable, too. However, when I try to explain that the variable is passed together with the delegate, I have trouble finding the right words for that. You will see what the trouble is if you take note that the factor variable is a plain integer. You cannot just pass an integer variable around. Its value will be copied every time you wish to move it. You cannot even pass a variable of a reference type around, for that matter. You can only make a copy of the reference, and the new reference will refer to the same object on the heap.
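To see why this is puzzling, compare a plain copy of an integer with a lambda that captures the same variable. This is only a sketch, and the CopyVersusCaptureDemo class and its method names are invented for the purpose.

```csharp
using System;

// Hypothetical demo class, introduced only for this illustration.
static class CopyVersusCaptureDemo
{
    public static int CopiedValue()
    {
        int factor = 2;
        int copy = factor;             // the value 2 is copied into a new variable

        factor = 3;                    // the copy knows nothing about this change
        return copy;
    }

    public static int CapturedVariable()
    {
        int factor = 2;
        Func<int> read = () => factor; // the lambda captures the variable itself

        factor = 3;                    // the lambda will observe this change
        return read();
    }

    public static void Main()
    {
        Console.WriteLine(CopiedValue());        // prints 2
        Console.WriteLine(CapturedVariable());   // prints 3
    }
}
```

The copied integer behaves as described above. The captured one does not, and that is the behavior which remains to be explained.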

The factor variable wouldn’t go around, you see, and that is the problem when working with closures. Then how did the Work function know that the factor value had changed from 2 to 3 after it was captured by the lambda? That is the mystery, and its resolution will ask you to think outside the box.

The trick is that after a variable is captured in a closure, there will be no more factor variable on its own. It’s not going to be a local of the Main function when Main is compiled. Here is how that goes. This segment where the factor is declared, together with the subsequent lambda declaration, will become the closure.

// Closure:
int factor = 2;
Func<int, int> scale = x => factor * x;

I will implement my own custom closure class, just to show you what happens when the compiler encounters a lambda with free variables.

class ScaleClosure
{
    public int environment;
    public int Scale(int arg) => this.environment * arg;
}

The closure consists of the environment and the function. The environment is a public field of integer type. That will be the scaling factor I need to capture. And the other part is the function. I have retyped the lambda expression, only this time referring to the local environment in place of the scaling factor.

And then, down below, where I was creating the factor and the lambda, I will instantiate the closure instead.

class ScaleClosure
{
    public int environment;
    public int Scale(int arg) => this.environment * arg;
}

static void Work(ScaleClosure scale)
{
    int y = scale.Scale(5);
    Console.WriteLine(y);
}

static void Main(string[] args)
{
    var scale = new ScaleClosure() { environment = 2 };

    scale.environment = 3;
    Work(scale);
    Console.ReadLine();
}

There are a couple of new things going on here. The real closure has just become an object of the specialized closure class. There is nothing static in the ScaleClosure class. I have just pretended that I am the compiler, and I have manually coded a specialized closure class for this case. You see, there is no notion of a general closure. Every concrete closure is tailored to the function it wraps, and that is something the compiler takes care of. Its function is also very specific: in my example, the function receives an integer and returns an integer. The environment is specific as well: in my case that is just a single integer, which is used as the scaling factor inside the function. The closure will access its environment and pick the values for the free variables, and that is why it had to be instantiated as an object before use.

This sequence of steps is very close to what the compiler does for us when it encounters a lambda with captured variables. The compiler generates a class similar to ScaleClosure, places the lambda body in an instance method of that class, and then constructs an ordinary Func<int, int> delegate which targets that method on the closure object. I cannot derive ScaleClosure from the Func delegate type, because delegate types are sealed, and that is why I am asking for the ScaleClosure object in the Work function: it keeps the closure object visible in this demonstration. In the real example, the parameter would still remain a Func<int, int> delegate. That doesn’t affect my demonstration much, as it remains functionally correct.
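The hand-written closure can still be handed over as a Func<int, int> delegate, by converting its Scale method into a delegate bound to the closure object. The following is a sketch under that assumption. The DelegateFromClosureDemo class is invented for this illustration, and Work returns the result here instead of printing it, only to make the outcome easy to check.

```csharp
using System;

class ScaleClosure
{
    public int environment;
    public int Scale(int arg) => this.environment * arg;
}

// Hypothetical demo class, introduced only for this illustration.
static class DelegateFromClosureDemo
{
    // Same idea as the Work function from the article, but returning the result.
    static int Work(Func<int, int> scale) => scale(5);

    public static int Run()
    {
        var closure = new ScaleClosure { environment = 2 };

        // Method group conversion: the delegate remembers both the method
        // and the closure object on which the method will be called.
        Func<int, int> scale = closure.Scale;

        closure.environment = 3;       // stands in for "factor = 3"
        return Work(scale);
    }

    public static void Main()
    {
        Console.WriteLine(Run());      // prints 15
    }
}
```

The delegate's Target property refers to the closure object, which is how the environment travels together with the function.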

The critical part of this demonstration is capturing the factor variable. Look at the Main method. There is no factor variable there anymore. The scaling factor is now captured in the closure, and captured in its entirety. There will be no value copying ever again. When the closure is initialized, the environment is initialized as well (to value 2, which was used as the initial scaling factor in the prior example). And then, just before calling the Work function, I used to modify the factor variable. This time, that variable resides inside the closure object. That is the place for me to write the new value.

By acting as the compiler, I have just completed rewriting the entire code segment, and by doing that I have introduced an explicit closure for my function with a free variable. I had to do that, because I had no other way to pass the variable together with the function. And when I say to pass the variable, I really mean to pass the variable itself, with no vagueness in what I’m saying.

The only instance of the free variable is the one inside the closure object. That is the magic the C# compiler does when it encounters a lambda which captures surrounding variables. And that is what makes lambdas work like a charm in C#. Everything will be right where you wanted it to be when you invoke a Func delegate.
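One consequence of having a single instance of the variable is that two lambdas defined in the same scope share it. The sketch below assumes a second lambda, setFactor, which does not appear in the original example. The SharedEnvironmentDemo class name is also invented for this illustration.

```csharp
using System;

// Hypothetical demo class, introduced only for this illustration.
static class SharedEnvironmentDemo
{
    public static int Run()
    {
        int factor = 2;

        // Both lambdas capture the same factor variable.
        Func<int, int> scale = x => factor * x;
        Action<int> setFactor = newFactor => factor = newFactor;

        setFactor(4);                  // written through one lambda...
        return scale(5);               // ...and read through the other
    }

    public static void Main()
    {
        Console.WriteLine(Run());      // prints 20
    }
}
```

Neither lambda holds a private copy of the value. Both work on the same variable, which is what makes the change visible across them.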

Just to make sure that everything is done right, you can run this code and you will see that it prints value 15 to the output, just as expected.
