How to Implement Lazy Default-If-Empty Functionality on Collections

by Zoran Horvat

Working with collections comes with some special cases. One of them is the empty collection, which sometimes has to be dealt with separately. Most importantly, some aggregate functions fail on empty collections. For example, summation works fine on an empty collection (the sum of no values is zero), but the maximum, minimum, and average functions fail on an empty collection of non-nullable values.

Take a look at the implementation of classes that represent a money transaction and an account:

using System.Collections.Generic;
using System.Linq;

class Transaction
{
    public decimal Amount { get; private set; }

    public Transaction(decimal amount)
    {
        this.Amount = amount;
    }
}

class Account
{
    private IList<Transaction> transactions = new List<Transaction>();

    public void Deposit(decimal amount)
    {
        this.transactions.Add(new Transaction(amount));
    }

    public void Withdraw(decimal amount)
    {
        this.transactions.Add(new Transaction(-amount));
    }

    public decimal Balance
    {
        get
        {
            return
                this.transactions
                .Sum(trans => trans.Amount);
        }
    }

    public decimal AverageDeposit
    {
        get
        {
            return
                this.transactions
                .Where(trans => trans.Amount > 0)
                .Average(trans => trans.Amount);
        }
    }
}

The Balance property works fine even if there are no transactions. But the AverageDeposit property fails in the same case, because the average of an empty collection would be zero divided by zero, which is undefined. Instead of returning a value, the Average method throws InvalidOperationException.

There is a similar example in one of the previous articles, where we saw how a collection with zero or one elements can be used to represent an optional object (see Option<T> Functional Type). An optional object then maps to some kind of result, but that result is optional as well. We might want to produce a default result in case the object is missing:
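
The difference between the two aggregates can be observed in isolation. The following is only a minimal sketch, not part of the Account class; the class name EmptyAggregatesDemo and the helper AverageFails are made up for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EmptyAggregatesDemo
{
    // Hypothetical helper: returns true if averaging the sequence
    // throws InvalidOperationException, false if a value is produced.
    public static bool AverageFails(IEnumerable<decimal> amounts)
    {
        try
        {
            amounts.Average();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    static void Main()
    {
        IList<decimal> noAmounts = new List<decimal>();

        // Summation over no values produces zero
        Console.WriteLine(noAmounts.Sum());

        // Averaging over no values throws, so this prints True
        Console.WriteLine(AverageFails(noAmounts));
    }
}
```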

IPurchaseReport Purchase(string username, IProduct product)
{
    return
        this.userRepository
        .Find(username)
        .Select(user => user.Purchase(product))
        .DefaultIfEmpty(this.reportFactory.CreateNotRegistered(username))
        .Single();
}

This case is more complicated because the default value is mandatory: the method must return a report, and the Single method would fail on an empty collection. This default value is also of a different kind than the plain zero that could stand in for the failing Average method. To see the difference, we will analyze the two cases separately.

Coming up with a default value that stands in place of missing objects is not always easy. We recognize three levels of difficulty here, and each of them will be analyzed separately.

Simple Default Value

The first case is when a simple value is sufficient to plug the empty collection:

decimal AverageDeposit
{
    get
    {
        return
            this.transactions
            .Where(trans => trans.Amount > 0)
            .Select(trans => trans.Amount)
            .DefaultIfEmpty(0)
            .Average();
    }
}

This change fixes the problem with the Average aggregate function. If there are no deposits, we simply claim that the average deposit is zero. Hardly any user will complain about seeing a zero average in the user interface before making the first deposit.

The existing DefaultIfEmpty extension method works perfectly in cases like this. That is exactly what it was designed for.
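
To see the fixed-value default at work outside of the Account class, here is a small sketch that operates on plain decimal amounts. The class name AverageDepositDemo and the static helper are assumptions made only for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AverageDepositDemo
{
    // Averages positive amounts only; zero stands in when there are none.
    public static decimal AverageDeposit(IEnumerable<decimal> amounts)
    {
        return amounts
            .Where(amount => amount > 0)
            .DefaultIfEmpty(0m)
            .Average();
    }

    static void Main()
    {
        // No transactions at all: the default plugs the empty collection
        Console.WriteLine(AverageDeposit(new decimal[0]));

        // Withdrawals only: the filtered collection is empty again
        Console.WriteLine(AverageDeposit(new[] { -5m }));

        // Two deposits and one withdrawal: only the deposits are averaged
        Console.WriteLine(AverageDeposit(new[] { 10m, -5m, 20m }));
    }
}
```

The default is a constant here, so it costs nothing to evaluate it even when it is not needed.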

Complex Default Value

Things become more difficult if the default value is an object. Creating a new object comes at a price which, although not great, still costs time and adds memory pressure.

IPurchaseReport Purchase(string username, IProduct product)
{
    return
        this.userRepository
        .Find(username)
        .Select(user => user.Purchase(product))
        .DefaultIfEmpty(this.reportFactory.CreateNotRegistered(username))
        .Single();
}

This implementation of the Purchase method has to produce a purchase report. The user repository returns a collection with zero or one User objects in it. If a User object has been obtained, it is mapped to a purchase report by calling its Purchase method. However, if the user was not found, the method constructs a special purchase report which reads “user not registered”. This is an application of the Special Case design pattern.

But this case is more complicated than the previous one with averaging money transactions. The complication comes from the fact that the argument to DefaultIfEmpty is evaluated before the method is called. We are asking the report factory to create the special case report even when the original collection is not empty and the report is never used.

The bottom line is that using DefaultIfEmpty to prepare a complex object may be too heavy, knowing that the object will simply be thrown away in most executions.

Default Value With Side-Effects

The last case is the most complicated one, and it occurs when providing the default value comes with observable side effects. The key word in this sentence is “observable”. Side effects don’t have to alter the state of the system. All operations could be just queries. But some observable change would still occur, like a log line or an audit message which says that a nonexistent user tried to make a purchase.

Let’s return to the same Purchase function we had before:

IPurchaseReport Purchase(string username, IProduct product)
{
    return
        this.userRepository
        .Find(username)
        .Select(user => user.Purchase(product))
        .DefaultIfEmpty(
            this.reportFactory.CreateNotRegistered(username)) // writes to log
        .Single();
}

In this case, using DefaultIfEmpty is clearly wrong. A note will appear in the log every time this function is executed, whether the user is registered or not. What makes this worse is that the reportFactory field is probably declared through an interface, so we don’t even know whether the actual implementation produces side effects or not.
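
The unwanted call is easy to reproduce. In the sketch below, a counter stands in for the log, and reports are plain strings; the names EagerDefaultDemo, FactoryCalls and Report are hypothetical and exist only in this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EagerDefaultDemo
{
    // Counts invocations of the factory; plays the role of the log.
    public static int FactoryCalls = 0;

    // Stand-in for a factory method with an observable side effect.
    public static string CreateNotRegistered(string username)
    {
        FactoryCalls++;
        return "user not registered: " + username;
    }

    public static string Report(IEnumerable<string> reports, string username)
    {
        return reports
            // The argument is evaluated before DefaultIfEmpty is even called
            .DefaultIfEmpty(CreateNotRegistered(username))
            .Single();
    }

    static void Main()
    {
        string report = Report(new[] { "purchase completed" }, "jane");

        // The real report is returned...
        Console.WriteLine(report);

        // ...yet the factory has still been invoked once
        Console.WriteLine(FactoryCalls);
    }
}
```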

Avoiding Side Effects from DefaultIfEmpty

From the previous examples we conclude that any use of DefaultIfEmpty other than one that supplies a simple, fixed value may cause a defect in code.

One naïve possibility is to rely on the Lazy<T> type. A lazy object receives a lambda through its constructor, and this lambda is executed only when the object’s Value property getter is invoked for the first time.

IPurchaseReport Purchase(string username, IProduct product)
{
    return
        this.userRepository
        .Find(username)
        .Select(user =>
                new Lazy<IPurchaseReport>(
                    () => user.Purchase(product)))
        .DefaultIfEmpty(
                new Lazy<IPurchaseReport>(
                    () => this.reportFactory.CreateNotRegistered(username)))
        .Select(lazy => lazy.Value)
        .Single();
}

In this solution, both the positive and the negative case end in creating a lazy object. The key point is that lambdas passed to lazy objects are not executed until it is certain which one of them should be executed.

The final Select method causes one and only one lambda to execute. Therefore, if the lambda which creates the default has undesired side effects, those effects will not occur unless the default is actually needed.

This solution, however cumbersome it may look, will work perfectly fine in all practical cases. The default value will be produced only if the incoming collection is really empty. Otherwise, only the lightweight Lazy<T> wrapper is created, and the default object itself is never constructed.
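
The same counting technique can confirm this behavior. This is again only a sketch with string reports and hypothetical names (LazyWrapperDemo, FactoryCalls, Report), mirroring the shape of the Purchase method above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazyWrapperDemo
{
    // Counts invocations of the factory; plays the role of the log.
    public static int FactoryCalls = 0;

    public static string CreateNotRegistered(string username)
    {
        FactoryCalls++;
        return "user not registered: " + username;
    }

    public static string Report(IEnumerable<string> reports, string username)
    {
        return reports
            .Select(report => new Lazy<string>(() => report))
            // Only the Lazy wrapper is created here; the lambda does not run yet
            .DefaultIfEmpty(new Lazy<string>(() => CreateNotRegistered(username)))
            // Exactly one lazy object survives, and only its lambda runs
            .Select(lazy => lazy.Value)
            .Single();
    }

    static void Main()
    {
        Console.WriteLine(Report(new[] { "purchase completed" }, "jane"));
        Console.WriteLine(FactoryCalls);   // factory not invoked so far

        Console.WriteLine(Report(Enumerable.Empty<string>(), "joe"));
        Console.WriteLine(FactoryCalls);   // factory invoked once, for the empty case
    }
}
```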

Providing Custom Lazy DefaultIfEmpty Method

The previous solution, however correct, is too complicated for practical use. It adds many syntactic constructs that are there only to employ the infrastructure. The code is hard to read and to understand.

Things would be much easier if we had a variant of DefaultIfEmpty which receives a lambda and comes with a guarantee that this lambda will not be invoked unless the incoming collection is actually empty.

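using System;
using System.Collections.Generic;

namespace NullReferencesDemo.Common
{
    public static class IEnumerableExtensions
    {
        // Yields all elements of the source sequence; if there are none,
        // yields a single value obtained from the default factory.
        public static IEnumerable<T> LazyDefaultIfEmpty<T>(this IEnumerable<T> source,
                                                           Func<T> defaultFactory)
        {
            bool isEmpty = true;

            foreach (T value in source)
            {
                isEmpty = false;
                yield return value;
            }

            // The factory is invoked only after the source has turned out to be empty
            if (isEmpty)
                yield return defaultFactory();
        }
    }
}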

This extension method does exactly what was described above. It returns a default value produced by the factory method if the source collection is empty, but it does not invoke the factory method unless necessary.
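
A short sketch can verify that claim. It repeats the extension method so that the listing is self-contained, and it counts factory invocations; the names LazyDefaultExtensions, LazyDefaultDemo, FactoryCalls and CreateDefault are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazyDefaultExtensions
{
    // Same technique as above: the factory runs only for an empty source.
    public static IEnumerable<T> LazyDefaultIfEmpty<T>(
        this IEnumerable<T> source, Func<T> defaultFactory)
    {
        bool isEmpty = true;

        foreach (T value in source)
        {
            isEmpty = false;
            yield return value;
        }

        if (isEmpty)
            yield return defaultFactory();
    }
}

static class LazyDefaultDemo
{
    // Counts invocations of the factory.
    public static int FactoryCalls = 0;

    public static string CreateDefault()
    {
        FactoryCalls++;
        return "default";
    }

    static void Main()
    {
        // Non-empty source: the factory is never invoked
        string found = new[] { "value" }
            .LazyDefaultIfEmpty(() => CreateDefault())
            .Single();
        Console.WriteLine(found + ", factory calls: " + FactoryCalls);

        // Empty source: the factory runs once per enumeration of the query
        IEnumerable<string> query =
            Enumerable.Empty<string>().LazyDefaultIfEmpty(() => CreateDefault());
        query.Single();
        query.Single();
        Console.WriteLine("factory calls: " + FactoryCalls);
    }
}
```

Note that the method is an iterator, so its execution is deferred. If the same query over an empty source is enumerated twice, as in the sketch, the factory is invoked twice. Callers that need a single default object should materialize the result once, for example by calling Single or ToList.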

Equipped with this custom extension method, we can finally write code which is both readable and correct:

IPurchaseReport Purchase(string username, IProduct product)
{
    return
        this.userRepository
        .Find(username)
        .Select(user => user.Purchase(product))
        .LazyDefaultIfEmpty(() => this.reportFactory.CreateNotRegistered(username))
        .Single();
}

Conclusion

Providing a default value when a collection is empty may be a surprisingly difficult task. The DefaultIfEmpty method which ships with the LINQ to Objects library requires a value as its argument. This implies that the value will be calculated before it is decided whether it will be used or not.

Having the default value needlessly calculated may cause a defect, especially if creation of the default value carries side effects. These effects may come in the most unexpected ways, for example as messages written to a log file.

The solution to the problem is to provide a custom variant of the DefaultIfEmpty extension method, which receives a lambda instead of an actual object. This lambda will then be invoked only if the collection is empty, and never if the collection actually contains elements.


About

Zoran Horvat

Zoran Horvat is the Principal Consultant at Coding Helmet, speaker and author of 100+ articles, and independent trainer on .NET technology stack. He can often be found speaking at conferences and user groups, promoting object-oriented and functional development style and clean coding practices and techniques that improve longevity of complex business applications.
