http://www.codinghelmet.com/  

Wear a helmet. Even when coding.

howto > wrap-system-random-into-infinite-ienumerable-int-sequence

How to Wrap System.Random into an Infinite IEnumerable<int> Sequence
by Zoran Horvat @zoranh75
April 10, 2018

Generating random numbers is a common requirement. Even without security-related considerations, many algorithms we implement in day-to-day programming are randomized by their nature. For that reason, it is truly amusing to see the .NET Framework still circling around the System.Random class, as it ever did.

In this article, we will identify common issues related to use of System.Random class. Then, we will propose an alternate randomization solution which is aligned with modern C# syntax and design principles.

Understanding the Limitations of System.Random

The Random class has been around since .NET version 1.1 and it was the foundation of so many randomized implementations ever since. This class represents a pseudo-random number generator, which has to be seeded before use.

That will be the first issue to note here, because seeding mechanism which is built into the Random class is using current system time. If you do not specify your own seed value, and in most applications you won’t have a random seed readily available, then the Random class will read current system time to come up with its own seed value.

This limitation becomes a serious obstacle in the design. A common solution to the problem is to have one entity responsible to instantiate one Random object and then distribute that object to all subordinate objects which are implementing randomized operations. This design is disruptive and leaky.

The other problem appears in multi-threaded environments, where one Random instance cannot suffice. We would have to create one instance per thread to make its use safe. That brings the same problem back in the game, as threads can start in short time intervals. If each Random instance is created about the time when the thread starts, then all random sequences will be the same again in all of the threads.

Further down the stream, at the level of user-friendliness, we find that the Random class is telling the next random number via the state-mutating method Next (and a few others, but let’s stick with the simplest one of them). Generating many pseudo-random numbers, and applying some transformation to each of them, means to iterate through the loop and apply the transformation explicitly:

List<MyClass> output = new List<MyClass>();
Random rand = new Random();

for (int i = 0;  i < 1000; i++)
{
    MyClass obj = helper.Transform(rand.Next());
    output.Add(obj);
}

This approach is so outdated. Why wouldn’t we ask for an IEnumerable<int>, for example, which would be an infinite sequence of pseudo-random numbers? If we had it, the code above would boil down to a one-liner:

IEnumerable<int> randomNumbers = ...
List<MyClass> output = randomNumbers.Select(helper.Transform).ToList();

Defining the Requirements

Now that we have outlined the basic capabilities of the existing System.Random class, we can start thinking of a better random number API to supersede it.

For one thing, we can use cryptographically secure random source to generate a 32-bit random seed. This time, the seed value would not be bound to system time. Consequently, we would be able to construct several pseudo-random number generators almost simultaneously and each of them would still generate a different sequence of numbers.

The next requirement is to construct pseudo-random numbers in form of an endless IEnumerable<int> sequence. That would make it convenient for consumption through a foreach loop, or any other standard mechanism which consumes a sequence, like LINQ operators; this would oppose to forcing all consumers to bind to a strongly typed System.Random API.

With these two requirements alone, we would be able to construct a pseudo-random number sequence which is independent of any global state. The sequence would also save its consumers from having to depend on any concrete API.

Proposed Solution

In this solution, we will address two aspects discussed above. One will be to seed the generator in a way which doesn’t depend on system time. The other will be to construct a streaming API, which would expose pseudo-random numbers as a sequence IEnumerable<int>.

Seeding first. Below is the Seed class, which constructs a random 32-bit integer number using cryptographically secure random number generator.

public static class Seed
{
    public static int CreateRandom()
    {
        using (System.Security.Cryptography.RandomNumberGenerator numberGenerator =
            System.Security.Cryptography.RandomNumberGenerator.Create())
        {
            byte[] values = new byte[4];
            numberGenerator.GetBytes(values);

            int seed = 0;
            foreach (byte value in values)
                seed = (seed << 8) | value;

            return seed;
        }
    }
}

This technique is somewhat dangerous, as repeated use of crypto-API can lead to deterioration of its service. That topic is beyond scope of this article. We will only emphasize that taking four bytes to initialize a random number generator will not raise concerns regarding the computer’s safety. Taking a number of seeds in a row will still not pose a problem. But taking thousands of seeds could be a risky operation.

The second part of the solution is to wrap the System.Random class in an object implementing IEnumerable<int>. The goal of this step is to create System.Random object and seed it with a truly random value. Then use that object to construct endless stream of pseudo-random numbers. This stream would then be exposed in form of an IEnumerable<int> infinite sequence, just as we promised earlier.

Below is the complete implementation of the PseudoRandomSequence class which performs these responsibilities:

public class PseudoRandomSequence : IEnumerable<int>
{
    private System.Random Generator { get; }
    private int MinInclusive { get; set; }
    private int MaxExclusive { get; set; }
    private Action RefreshBounds { get; }

    public static IEnumerable<int> Create() =>
        new PseudoRandomSequence(0, int.MaxValue);

    public static IEnumerable<int> Create(int maxExclusive) =>
        new PseudoRandomSequence(0, maxExclusive);

    public static IEnumerable<int> Create(int minInclusive, int maxExclusive) =>
        new PseudoRandomSequence(minInclusive, maxExclusive);

    public static IEnumerable<int> Create(Func<(int minInclusive, int maxExclusive)> boundaryCurve) =>
        new PseudoRandomSequence(boundaryCurve);

    public static IEnumerable<int> Create(Func<int> maxExclusiveCurve) =>
        new PseudoRandomSequence(maxExclusiveCurve);

    private PseudoRandomSequence()
    {
        int seed = Seed.CreateRandom();
        this.Generator = new System.Random(seed);
        this.MinInclusive = this.MaxExclusive = 0;
        this.RefreshBounds = () => { };
    }

    private PseudoRandomSequence(int minInclusive, int maxExclusive) 
        : this()
    {
        this.MinInclusive = minInclusive;
        this.MaxExclusive = maxExclusive;
    }

    private PseudoRandomSequence(Func<(int minInclusive, int maxExclusive)> boundaryCurve) 
        : this()
    {
        this.RefreshBounds =
            () => (this.MinInclusive, this.MaxExclusive) = boundaryCurve();
    }

    private PseudoRandomSequence(Func<int> maxExclusiveCurve)
        : this()
    {
        this.RefreshBounds =
            () => this.MaxExclusive = maxExclusiveCurve();
    }

    public IEnumerator<int> GetEnumerator()
    {
        while (true)
        {
            this.RefreshBounds();
            yield return this.Generator.Next(this.MinInclusive, this.MaxExclusive);
        }
    }

    IEnumerator IEnumerable.GetEnumerator() =>
        this.GetEnumerator();
}

This class is using the contained System.Random object to repeatedly call the Next method on it. But this object is seeded with a random seed value. Therefore, the sequence produced by an instance of the PseudoRandomSequence object will be independent of the sequence generated by any other instance of this same class, no matter when exactly and on what thread that other instance was constructed.

The sequence class comes with several private constructors. It only exposes static factory functions. The first three factory functions each relate to one overload of the Next method on System.Random class:

public class PseudoRandomSequence : IEnumerable<int>
{
    ...
    public static IEnumerable<int> Create() => ...

    public static IEnumerable<int> Create(int maxExclusive) => ...

    public static IEnumerable<int> Create(int minInclusive, int maxExclusive) => ...
    ...
}

The caller is allowed to define no boundaries, or only an exclusive upper bound, or both lower (inclusive) and upper (exclusive) upper bounds. These bounds will be obeyed when pseudo-random numbers are generated later.

The other two static factory methods are offering more power, at the expense of a bit more calculation:

public class PseudoRandomSequence : IEnumerable<int>
{
    ...
    public static IEnumerable<int> Create(Func<(int minInclusive, int maxExclusive)> boundaryCurve) => ...

    public static IEnumerable<int> Create(Func<int> maxExclusiveCurve) => ...
    ...
}

These two functions are letting the caller specify dynamic boundaries for the random numbers that will appear in the sequence. Whenever a new pseudo-random number is to be generated, the generator calls the supplied function first, to determine boundaries for the next number.

Whichever Create static method you choose to call, you will receive an object implementing IEnuemrable<int>. These objects will obviously be instances of PseudoRandomSequence class, but caller doesn’t have to bother with that detail. It is quite sufficient for any practical reason to depend on IEnumerable<int> only.

Demonstration

The class outlined in this article is very simple from the caller’s point of view. The caller is only supposed to give the sequence parameters, and pseudo-random numbers will start flowing in the endless stream of integers. How hard can it be to use that sequence then?

I will give two applications of the PseudoNumberSequence class – one trivial, and then a more elaborate one.

int[] values = PseudoRandomSequence.Create(0, 20).Take(10).ToArray();
// e.g. [15, 9, 11, 12, 7, 6, 1, 9, 11, 19]

If you only wanted to generate an array consisting of random integers from a given range, then this is the way to do it – one line of code. On a closer look, we find that this expression consists of three steps which are chaining one after another. The first step is to create a sequence of pseudo-random numbers greater or equal to zero and less than 20. That is what call to Create(0, 20) static factory function on the PseudoRandomSequence class is doing. This call will return an IEnumerable<int> which represents a lazy-evaluated infinite sequence of pseudo-random numbers.

The next operation in line will also be lazy-evaluated, but it will bound the length of the sequence to at most 10 elements. After that, the third step will follow, where we are finally pulling those 10 elements from the original sequence and turning them into an array.

It is important to understand that this way of dealing with random sequences is optimal, as the sequence itself is lazy all the way through. Only when the call which materializes the results is made, only then will the random numbers be generated, but no sooner than that. And even then, only the requested number of numbers will be generated.

So much about performance. Code simplicity is the other important trait of this approach. Compare the code above to traditional implementation based on System.Random, and you will quickly realize the full benefit of working with IEnumerable<int>:

int[] values = new int[10];
Random generator = new Random(); // What seed?
for (int index = 0; index < values.Length; index++)
    values[index] = generator.Next(0, 20);

The next example will have to do with the shuffling problem. Here is the entire code which shuffles a string:

string shuffle = "something <-> " + "something".Shuffle().Join();
// e.g. something <-> inshogmte

Yet another one liner. Though, this one liner is riding on the back of a larger custom extension method Shuffle, followed by another extension method Join. Let’s take a look at these two helper methods:

static class Extensions
{
    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> sequence) =>
        sequence.ToList().Shuffle();

    private static IEnumerable<T> Shuffle<T>(this List<T> list) =>
        PseudoRandomSequence.Create(() => list.Count)
            .Take(list.Count)
            .Select(list.StatefulRemove);

    private static T StatefulRemove<T>(this List<T> data, int index)
    {
        T value = data[index];
        data[index] = data[data.Count - 1];
        data.RemoveAt(data.Count - 1);
        return value;
    }

    public static string Join(this IEnumerable<char> characters) =>
        new string(characters.ToArray());
}

The Shuffle extension method is implementing the Fisher-Yates algorithm. You can learn more about this algorithm from article Card Shuffling Problem. Internally, the Shuffle function needs to pick one out of the remaining, still not picked, items from the original data stream. This task is done by applying the PseudoRandomSequence with varying upper bound.

If you look at the way in which the pseudo-random sequence is generated, it uses the function to calculate the upper bound in every iteration of the loop. That upper bound, on the other hand, will always be the remaining number of elements to shuffle:

PseudoRandomSequence.Create(() => list.Count)

This line of code shows how simple it is to consume random numbers in a practical algorithm, even when the bounds are changing as the algorithm progresses.

The rest of the algorithm implementation is based on the idea that one element is removed from the list from a random position in every iteration. That is exactly what Fisher-Yates algorithm proposes. Since that process changes the list, I have opted to hide its behavior behind a private extension method StatefulRemove. This is because this method will be used in the LINQ operator, and therefore it should be side-effect-free.

It is a general rule that if a method used in LINQ produces side effects, then it must be hidden from public view. The public view must remain side-effect-free. Any state mutations that might be required to complete the operation should be hidden from the caller, so that all calls the caller might make will remain side-effect-free, as expected.

Summary

In this article we have demonstrated one very simple technique of wrapping the System.Random class into an infinite IEnumerable<int> sequence of pseudo-random numbers. We have avoided specialized API of the System.Random and hidden it behind a faceless IEnumerable<int> interface.

Another benefit of this solution is that it seeds the System.Random object from cryptographically secure stream. That makes all sequences of pseudo-random numbers isolated, which is useful in multi-threaded settings.

See also:

Published: Apr 10, 2018; Modified: Apr 9, 2018

ZORAN HORVAT

Zoran is software architect dedicated to clean design and CTO in a growing software company. Since 2014 Zoran is an author at Pluralsight where he is preparing a series of courses on object-oriented and functional design, design patterns, writing unit and integration tests and applying methods to improve code design and long-term maintainability.

Follow him on Twitter @zoranh75 to receive updates and links to new articles.

Watch Zoran's video courses at pluralsight.com (requires registration):

Making Your C# Code More Object-Oriented

This course will help leverage your conceptual understanding to produce proper object-oriented code, where objects will completely replace procedural code for the sake of flexibility and maintainability. More...

Advanced Defensive Programming Techniques

This course will lead you step by step through the process of developing defensive design practices, which can substitute common defensive coding, for the better of software design and implementation. More...

Tactical Design Patterns in .NET: Creating Objects

This course sheds light on issues that arise when implementing creational design patterns and then provides practical solutions that will make our code easier to write and more stable when running. More...

Tactical Design Patterns in .NET: Managing Responsibilities

Applying a design pattern to a real-world problem is not as straight-forward as literature implicitly tells us. It is a more engaged process. This course gives an insight to tactical decisions we need to make when applying design patterns that have to do with separating and implementing class responsibilities. More...

Tactical Design Patterns in .NET: Control Flow

Improve your skills in writing simpler and safer code by applying coding practices and design patterns that are affecting control flow. More...

Writing Highly Maintainable Unit Tests

This course will teach you how to develop maintainable and sustainable tests as your production code grows and develops. More...

Improving Testability Through Design

This course tackles the issues of designing a complex application so that it can be covered with high quality tests. More...

Share this article

webmasters