by Zoran Horvat
Generating random numbers is a common requirement. Even without security-related considerations, many algorithms we implement in day-to-day programming are randomized by their nature. For that reason, it is truly amusing to see the .NET Framework still circling around the System.Random class, as it ever did.
In this article, we will identify common issues related to use of System.Random class. Then, we will propose an alternate randomization solution which is aligned with modern C# syntax and design principles.
The Random class has been around since .NET version 1.1 and it was the foundation of so many randomized implementations ever since. This class represents a pseudo-random number generator, which has to be seeded before use.
That will be the first issue to note here, because seeding mechanism which is built into the Random class is using current system time. If you do not specify your own seed value, and in most applications you won’t have a random seed readily available, then the Random class will read current system time to come up with its own seed value.
This limitation becomes a serious obstacle in the design. A common solution to the problem is to have one entity responsible to instantiate one Random object and then distribute that object to all subordinate objects which are implementing randomized operations. This design is disruptive and leaky.
The other problem appears in multi-threaded environments, where one Random instance cannot suffice. We would have to create one instance per thread to make its use safe. That brings the same problem back in the game, as threads can start in short time intervals. If each Random instance is created about the time when the thread starts, then all random sequences will be the same again in all of the threads.
Further down the stream, at the level of user-friendliness, we find that the Random class is telling the next random number via the state-mutating method Next (and a few others, but let’s stick with the simplest one of them). Generating many pseudo-random numbers, and applying some transformation to each of them, means to iterate through the loop and apply the transformation explicitly:
List<MyClass> output = new List<MyClass>();
Random rand = new Random();
for (int i = 0; i < 1000; i++)
{
MyClass obj = helper.Transform(rand.Next());
output.Add(obj);
}
This approach is so outdated. Why wouldn’t we ask for an IEnumerable<int>, for example, which would be an infinite sequence of pseudo-random numbers? If we had it, the code above would boil down to a one-liner:
IEnumerable<int> randomNumbers = ...
List<MyClass> output = randomNumbers.Select(helper.Transform).ToList();
Now that we have outlined the basic capabilities of the existing System.Random class, we can start thinking of a better random number API to supersede it.
For one thing, we can use cryptographically secure random source to generate a 32-bit random seed. This time, the seed value would not be bound to system time. Consequently, we would be able to construct several pseudo-random number generators almost simultaneously and each of them would still generate a different sequence of numbers.
The next requirement is to construct pseudo-random numbers in form of an endless IEnumerable<int> sequence. That would make it convenient for consumption through a foreach loop, or any other standard mechanism which consumes a sequence, like LINQ operators; this would oppose to forcing all consumers to bind to a strongly typed System.Random API.
With these two requirements alone, we would be able to construct a pseudo-random number sequence which is independent of any global state. The sequence would also save its consumers from having to depend on any concrete API.
In this solution, we will address two aspects discussed above. One will be to seed the generator in a way which doesn’t depend on system time. The other will be to construct a streaming API, which would expose pseudo-random numbers as a sequence IEnumerable<int>.
Seeding first. Below is the Seed class, which constructs a random 32-bit integer number using cryptographically secure random number generator.
public static class Seed
{
public static int CreateRandom()
{
using (System.Security.Cryptography.RandomNumberGenerator numberGenerator =
System.Security.Cryptography.RandomNumberGenerator.Create())
{
byte[] values = new byte[4];
numberGenerator.GetBytes(values);
int seed = 0;
foreach (byte value in values)
seed = (seed << 8) | value;
return seed;
}
}
}
This technique is somewhat dangerous, as repeated use of crypto-API can lead to deterioration of its service. That topic is beyond scope of this article. We will only emphasize that taking four bytes to initialize a random number generator will not raise concerns regarding the computer’s safety. Taking a number of seeds in a row will still not pose a problem. But taking thousands of seeds could be a risky operation.
The second part of the solution is to wrap the System.Random class in an object implementing IEnumerable<int>. The goal of this step is to create System.Random object and seed it with a truly random value. Then use that object to construct endless stream of pseudo-random numbers. This stream would then be exposed in form of an IEnumerable<int> infinite sequence, just as we promised earlier.
Below is the complete implementation of the PseudoRandomSequence class which performs these responsibilities:
public class PseudoRandomSequence : IEnumerable<int>
{
private System.Random Generator { get; }
private int MinInclusive { get; set; }
private int MaxExclusive { get; set; }
private Action RefreshBounds { get; }
public static IEnumerable<int> Create() =>
new PseudoRandomSequence(0, int.MaxValue);
public static IEnumerable<int> Create(int maxExclusive) =>
new PseudoRandomSequence(0, maxExclusive);
public static IEnumerable<int> Create(int minInclusive, int maxExclusive) =>
new PseudoRandomSequence(minInclusive, maxExclusive);
public static IEnumerable<int> Create(Func<(int minInclusive, int maxExclusive)> boundaryCurve) =>
new PseudoRandomSequence(boundaryCurve);
public static IEnumerable<int> Create(Func<int> maxExclusiveCurve) =>
new PseudoRandomSequence(maxExclusiveCurve);
private PseudoRandomSequence()
{
int seed = Seed.CreateRandom();
this.Generator = new System.Random(seed);
this.MinInclusive = this.MaxExclusive = 0;
this.RefreshBounds = () => { };
}
private PseudoRandomSequence(int minInclusive, int maxExclusive)
: this()
{
this.MinInclusive = minInclusive;
this.MaxExclusive = maxExclusive;
}
private PseudoRandomSequence(Func<(int minInclusive, int maxExclusive)> boundaryCurve)
: this()
{
this.RefreshBounds =
() => (this.MinInclusive, this.MaxExclusive) = boundaryCurve();
}
private PseudoRandomSequence(Func<int> maxExclusiveCurve)
: this()
{
this.RefreshBounds =
() => this.MaxExclusive = maxExclusiveCurve();
}
public IEnumerator<int> GetEnumerator()
{
while (true)
{
this.RefreshBounds();
yield return this.Generator.Next(this.MinInclusive, this.MaxExclusive);
}
}
IEnumerator IEnumerable.GetEnumerator() =>
this.GetEnumerator();
}
This class is using the contained System.Random object to repeatedly call the Next method on it. But this object is seeded with a random seed value. Therefore, the sequence produced by an instance of the PseudoRandomSequence object will be independent of the sequence generated by any other instance of this same class, no matter when exactly and on what thread that other instance was constructed.
The sequence class comes with several private constructors. It only exposes static factory functions. The first three factory functions each relate to one overload of the Next method on System.Random class:
public class PseudoRandomSequence : IEnumerable<int>
{
...
public static IEnumerable<int> Create() => ...
public static IEnumerable<int> Create(int maxExclusive) => ...
public static IEnumerable<int> Create(int minInclusive, int maxExclusive) => ...
...
}
The caller is allowed to define no boundaries, or only an exclusive upper bound, or both lower (inclusive) and upper (exclusive) upper bounds. These bounds will be obeyed when pseudo-random numbers are generated later.
The other two static factory methods are offering more power, at the expense of a bit more calculation:
public class PseudoRandomSequence : IEnumerable<int>
{
...
public static IEnumerable<int> Create(Func<(int minInclusive, int maxExclusive)> boundaryCurve) => ...
public static IEnumerable<int> Create(Func<int> maxExclusiveCurve) => ...
...
}
These two functions are letting the caller specify dynamic boundaries for the random numbers that will appear in the sequence. Whenever a new pseudo-random number is to be generated, the generator calls the supplied function first, to determine boundaries for the next number.
Whichever Create static method you choose to call, you will receive an object implementing IEnuemrable<int>. These objects will obviously be instances of PseudoRandomSequence class, but caller doesn’t have to bother with that detail. It is quite sufficient for any practical reason to depend on IEnumerable<int> only.
The class outlined in this article is very simple from the caller’s point of view. The caller is only supposed to give the sequence parameters, and pseudo-random numbers will start flowing in the endless stream of integers. How hard can it be to use that sequence then?
I will give two applications of the PseudoNumberSequence class – one trivial, and then a more elaborate one.
int[] values = PseudoRandomSequence.Create(0, 20).Take(10).ToArray();
// e.g. [15, 9, 11, 12, 7, 6, 1, 9, 11, 19]
If you only wanted to generate an array consisting of random integers from a given range, then this is the way to do it – one line of code. On a closer look, we find that this expression consists of three steps which are chaining one after another. The first step is to create a sequence of pseudo-random numbers greater or equal to zero and less than 20. That is what call to Create(0, 20) static factory function on the PseudoRandomSequence class is doing. This call will return an IEnumerable<int> which represents a lazy-evaluated infinite sequence of pseudo-random numbers.
The next operation in line will also be lazy-evaluated, but it will bound the length of the sequence to at most 10 elements. After that, the third step will follow, where we are finally pulling those 10 elements from the original sequence and turning them into an array.
It is important to understand that this way of dealing with random sequences is optimal, as the sequence itself is lazy all the way through. Only when the call which materializes the results is made, only then will the random numbers be generated, but no sooner than that. And even then, only the requested number of numbers will be generated.
So much about performance. Code simplicity is the other important trait of this approach. Compare the code above to traditional implementation based on System.Random, and you will quickly realize the full benefit of working with IEnumerable<int>:
int[] values = new int[10];
Random generator = new Random(); // What seed?
for (int index = 0; index < values.Length; index++)
values[index] = generator.Next(0, 20);
The next example will have to do with the shuffling problem. Here is the entire code which shuffles a string:
string shuffle = "something <-> " + "something".Shuffle().Join();
// e.g. something <-> inshogmte
Yet another one liner. Though, this one liner is riding on the back of a larger custom extension method Shuffle, followed by another extension method Join. Let’s take a look at these two helper methods:
static class Extensions
{
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> sequence) =>
sequence.ToList().Shuffle();
private static IEnumerable<T> Shuffle<T>(this List<T> list) =>
PseudoRandomSequence.Create(() => list.Count)
.Take(list.Count)
.Select(list.StatefulRemove);
private static T StatefulRemove<T>(this List<T> data, int index)
{
T value = data[index];
data[index] = data[data.Count - 1];
data.RemoveAt(data.Count - 1);
return value;
}
public static string Join(this IEnumerable<char> characters) =>
new string(characters.ToArray());
}
The Shuffle extension method is implementing the Fisher-Yates algorithm. You can learn more about this algorithm from article Card Shuffling Problem . Internally, the Shuffle function needs to pick one out of the remaining, still not picked, items from the original data stream. This task is done by applying the PseudoRandomSequence with varying upper bound.
If you look at the way in which the pseudo-random sequence is generated, it uses the function to calculate the upper bound in every iteration of the loop. That upper bound, on the other hand, will always be the remaining number of elements to shuffle:
PseudoRandomSequence.Create(() => list.Count)
This line of code shows how simple it is to consume random numbers in a practical algorithm, even when the bounds are changing as the algorithm progresses.
The rest of the algorithm implementation is based on the idea that one element is removed from the list from a random position in every iteration. That is exactly what Fisher-Yates algorithm proposes. Since that process changes the list, I have opted to hide its behavior behind a private extension method StatefulRemove. This is because this method will be used in the LINQ operator, and therefore it should be side-effect-free.
It is a general rule that if a method used in LINQ produces side effects, then it must be hidden from public view. The public view must remain side-effect-free. Any state mutations that might be required to complete the operation should be hidden from the caller, so that all calls the caller might make will remain side-effect-free, as expected.
In this article we have demonstrated one very simple technique of wrapping the System.Random class into an infinite IEnumerable<int> sequence of pseudo-random numbers. We have avoided specialized API of the System.Random and hidden it behind a faceless IEnumerable<int> interface.
Another benefit of this solution is that it seeds the System.Random object from cryptographically secure stream. That makes all sequences of pseudo-random numbers isolated, which is useful in multi-threaded settings.
If you wish to learn more, please watch my latest video courses
In this course, you will learn the basic principles of object-oriented programming, and then learn how to apply those principles to construct an operational and correct code using the C# programming language and .NET.
As the course progresses, you will learn such programming concepts as objects, method resolution, polymorphism, object composition, class inheritance, object substitution, etc., but also the basic principles of object-oriented design and even project management, such as abstraction, dependency injection, open-closed principle, tell don't ask principle, the principles of agile software development and many more.
More...
In this course, you will learn how design patterns can be applied to make code better: flexible, short, readable.
You will learn how to decide when and which pattern to apply by formally analyzing the need to flex around specific axis.
More...
This course begins with examination of a realistic application, which is poorly factored and doesn't incorporate design patterns. It is nearly impossible to maintain and develop this application further, due to its poor structure and design.
As demonstration after demonstration will unfold, we will refactor this entire application, fitting many design patterns into place almost without effort. By the end of the course, you will know how code refactoring and design patterns can operate together, and help each other create great design.
More...
In four and a half hours of this course, you will learn how to control design of classes, design of complex algorithms, and how to recognize and implement data structures.
After completing this course, you will know how to develop a large and complex domain model, which you will be able to maintain and extend further. And, not to forget, the model you develop in this way will be correct and free of bugs.
More...
Zoran Horvat is the Principal Consultant at Coding Helmet, speaker and author of 100+ articles, and independent trainer on .NET technology stack. He can often be found speaking at conferences and user groups, promoting object-oriented and functional development style and clean coding practices and techniques that improve longevity of complex business applications.