by Zoran Horvat
An array is given consisting of integer numbers. Each number except one appears exactly twice. The remaining number appears only once in the array. Write a function which returns a number which appears only once in the array.
Example: Suppose that array contains values 1, 4, 2, 1, 3, 4, 2. The function should return value 3, because that it occurs once in the array. Values 1, 2 and 4 occur two times each.
Keywords: Array, duplicate values.
If you are interested in similar exercise, you may read another article in which we are solving similar problem, with only difference being that there are two numbers that appear once in the array of duplicated integer numbers. This exercise can be found on page Finding Two Numbers That Appear Once in Array of Duplicated Numbers .
If we suppose that array is sorted, then solution to the problem would be very simple. We would traverse the array and check successive values in pairs. As soon as a pair of different values is reached, the first value in the pair is the one that occurs only once. If end of array is reached, then the last value in the array is the one we are looking for.
This solution cannot be applied directly because array is not sorted. In order to sort the array we need O(NlogN) operations. Once the array is sorted, traversal requires only O(N) steps. But is there a more efficient solution, such that solves the problem in O(N) steps total without previously sorting the array?
To find the efficient solution we need to somehow mimic the solution on sorted array. The key question here is which properties can be attributed to numbers that appear twice in the array, so that we can eliminate them while traversing the unsorted array even when those numbers do not appear in successive places. One such property would be to test parity of each number and to count how many even and odd numbers there are in the array. In example given in the problem statement there are total of seven numbers in the array, four of them even and three of them odd. What can we see from this result? Each odd number having a pair adds two occurrences to the count of odd elements; the same stands for even numbers. The one value without a pair, being even or odd in itself, adds only one to the corresponding count. So from even and odd counts we can figure out the parity of the solitary element. Having three odd numbers in the example array, we conclude that requested number must be odd. And really, we are looking for value 3, which is indeed odd.
This analysis still doesn't lead to the solution, but it is a good demonstration that some attributes of numbers are preserved during array traversal even when coinciding numbers are located far away from each others. We can go with this analysis further - ignoring the parity, we can divide all numbers by two and then count parities of the resulting numbers. When this result is combined with parity test, we obtain perfect information about what is the remainder when requested number is divided by four. Now this starts to take shape of a solution to the problem.
We can abstract the parity analysis by observing that we do not need to know exact counts of even and odd values. We only require the information which of the counts is odd. Even more, if number of odd numbers is odd itself, then the result is odd as well. Consequently, requested number's parity is equal to parity of a number obtained by only counting the odd numbers in the array! Below is the pseudo-code which answers the question what is the parity of the number appearing only once in the array.
a[0..N-1] - array containing integer values
parity = 0
for i = 0, N-1
if a[i] mod 2 = 1 then
parity = 1 - parity
This simple loop produces value 1 if number of odd values in the array is odd, and value 0 otherwise. In other words, variable parity will contain the parity of the number that appears exactly once in the array.
However, to make better use of this loop, we can perform modulo division and parity recalculation by using bitwise operations. Remember that modulo 2 division is equal to AND-ing the value with mask 1. Conversely, inverting the parity, i.e. replacing 0 with 1 and vice versa, can be achieved by XOR-ing current value with mask 1. Remember also that XOR-ing any value with mask 0 leaves the value unchanged. Now that we know all this, the loop above can be rewritten into more compact form:
parity = 0
for i = 0, N - 1
parity = parity XOR (a[i] AND 1)
Both XOR and AND are bitwise rather than Boolean operations. In this way we have produced a variable parity with lowest bit equal to the lowest bit of the number which should be isolated from the array. Notice how we have gradually shifted from the parity analysis into the specter of bitwise operations. These two areas are closely related because parity of an integer number is indicated by value of its least significant binary digit. By performing all operations only on the least significant bit of the parity indicator produces the same result as all the arithmetic operations that relied on modulo 2 division. These bitwise operations prove their true value when we ask what is the value of the second least significant bit of the singular value in the array. This result is obtained the same as in case of least significant bit, only mask 2 is used to extract the second lowest bit:
parity1 = 0
for i = 0, N-1
parity1 = parity1 XOR (a[i] AND 2)
Two parity bits extracted this far can be combined as simply as parity OR parity1, which now turns into a value equal to the requested number modulo 4. Notice that the same value can be reached with only one loop, by using a combination of the two masks - mask with value 3. The reason is simple: having all the bitwise operations along the route operate on distinct bits in the result, there is no fear that parity of one bit will affect the observed parity of another bit.
And this leads to the final solution. If we completely remove the mask, then we will obtain the loop which simultaneously tracks parities of all bits in the number which appears only once in the array. When the loop terminates, parity indicator will have all of its bits equal to corresponding bits in the requested number. In other words, parity indicator will be the requested number itself! Here is the final pseudo-code:
function Singular(a, N)
value = 0
for i = 0, N-1
value = value XOR a[i]
return value
end
Once again, the XOR operation used is the bitwise operation between the two operands.
Now that we have produced the final solution, we can take a look at it the other way around. We are simply XOR-ing all the numbers in the array, and that operation magically produces the number that appears exactly once in the array. The reason why this method works comes from the fact that all other numbers appear exactly twice each. And, whenever a number is XOR-ed with itself, value 0 is produced. And when any number is XOR-ed with zero, it remains unchanged. Therefore, second appearance of any number simply removes the first appearance of the same number from the cumulative result. What remains in the end is only the number which doesn't appear the second time and, consequently, could not score out its first appearance. This explains why the simple loop produced above works correctly. On a related note, we could start from this short analysis and come up with exactly the same code for the solution.
Now that we have produced the pseudo-code which solves the problem, the corresponding C# function can be written with very little effort:
int Singular(int[] a)
{
int value = 0;
for (int i = 0; i < a.Length; i++)
value = value ^ a[i];
return value;
}
We can use the Singular function to analyze the array given as an example in the problem statement section:
int[] a = new int[] { 1, 4, 2, 1, 3, 4, 2 };
int s = Singular(a);
Console.WriteLine("Value that appears once is {0}.", s);
This code produces output as expected:
Value that appears once is 3.
Given the array containing strings rather than integers, can the XOR-ing technique be applied to find the string which appears once, as opposed to all other strings that appear exactly twice each? Note that strings are not necessarily of the same length.
Try to solve modified problem, in which there are two distinct numbers that appear once each, while all other numbers appear exactly twice each. This exercise is covered on page Finding Two Numbers That Appear Once in Array of Duplicated Numbers .
If you wish to learn more, please watch my latest video courses
In this course, you will learn the basic principles of object-oriented programming, and then learn how to apply those principles to construct an operational and correct code using the C# programming language and .NET.
As the course progresses, you will learn such programming concepts as objects, method resolution, polymorphism, object composition, class inheritance, object substitution, etc., but also the basic principles of object-oriented design and even project management, such as abstraction, dependency injection, open-closed principle, tell don't ask principle, the principles of agile software development and many more.
More...
In this course, you will learn how design patterns can be applied to make code better: flexible, short, readable.
You will learn how to decide when and which pattern to apply by formally analyzing the need to flex around specific axis.
More...
This course begins with examination of a realistic application, which is poorly factored and doesn't incorporate design patterns. It is nearly impossible to maintain and develop this application further, due to its poor structure and design.
As demonstration after demonstration will unfold, we will refactor this entire application, fitting many design patterns into place almost without effort. By the end of the course, you will know how code refactoring and design patterns can operate together, and help each other create great design.
More...
In four and a half hours of this course, you will learn how to control design of classes, design of complex algorithms, and how to recognize and implement data structures.
After completing this course, you will know how to develop a large and complex domain model, which you will be able to maintain and extend further. And, not to forget, the model you develop in this way will be correct and free of bugs.
More...
Zoran Horvat is the Principal Consultant at Coding Helmet, speaker and author of 100+ articles, and independent trainer on .NET technology stack. He can often be found speaking at conferences and user groups, promoting object-oriented and functional development style and clean coding practices and techniques that improve longevity of complex business applications.