by Zoran Horvat
Given an unsorted array of N integers (N > 0) and an integer value v, write the function that returns zero-based index in the array at which the value v is located. Function should return negative value if array does not contain v. In case that there are multiple occurrences of v within the array, return smallest index at which v occurs.
Example 1: Suppose that array is: 7, 2, 6, 1, 8, 4, 5. For v=6 function returns 2. For v=3, function returns negative value (e.g. -1).
Keywords: Array search, index of value, index in array.
Searching through an unsorted array looks like a simple task - we just iterate through the array and return an element index as soon as the match is found. Alternatively, if end of array is reached, failure status is returned. Here is the pseudocode of the function:
function Search(a, v)
begin
a - array containing n elements
v - value to be found in the array
for i = 0 to n - 1
if a[i] = v then
return i
return -1
end
Now that we have the straight-forward solution, we can think of some improvements. To see the enhancement, we should observe the loop, which is the most demanding part of the function. Try to count operations that must be executed in order to complete the loop.
Here is the loop from the function above, stripped to the level of most basic instructions:
i = 0
loop: if i >= n then jump out
if a[i] != v then jump next
return i
next: i = i + 1
jump loop
out: return -1
This reveals that loop consists of two conditional jumps, one unconditional jump, one increment operation and one return statement. Note that return statement is executed at most once, so it can be dropped from this analysis. The heaviest instructions in the loop are three jumps and the increment operation. Now we should figure out how to eliminate at least one of these instructions. First conditional jump tests whether end of the array has been reached. Second conditional jump tests whether requested element was found in the array. Increment instruction is executed in order to move forward through the array. Finally, unconditional jump is executed to return back to the beginning of the loop block. Which one can be removed?
Increment operation cannot be removed simply because we have to progress through the array, as well as the unconditional jump at the end of the loop block. Conditional jump which tests array elements is also mandatory because we do not know which element equals the requested value, if any. Finally, conditional jump which tests whether we have stepped out of the array is the last resort. That statement is present because we do not know whether we are still within bounds of the array or not. In order to remove that statement from the array, we must ensure that we never exit the array, which can only happen if requested value exists somewhere inside it - condition which is not guaranteed in advance. However, we can fix that condition: just put the requested value at the end of the array and we have eliminated the need to test whether we have iterated outside bounds of the array. Here is the modified implementation:
tmp = a[n - 1]
a[n - 1] = v
pos = 0
while a[pos] != v
pos = pos + 1
a[n - 1] = tmp
if a[pos] = v then
return pos
return -1
In this way, we have tweaked the system which we are examining (array in this case). Modified system better suits our needs and lets us write more efficient solution. However, once the loop is executed, we must return the array in the original state. Only after that is accomplished can we test whether the requested element was found in the array or not and, depending on the outcome, return one result or the other.
Here is the C# implementation of enhanced searching function:
int Search(int[] array, int v)
{
int tmp = array[array.Length - 1];
array[array.Length - 1] = v;
int pos = -1;
while (array[++pos] != v);
array[array.Length - 1] = tmp;
if (array[pos] == v)
return pos;
return -1;
}
This function is faster than the simple implementation:
int SearchSlow(int[] array, int v)
{
for (int i = 0; i < array.Length; i++)
if (array[i] == v)
return i;
return -1;
}
When performance of these two functions is compared on very large arrays, containing millions of numbers, the first implementation runs in time by 25% shorter than time required by the "slow" function to complete. Same results are obtained when value is present in the array, as well as in cases when array does not contain the requested value. This proves the thesis that we have actually saved one out of four instructions in the loop by removing the array bounds test.
Here is the complete console application which can be used to demonstrate how Search function works:
using System;
namespace ArraySearch
{
class Program
{
static Random _rnd = new Random();
static int[] Generate(int n)
{
int[] a = new int[n];
for (int i = 0; i < n; i++)
a[i] = _rnd.Next(100);
return a;
}
static int Search(int[] array, int v) { ... }
static void Main(string[] args)
{
while (true)
{
Console.Write("n=");
int n = int.Parse(Console.ReadLine());
if (n <= 0) break;
Console.Write("v=");
int v = int.Parse(Console.ReadLine());
int[] a = Generate(n);
int pos = Search(a, v);
if (pos < 0)
Console.Write("{0} not found in array ", v);
else
Console.Write("{0} found at position {1} in array ", v, pos);
for (int i = 0; i < a.Length; i++)
Console.Write("{0,5}", a[i]);
Console.WriteLine();
}
Console.Write("Press ENTER to continue... ");
Console.ReadLine();
}
}
}
And here is the output produced when this application was run:
n=10
v=10
10 not found in array 19 0 12 18 9 18 14 9 1 11
n=17
v=12
12 found at position 13 in array 0 13 9 4 19 17 9 5 10 15
15 9 6 12 17 5 17
n=14
v=19
19 found at position 12 in array 13 13 11 17 3 4 3 2 4 5
18 9 19 0
n=6
v=18
18 found at position 2 in array 11 12 18 18 17 9
n=9
v=14
14 not found in array 0 3 11 13 5 3 0 1 5
n=5
v=17
17 not found in array 0 10 4 16 18
n=0
Press ENTER to continue...
If you wish to learn more, please watch my latest video courses
In this course, you will learn the basic principles of object-oriented programming, and then learn how to apply those principles to construct an operational and correct code using the C# programming language and .NET.
As the course progresses, you will learn such programming concepts as objects, method resolution, polymorphism, object composition, class inheritance, object substitution, etc., but also the basic principles of object-oriented design and even project management, such as abstraction, dependency injection, open-closed principle, tell don't ask principle, the principles of agile software development and many more.
More...
In this course, you will learn how design patterns can be applied to make code better: flexible, short, readable.
You will learn how to decide when and which pattern to apply by formally analyzing the need to flex around specific axis.
More...
This course begins with examination of a realistic application, which is poorly factored and doesn't incorporate design patterns. It is nearly impossible to maintain and develop this application further, due to its poor structure and design.
As demonstration after demonstration will unfold, we will refactor this entire application, fitting many design patterns into place almost without effort. By the end of the course, you will know how code refactoring and design patterns can operate together, and help each other create great design.
More...
In four and a half hours of this course, you will learn how to control design of classes, design of complex algorithms, and how to recognize and implement data structures.
After completing this course, you will know how to develop a large and complex domain model, which you will be able to maintain and extend further. And, not to forget, the model you develop in this way will be correct and free of bugs.
More...
Zoran Horvat is the Principal Consultant at Coding Helmet, speaker and author of 100+ articles, and independent trainer on .NET technology stack. He can often be found speaking at conferences and user groups, promoting object-oriented and functional development style and clean coding practices and techniques that improve longevity of complex business applications.