Exercise #10: Paginating an Unsorted Array by Zoran Horvat @zoranh75
Problem Statement
Given an unsorted array containing N integer numbers, write a function which extracts one page from sorted version
of the array. Page is a range of elements occupying positions M, M+1, M+2, ... , M+P1 in the array, where M marks beginning of the page (0 <= M < N) and P is the page size (M+P1 < N). Function should place elements belonging to the requested page to their
dedicated locations, so that the requested page is sorted when function
completes.
Example: Suppose that array consists of numbers 6, 2, 5, 7, 8, 3, 1, 4, 9.
Pages consist of four elements each (except the last one) and we require second
page (M=4, P=4). After modifying the array, numbers 5, 6, 7 and 8 should be located starting
from position 4. For example, this array content is a legitimate solution: 9,
2, 1, 3, 5, 6, 7, 8, 4. Note that other values in the array are not relevant,
as long as the second page is sorted and in place.
Keywords: Array, sorting, pagination.
Problem Analysis
This is obviously a sorting problem and it is trivially solved by sorting the
whole array. In that case all elements will be in their proper places in the
sorted order, and so will be the requested page. But this solution does more
than it is asked for. It prepares all other pages as well, pages into which the
caller is not really interested. Imagine an array containing millions of
numbers and a call made to the function to sort only elements on positions
5059, inclusive. This would be the sixth page, where each page contains ten
elements. There is no point in sorting the remaining 999,990 numbers in the
array. One might think that the caller is likely to call the function many
times again, asking for all the other pages, so why not sorting the whole array
immediately? The answer is that, when things come to large arrays, it often
happens that the caller is interested only in a tiny fraction of data
contained.
As an illustration, consider a problem of querying a search engine with a
relatively general query. The engine might detect millions of Web pages that
satisfy the query, but only the first couple of them (say, ten links) will be
communicated back to the customer. When customer asks for the second page with
results, only then will the search engine bother to sort the following ten
links. It rarely happens that any Web user walks more than a couple of pages at
length before finding the desired document (or modifying the search query).
Sorting the whole result set would be a tremendous waste of resources on the
server because user is eventually going to leave without looking into anything
beyond the leading few pages.
The conclusion is that efficient function should focus only on the requested
range of elements. Any operation that does not get us nearer to the goal of
bounding that range should be avoided. One way to approach the problem is to
modify the sorting algorithm so that it only partially sorts the array. Good
candidate for this idea is the QUICKSORT algorithm. Its partitioning step could eliminate quite a lot of elements that
fall below or above the requested page. The following picture shows how to
extract the second page containing three elements from a somewhat longer array
than the one in the previous example.
In this example we have not fully sorted the array. When deciding whether to
sort elements that fall to one specific side from the pivot element (left or
right), we check whether elements on that side overlap with requested positions
3, 4 and 5. If not, then there is no point in sorting that range of the array –
such discarded elements are shaded gray in the figure above. As process
continues, shaded parts of the array grow larger, until only the requested page
remains for sorting. Some of the shaded elements are indeed sorted, but that is
the pure coincidence. There is no intention to sort anything below or above the
page.
This solution is efficient because it tends to identify elements that do not
belong to the requested page and to discard them completely from further steps.
Note that shaded parts of the array are not sorted at the end of the program
execution. This is consequence of the fact that they are ignored by the
algorithm.
Here is the pseudocode which solves the pagination problem:
function Paginate(a, n, m, p)
a – unsorted array
n – number of elements in the array
m – position of the first element in the page
p – page size
begin
PaginateRecursive(a, 0, n – 1, m, p)
end
function PaginateRecursive(a, left, right, m, p)
a – unsorted array
left – position of the first element in current scope
right – position of the last element in current scope
m – position of the first element in the page
p – page size
begin
pivotPos = Partition(a, left, right)  Partition function from QUICKSORT
 Recursively paginate elements to the left of the pivot
 if lefthand partition overlaps with requested page
 These elements are located between left and pivotPos1, inclusively
if (left < m + p AND pivotPos – 1 >= m AND pivotPos > left)
PaginateRecursive(a, left, pivotPos – 1, m, p)
 Recursively paginate elements to the right of the pivot
 if righthand partition overlaps with requested page
 These elements are located between pivotPos+1 and right, inclusively
if (pivotPos + 1 < m + p AND right >= m AND pivotPos < right)
PaginateRecursive(a, pivotPos + 1, right, m, p)
end
The main function is Paginate – it receives the array, its length and
information about the page requested. But this function merely delegates the
call to the recursive implementation (PaginateRecursive function), which in
turn receives the bounds within the array which are currently in scope. This
function relies on the Partition function from QUICKSORT algorithm, for which any implementation will suffice. For example,
partitioning could be implemented like this:
function Partition(a, left, right)
a – unsorted array
left – index of the first element of the range to partition
right – index of the last element of the range to partition
begin
pivot = a[left]
pos = left + 1
while pos <= right
begin
if a[pos] < pivot then
begin
a[left] = a[pos]
left = left + 1
pos = pos + 1
end
else
begin
tmp = a[right]
a[right] = a[pos]
a[pos] = tmp
right = right  1
end
end
a[left] = pivot
return left
end
In this implementation the first element in the range is picked up to be the
pivot. After that, all elements that are strictly less than the pivot are moved
towards the beginning of the range and all elements greater or equal to the
pivot are moved towards the end of the range. Finally, pivot is placed in
between and its new position is returned from the function.
Note that this implementation of the partitioning function is not very
efficient. It may lead to high worst case running time in case when pivot falls
to one or the end of the partitioned range. Readers are advised to provide the
Partition function implementation with better pivot selection. Anyway, with
this function in place, we are ready to provide the complete pagination
solution.
Implementation
Below is implementation for the Paginate function and the two helper functions it relies on.
void Paginate(int[] a, int m, int p)
{
PaginateRecursive(a, 0, a.Length  1, m, p);
}
void PaginateRecursive(int[] a, int left, int right, int m, int p)
{
int pivotPos = Partition(a, left, right);
if (left < m + p && pivotPos  1 >= m && pivotPos > left)
PaginateRecursive(a, left, pivotPos  1, m, p);
if (pivotPos + 1 < m + p && right >= m && pivotPos < right)
PaginateRecursive(a, pivotPos + 1, right, m, p);
}
int Partition(int[] a, int left, int right)
{
int pivot = a[left];
int pos = left + 1;
while (pos <= right)
{
if (a[pos] < pivot)
{
a[left++] = a[pos++];
}
else
{
int tmp = a[right];
a[right] = a[pos];
a[pos] = tmp;
right;
}
}
a[left] = pivot;
return left;
}
Demonstration
In this section we are providing the complete source code of a console
application which demonstrates using the Paginate function.
using System;
namespace Pagination
{
class Program
{
private static Random _rnd = new Random();
public static void ShuffleArray(int[] a)
{
for (int i = a.Length  1; i > 0; i)
{
int index = _rnd.Next(i + 1);
int tmp = a[i];
a[i] = a[index];
a[index] = tmp;
}
}
public static int[] GenerateArray(int n)
{
int[] a = new int[n];
for (int i = 0; i < n; i++)
a[i] = i + 1;
ShuffleArray(a);
return a;
}
public static void Paginate(int[] a, int m, int p) { ... }
public static void PaginateRecursive(int[] a, int left, int right,
int m, int p) { ... }
public static int Partition(int[] a, int left, int right) { ... }
public static void PrintArray(int[] a, int m, int p)
{
for (int i = 0; i < a.Length; i++)
Console.Write("{0}{1,2}{2}",
i == m ? "[" : " ",
a[i],
i == m + p  1 ? "]" : " ");
if (m + p > a.Length)
Console.Write("]");
Console.WriteLine();
}
static void Main(string[] args)
{
while (true)
{
Console.Write("n=");
int n = int.Parse(Console.ReadLine());
if (n <= 0)
break;
Console.Write("m=");
int m = int.Parse(Console.ReadLine());
Console.Write("p=");
int p = int.Parse(Console.ReadLine());
int[] a = GenerateArray(n);
Paginate(a, m, p);
PrintArray(a, m, p);
}
Console.Write("Press ENTER to continue... ");
Console.ReadLine();
}
}
}
When application is run, it produces output like this:
n=10
m=3
p=3
1 3 2 [ 4 5 6] 7 8 10 9
n=15
m=8
p=4
4 3 2 1 5 6 7 8 [ 9 10 11 12] 13 15 14
n=20
m=3
p=3
1 3 2 [ 4 5 6] 7 8 9 10 11 12 20 15 16 19 14 17 18 13
n=0
Press ENTER to continue...
FollowUp Exercises
Readers are advised to modify existing solution and implement a function which
sorts next page in a sequence, assuming that all previous pages, i.e. pages
with smaller starting positions, are already sorted. Write a function with this
signature:
void PaginateNext(int[] a, int m, int p)
All elements of the array in range 0 to m1 are already sorted. On output, function should ensure that all elements on
positions 0 to m+p1 are sorted.
See also:
Published: May 24, 2013
ZORAN HORVAT Zoran is software architect dedicated to clean design and CTO in a growing software company. Since 2014 Zoran is an author at Pluralsight where he is preparing a series of courses on design patterns, writing unit and integration tests and applying methods to improve code design and longterm maintainability. Follow him on Twitter @zoranh75 to receive updates and links to new articles. Watch Zoran's video courses at pluralsight.com (requires registration): Tactical Design Patterns in .NET: Managing Responsibilities Applying a design pattern to a realworld problem is not as straightforward as literature implicitly tells us. It is a more engaged process. This course gives an insight into tactical decisions we need to make when applying design patterns that have to do with separating and implementing class responsibilities. More... Tactical Design Patterns in .NET: Control Flow Improve your skills in writing simpler and safer code by applying coding practices and design patterns that are affecting control flow. More... Improving Testability Through Design This course tackles the issues of designing a complex application so that it can be covered with high quality tests. More... Share this article
