http://www.codinghelmet.com/  

Wear a helmet. Even when coding.

exercises > array-pagination

Exercise #10: Paginating an Unsorted Array
by Zoran Horvat @zoranh75

Problem Statement

Given an unsorted array containing N integer numbers, write a function which extracts one page from sorted version of the array. Page is a range of elements occupying positions M, M+1, M+2, ... , M+P-1 in the array, where M marks beginning of the page (0 <= M < N) and P is the page size (M+P-1 < N). Function should place elements belonging to the requested page to their dedicated locations, so that the requested page is sorted when function completes.

Example: Suppose that array consists of numbers 6, 2, 5, 7, 8, 3, 1, 4, 9. Pages consist of four elements each (except the last one) and we require second page (M=4, P=4). After modifying the array, numbers 5, 6, 7 and 8 should be located starting from position 4. For example, this array content is a legitimate solution: 9, 2, 1, 3, 5, 6, 7, 8, 4. Note that other values in the array are not relevant, as long as the second page is sorted and in place.

Keywords: Array, sorting, pagination.

Problem Analysis

This is obviously a sorting problem and it is trivially solved by sorting the whole array. In that case all elements will be in their proper places in the sorted order, and so will be the requested page. But this solution does more than it is asked for. It prepares all other pages as well, pages into which the caller is not really interested. Imagine an array containing millions of numbers and a call made to the function to sort only elements on positions 50-59, inclusive. This would be the sixth page, where each page contains ten elements. There is no point in sorting the remaining 999,990 numbers in the array. One might think that the caller is likely to call the function many times again, asking for all the other pages, so why not sorting the whole array immediately? The answer is that, when things come to large arrays, it often happens that the caller is interested only in a tiny fraction of data contained.

As an illustration, consider a problem of querying a search engine with a relatively general query. The engine might detect millions of Web pages that satisfy the query, but only the first couple of them (say, ten links) will be communicated back to the customer. When customer asks for the second page with results, only then will the search engine bother to sort the following ten links. It rarely happens that any Web user walks more than a couple of pages at length before finding the desired document (or modifying the search query). Sorting the whole result set would be a tremendous waste of resources on the server because user is eventually going to leave without looking into anything beyond the leading few pages.

The conclusion is that efficient function should focus only on the requested range of elements. Any operation that does not get us nearer to the goal of bounding that range should be avoided. One way to approach the problem is to modify the sorting algorithm so that it only partially sorts the array. Good candidate for this idea is the QUICKSORT algorithm. Its partitioning step could eliminate quite a lot of elements that fall below or above the requested page. The following picture shows how to extract the second page containing three elements from a somewhat longer array than the one in the previous example.

In this example we have not fully sorted the array. When deciding whether to sort elements that fall to one specific side from the pivot element (left or right), we check whether elements on that side overlap with requested positions 3, 4 and 5. If not, then there is no point in sorting that range of the array – such discarded elements are shaded gray in the figure above. As process continues, shaded parts of the array grow larger, until only the requested page remains for sorting. Some of the shaded elements are indeed sorted, but that is the pure coincidence. There is no intention to sort anything below or above the page.

This solution is efficient because it tends to identify elements that do not belong to the requested page and to discard them completely from further steps. Note that shaded parts of the array are not sorted at the end of the program execution. This is consequence of the fact that they are ignored by the algorithm.

Here is the pseudo-code which solves the pagination problem:

function Paginate(a, n, m, p)
a – unsorted array
n – number of elements in the array
m – position of the first element in the page
p – page size
begin
    PaginateRecursive(a, 0, n – 1, m, p)
end

function PaginateRecursive(a, left, right, m, p)
a – unsorted array
left – position of the first element in current scope
right – position of the last element in current scope
m – position of the first element in the page
p – page size
begin

    pivotPos = Partition(a, left, right) -- Partition function from QUICKSORT

    -- Recursively paginate elements to the left of the pivot
    --     if left-hand partition overlaps with requested page
    -- These elements are located between left and pivotPos-1, inclusively
    if (left < m + p AND pivotPos – 1 >= m AND pivotPos > left)
        PaginateRecursive(a, left, pivotPos – 1, m, p)

    -- Recursively paginate elements to the right of the pivot
    --     if right-hand partition overlaps with requested page
    -- These elements are located between pivotPos+1 and right, inclusively
    if (pivotPos + 1 < m + p AND right >= m AND pivotPos < right)
        PaginateRecursive(a, pivotPos + 1, right, m, p)

end

The main function is Paginate – it receives the array, its length and information about the page requested. But this function merely delegates the call to the recursive implementation (PaginateRecursive function), which in turn receives the bounds within the array which are currently in scope. This function relies on the Partition function from QUICKSORT algorithm, for which any implementation will suffice. For example, partitioning could be implemented like this:

function Partition(a, left, right)
a – unsorted array
left – index of the first element of the range to partition
right – index of the last element of the range to partition
begin

    pivot = a[left]
    pos = left + 1

    while pos <= right
        begin
            if a[pos] < pivot then
                begin
                    a[left] = a[pos]
                    left = left + 1
                    pos = pos + 1
                end
            else
                begin
                    tmp = a[right]
                    a[right] = a[pos]
                    a[pos] = tmp
                    right = right - 1
                end
        end

    a[left] = pivot

    return left

end

In this implementation the first element in the range is picked up to be the pivot. After that, all elements that are strictly less than the pivot are moved towards the beginning of the range and all elements greater or equal to the pivot are moved towards the end of the range. Finally, pivot is placed in between and its new position is returned from the function.

Note that this implementation of the partitioning function is not very efficient. It may lead to high worst case running time in case when pivot falls to one or the end of the partitioned range. Readers are advised to provide the Partition function implementation with better pivot selection. Anyway, with this function in place, we are ready to provide the complete pagination solution.

Implementation

Below is implementation for the Paginate function and the two helper functions it relies on.

void Paginate(int[] a, int m, int p)
{
    PaginateRecursive(a, 0, a.Length - 1, m, p);
}

void PaginateRecursive(int[] a, int left, int right, int m, int p)
{

    int pivotPos = Partition(a, left, right);

    if (left < m + p && pivotPos - 1 >= m && pivotPos > left)
        PaginateRecursive(a, left, pivotPos - 1, m, p);

    if (pivotPos + 1 < m + p && right >= m && pivotPos < right)
        PaginateRecursive(a, pivotPos + 1, right, m, p);

}

int Partition(int[] a, int left, int right)
{

    int pivot = a[left];
    int pos = left + 1;

    while (pos <= right)
    {
        if (a[pos] < pivot)
        {
            a[left++] = a[pos++];
        }
        else
        {
            int tmp = a[right];
            a[right] = a[pos];
            a[pos] = tmp;
            right--;
        }
    }

    a[left] = pivot;

    return left;

}

Demonstration

In this section we are providing the complete source code of a console application which demonstrates using the Paginate function.

using System;

namespace Pagination
{
    class Program
    {

        private static Random _rnd = new Random();

        public static void ShuffleArray(int[] a)
        {
            for (int i = a.Length - 1; i > 0; i--)
            {
                int index = _rnd.Next(i + 1);
                int tmp = a[i];
                a[i] = a[index];
                a[index] = tmp;
            }
        }

        public static int[] GenerateArray(int n)
        {

            int[] a = new int[n];

            for (int i = 0; i < n; i++)
                a[i] = i + 1;

            ShuffleArray(a);

            return a;

        }

        public static void Paginate(int[] a, int m, int p) { ... }

        public static void PaginateRecursive(int[] a, int left, int right,
                                             int m, int p) { ... }

        public static int Partition(int[] a, int left, int right) { ... }

        public static void PrintArray(int[] a, int m, int p)
        {

            for (int i = 0; i < a.Length; i++)
                Console.Write("{0}{1,2}{2}",
                                i == m ? "[" : " ",
                                a[i],
                                i == m + p - 1 ? "]" : " ");

            if (m + p > a.Length)
                Console.Write("]");

            Console.WriteLine();

        }

        static void Main(string[] args)
        {

            while (true)
            {

                Console.Write("n=");
                int n = int.Parse(Console.ReadLine());

                if (n <= 0)
                    break;

                Console.Write("m=");
                int m = int.Parse(Console.ReadLine());

                Console.Write("p=");
                int p = int.Parse(Console.ReadLine());

                int[] a = GenerateArray(n);

                Paginate(a, m, p);

                PrintArray(a, m, p);

            }

            Console.Write("Press ENTER to continue... ");
            Console.ReadLine();

        }

    }
}

When application is run, it produces output like this:

n=10
m=3
p=3
  1   3   2 [ 4   5   6]  7   8  10   9
n=15
m=8
p=4
  4   3   2   1   5   6   7   8 [ 9  10  11  12] 13  15  14
n=20
m=3
p=3
  1   3   2 [ 4   5   6]  7   8   9  10  11  12  20  15  16  19  14  17  18  13
n=0
Press ENTER to continue...

Follow-Up Exercises

Readers are advised to modify existing solution and implement a function which sorts next page in a sequence, assuming that all previous pages, i.e. pages with smaller starting positions, are already sorted. Write a function with this signature:

void PaginateNext(int[] a, int m, int p)

All elements of the array in range 0 to m-1 are already sorted. On output, function should ensure that all elements on positions 0 to m+p-1 are sorted.

See also:

Published: May 24, 2013

ZORAN HORVAT

Zoran is software architect dedicated to clean design and CTO in a growing software company. Since 2014 Zoran is an author at Pluralsight where he is preparing a series of courses on design patterns, writing unit and integration tests and applying methods to improve code design and long-term maintainability.

Follow him on Twitter @zoranh75 to receive updates and links to new articles.

Watch Zoran's video courses at pluralsight.com (requires registration):

Tactical Design Patterns in .NET: Managing Responsibilities

Applying a design pattern to a real-world problem is not as straightforward as literature implicitly tells us. It is a more engaged process. This course gives an insight into tactical decisions we need to make when applying design patterns that have to do with separating and implementing class responsibilities. More...

Tactical Design Patterns in .NET: Control Flow

Improve your skills in writing simpler and safer code by applying coding practices and design patterns that are affecting control flow. More...

Improving Testability Through Design

This course tackles the issues of designing a complex application so that it can be covered with high quality tests. More...

Share this article

webmasters