http://www.codinghelmet.com/  

Wear a helmet. Even when coding.

exercises > counting-intersection-of-two-unsorted-arrays

Exercise #31: Counting Intersection of Two Unsorted Arrays
by Zoran Horvat @zoranh75

Problem Statement

Given two unsorted arrays of integer numbers, write a function which returns number of elements that exist in both arrays. Elements may be repeated in any of the arrays, and all common appearances should be counted.

Example: Suppose that the first array is 3, 1, 4, 1, 2 and the second array is 5, 1, 2, 4, 2, 1. Common elements of the two arrays are 1, 1 and 2, and therefore the function should return result 3.

Problem Analysis

In this problem, we have to traverse the two arrays and identify elements that are in common. This exercise is closely connected with Finding Intersection of Two Unsorted Arrays, in which we were actually constructing the array containing the common elements.

In that exercise, we have discussed several approaches:

  • Indexing solution – Pass through one array and put all its elements into a hash table. Then iterate through the other array and see if any element can be found in the hash table. If so, remove it from the hash table (to avoid future hits) and push it to the output.
  • Brute force solution – Compare any two elements of both arrays, while ensuring that elements that take part in any matching are not considered again.
  • Sorting solution – Sort both arrays and then progress through them and extract common elements.
  • Partial sorting solution – Divide and conquer both arrays around the same pivot values and then push common pivot occurrences to the output.

For more details about each of these methods, please refer to Finding Intersection of Two Unsorted Arrays exercise. In this article, we will provide solution which is based on partial sorting.

Partial Sorting Solution

As already indicated, this solution is based on divide-and-conquer idea. We would find a pivot in the first array and then partition the array around that pivot value. Then we would partition the second array around the same pivot value. In this way, we would split both arrays around the same value. The whole idea boils down to this sequence of steps:

1. Partition both arrays around the same pivot value

2. Count how many occurrences of the pivot value they have in common

3. Repeat recursively on left and on right partitions of both arrays, if non-empty

Once recursion unfolds, all calls will report how many pivot values they have in common. Overall result is then the sum of all partial results.

Below is the pseudo-code which implements this idea.

function CountIntersection(a, b, n, m)
    -- a - array
    -- b - array
    -- n - length of a
    -- m - length of b
begin
    return CountIntersectionRecursive(a, b, 1, n, 1, m)
end

function CountIntersectionRecursive(a, b, aStart, aEnd, bStart, bEnd)
    -- a - array
    -- b - array
    -- aStart - first position in a to intersect
    -- aEnd - last position in a to intersect
    -- bStart - first position in b to intersect
    -- bEnd - last position in b to intersect
begin

    pivot = MedianOfMedians(a, aStart, aEnd);

    Partition(a, aStart, aEnd, pivot, out aPivotStart, out aPivotLen)
    Partition(b, bStart, bEnd, pivot, out bPivotStart, out bPivotLen)

    count = min(aPivotLen, bPivotLen)

    -- Recursively solve left partitions
    if aPivotStart > aStart AND bPivotStart > bStart then
        count = count +
            CountIntersectionRecursive(a, b, aStart, aPivotStart - 1,
                                       bStart, bPivotStart - 1)

    -- Recursively solve right partitions
    if aPivotStrat + aPivotLen < aEnd AND bPivotStart + bPivotLen < bEnd then
        count = count +
            CountIntersectionRecursive(a, b, aPivotStart + aPivotLen, aEnd,
                                       bPivotStart + bPivotLen, bEnd)
    return count

end

Partition is a utility function which partitions the array around specified pivot value. MedianOfMedians is another utility function which calculates convenient pivot value using median of medians algorithm. You can read more about this algorithm in exercise Finding Kth Smallest Element in an Unsorted Array.

Implementation

Below is the complete Console application in C# which implements the intersection counting function.

using System;

namespace IntersectionCount
{

    public class Program
    {

        static int IntersectionCount(int[] a, int[] b)
        {
            return IntersectionCountRecursive(a, b,
                                              0, a.Length - 1,
                                              0, b.Length - 1);

        }

        static int IntersectionCountRecursive(int[] a, int[] b,
                                         int aStart, int aEnd,
                                         int bStart, int bEnd)
        {

            int pivot = MedianOfMedians(a, aStart, aEnd);

            int aPivotStart;
            int aPivotLen;
            int bPivotStart;
            int bPivotLen;

            Partition(a, aStart, aEnd, pivot,
                      out aPivotStart, out aPivotLen);

            Partition(b, bStart, bEnd, pivot,
                      out bPivotStart, out bPivotLen);

            int count = aPivotLen;
            if (bPivotLen < count)
                count = bPivotLen;

            // Recursively solve left partitions
            if (aPivotStart > aStart && bPivotStart > bStart)
                count += IntersectionCountRecursive(a, b,
                                                    aStart, aPivotStart - 1,
                                                    bStart, bPivotStart - 1);

            // Recursively solve right partitions
            if (aPivotStart + aPivotLen - 1 < aEnd &&
                bPivotStart + bPivotLen - 1 < bEnd)
                count += IntersectionCountRecursive(a, b,
                                                    aPivotStart + aPivotLen, aEnd,
                                                    bPivotStart + bPivotLen, bEnd);

            return count;

        }

        static void Partition(int[] a, int start, int end, int pivot,
                              out int pivotStart, out int pivotLen)
        {

            int i = start;

            while (i <= end)
            {

                if (a[i] < pivot)
                {
                    a[start++] = a[i++];
                }
                else if (a[i] == pivot)
                {
                    i++;
                }
                else
                {
                    int tmp = a[end];
                    a[end] = a[i];
                    a[i] = tmp;
                    end--;
                }

            }

            for (i = start; i <= end; i++)
                a[i] = pivot;

            pivotStart = start;
            pivotLen = end - start + 1;

        }

        static int MedianOfMedians(int[] a, int start, int end)
        {

            while (end > start + 1)
            {

                int pos = start;

                for (int segment = start; segment < end; segment += 5)
                {

                    int tmp;

                    int segmentEnd = segment + 5;
                    if (segmentEnd > end)
                        segmentEnd = end;

                    for (int i = segment; i < segmentEnd - 1; i++)
                        for (int j = i + 1; j < segmentEnd; j++)
                            if (a[i] > a[j])
                            {
                                tmp = a[i];
                                a[i] = a[j];
                                a[j] = tmp;
                            }

                    int middlePos = (segment + segmentEnd - 1)  2;
                    tmp = a[pos];
                    a[pos] = a[middlePos];
                    a[middlePos] = tmp;
                    pos++;

                }

                end = pos;

            }

            return a[start];

        }

        static int[] ReadArray()
        {

            Console.Write("Enter array elements (ENTER to exit): ");

            string[] elements =
                Console
                .ReadLine()
                .Split(new char[] { ' ' },
                       StringSplitOptions.RemoveEmptyEntries);

            int[] a = null;

            if (elements.Length > 0)
            {
                a = new int[elements.Length];
                for (int i = 0; i < elements.Length; i++)
                    a[i] = int.Parse(elements[i]);
            }

            return a;

        }

        static void Main(string[] args)
        {

            while (true)
            {

                int[] a = ReadArray();
                if (a == null)
                    break;

                int[] b = ReadArray();
                if (b == null)
                    break;

                int intersectionCount = IntersectionCount(a, b);

                Console.WriteLine("Arrays share {0} element(s).\n",
                                  intersectionCount);

            }

        }

    }
}

Demonstration

When Console application shown above is run and tried on a few arrays, it produces the following output:

Enter array elements (ENTER to exit): 3 1 4 1 2
Enter array elements (ENTER to exit): 5 1 2 4 2
Arrays share 3 element(s).

Enter array elements (ENTER to exit): 2 5 1 2 7 3
Enter array elements (ENTER to exit): 9 8 7 6 5 4
Arrays share 2 element(s).

Enter array elements (ENTER to exit): 5 5 5 5 5
Enter array elements (ENTER to exit): 4 4 4 4 4
Arrays share 0 element(s).

Enter array elements (ENTER to exit):

Discussion

Good thing about solution presented in this article is that analysis of one partition of an array terminates if the corresponding partition in the other array is empty. Searching for new common elements between the two arrays progresses only as long as there is hope to find more of them. This measure makes positive effect on the efficiency of the solution.

See also:

Published: Sep 17, 2015; Modified: Sep 16, 2015

ZORAN HORVAT

Zoran is software architect dedicated to clean design and CTO in a growing software company. Since 2014 Zoran is an author at Pluralsight where he is preparing a series of courses on object-oriented and functional design, design patterns, writing unit and integration tests and applying methods to improve code design and long-term maintainability.

Follow him on Twitter @zoranh75 to receive updates and links to new articles.

Watch Zoran's video courses at pluralsight.com (requires registration):

Making Your C# Code More Object-Oriented

This course will help leverage your conceptual understanding to produce proper object-oriented code, where objects will completely replace procedural code for the sake of flexibility and maintainability. More...

Advanced Defensive Programming Techniques

This course will lead you step by step through the process of developing defensive design practices, which can substitute common defensive coding, for the better of software design and implementation. More...

Tactical Design Patterns in .NET: Creating Objects

This course sheds light on issues that arise when implementing creational design patterns and then provides practical solutions that will make our code easier to write and more stable when running. More...

Tactical Design Patterns in .NET: Managing Responsibilities

Applying a design pattern to a real-world problem is not as straight-forward as literature implicitly tells us. It is a more engaged process. This course gives an insight to tactical decisions we need to make when applying design patterns that have to do with separating and implementing class responsibilities. More...

Tactical Design Patterns in .NET: Control Flow

Improve your skills in writing simpler and safer code by applying coding practices and design patterns that are affecting control flow. More...

Writing Highly Maintainable Unit Tests

This course will teach you how to develop maintainable and sustainable tests as your production code grows and develops. More...

Improving Testability Through Design

This course tackles the issues of designing a complex application so that it can be covered with high quality tests. More...

Share this article

webmasters