Problem: Given K sorted arrays of size N each, merge them and print the sorted output.

Sample Input-1:

K = 3, N = 4

arr[][] = { {1, 3, 5, 7},
            {2, 4, 6, 8},
            {0, 9, 10, 11} };


Sample Output-1: 

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

I know there is a way to do this problem using a priority queue/min heap, but I want to do it using the merge procedure from mergeSort. The idea seems straightforward enough...at each iteration, merge the remaining arrays in groups of two, such that the number of arrays gets halved at each iteration.

However, whenever halving leads to an odd number, this becomes problematic. My idea is that whenever halving leads to an odd number, we take care of the extra array by merging it with the array formed from the last merge.

The code I have so far is below. This only works on one out of 30 test cases, however:

static int[] mergeArrays(int[][] arr) {
        int k = arr.length;
        int n = arr[0].length;
        if(k < 2){
            return arr[0];
        }

        boolean odd_k;
        if(k%2){
            odd_k = false;
        }
        else{
            odd_k = true;
        }

        while(k > 1){
            int o;
            if(odd_k){
                o = (k/2) + 1;
            }
            else{
                o = k/2;
            }
            int[][] out = new int[o][];

            for(int i=0; i < k; i = i + 2){
                int[] a;
                int[] b;
                if(odd_k && i == (k-1)){
                    b = arr[i];
                    b = out[i-1];
                }
                else{
                    a = arr[i];
                    b = arr[i+1];
                }
                out[i] = mergeTwo(a, b);
            }
            k = k/2;
            if(k % 2 == 0){
                odd_k = false;
            }
            else{
                odd_k = true;
            }

            arr = out;
        }
        return arr[0];

    }

    static int[] mergeTwo(int[] a, int[] b){
        int[] c = new int[a.length + b.length];
        int i, j, k;
        i = j = k = 0;
       while(i < a.length && j < b.length){
           if(a[i] < b[j]){
               c[k] = a[i];
               i++;
               k++;
           }
           else{
               c[k] = b[j];
               j++; k++;
            }
       }
       if(i < a.length){
           while(i < a.length){
               c[k] = a[i];
               i++; k++;
           }
       }
       if(j < b.length){
           while(j < b.length){
               c[k] = b[j];
               j++; k++;
           }
       }
       return c;
    }
  • if(k%2) and b = out[i-1] are probably what you did wrong. Commented Dec 23, 2018 at 7:28
  • You can merge more than two arrays at a time if you want. Of course it will require an index into each array and a loop to find the next element to put into the merged result. Commented Dec 23, 2018 at 7:36
  • @OleV.V. - For a non-external sort, there is little benefit from doing more than a 2 way merge. See my comment to Tharaka Ratnayake answer. Commented Dec 23, 2018 at 20:52

3 Answers

1

We can shorten your mergeTwo implementation:

static int[] mergeTwo(int[] a, int[] b) {
    int[] c = new int[a.length + b.length];
    int i = 0, j = 0, k = 0; // declare and initialize on one line
    while (i < a.length && j < b.length) {
        if (a[i] <= b[j]) {
            c[k++] = a[i++]; // increment and assign
        } else {
            c[k++] = b[j++]; // increment and assign
        }
    }
    // No need for extra if(s)
    while (i < a.length) {
        c[k++] = a[i++];
    }
    while (j < b.length) {
        c[k++] = b[j++];
    }
    return c;
}

We can then fix your mergeArrays and shorten it by starting with the first row of the int[][] and using mergeTwo to merge each remaining row into the running result, one at a time:

static int[] mergeArrays(int[][] arr) {
    int[] t = arr[0];
    for (int i = 1; i < arr.length; i++) {
        t = mergeTwo(t, arr[i]);
    }
    return t;
}

I then tested it with the following (this needs import java.util.Arrays;)

public static void main(String[] args) {
    int arr[][] = { { 1, 3, 5, 7 }, { 2, 4, 6, 8 }, { 0, 9, 10, 11 } };
    System.out.println(Arrays.toString(mergeArrays(arr)));
}

And I get (as expected)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
2 Comments

That was my initial implementation, but if you look at it closely, it is not efficient. It merges the output of each merge with the next array, so it would be O(k^2 n). I am trying to merge in groups of two so that it's O(kn log k).
@ohbrobig - despite the better time complexity, ignoring loop overhead, either approach is about the same number of compares and moves. For your suggested approach, for each pass, you could create a matrix[(arr.length+1)/2][arr[0].length*2], merge even and odd rows into that matrix (or copy a row if there is no row to merge it with, which is the odd number of rows case), then repeat using that matrix as input for the next pass.
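
The pass-by-pass idea from the comment above can be sketched roughly as follows. This is only an illustration, not the commenter's code: the class name PairwiseMerge is made up, and it uses a jagged int[][] per pass instead of a fixed-size matrix, so that a leftover row in an odd-sized pass can be carried forward by reference rather than copied.

```java
final class PairwiseMerge {
    // Standard two-way merge of two sorted arrays.
    static int[] mergeTwo(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) c[k++] = a[i++];
        while (j < b.length) c[k++] = b[j++];
        return c;
    }

    // Halves the number of rows on every pass until one row is left.
    static int[] mergeArrays(int[][] arr) {
        if (arr.length == 0) return new int[0];
        int[][] cur = arr;
        while (cur.length > 1) {
            // (length + 1) / 2 rounds up, leaving room for an unpaired row.
            int[][] next = new int[(cur.length + 1) / 2][];
            for (int i = 0; i < cur.length; i += 2) {
                next[i / 2] = (i + 1 < cur.length)
                        ? mergeTwo(cur[i], cur[i + 1])
                        : cur[i]; // odd row out: carried forward unchanged
            }
            cur = next;
        }
        return cur[0];
    }
}
```

Because the output index is i / 2 and the unpaired row is simply carried into the next pass, no separate odd/even flag is needed.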
0

As you say, you are merging two arrays at a time. Since that is inefficient, you can instead merge all of the subarrays at the same time. What you have to do is find the minimum among the current front elements of every subarray and remember which subarray it came from.


To do that, we can use another array (say curPos) to remember the current position in each subarray.

 private int[] merge(int[][] arr) 
{
    int K = arr.length;
    int N = arr[0].length;

    /** array to keep track of non considered positions in subarrays **/
    int[] curPos = new int[K];

    /** final merged array **/
    int[] mergedArray = new int[K * N];
    int p = 0;

    while (p < K * N)
    {
        int min = Integer.MAX_VALUE;
        int minPos = -1;
        /** search for least element **/
        for (int i = 0; i < K; i++)
        {
            if (curPos[i] < N)
            {
                if (minPos == -1 || arr[i][curPos[i]] < min) // also accepts Integer.MAX_VALUE as a value
                {
                    min = arr[i][curPos[i]];
                    minPos = i;
                }
            }                
        }
        curPos[minPos]++;            
        mergedArray[p++] = min;
    }
    return mergedArray;
}

2 Comments

For a non-external sort, there is little benefit from doing more than a 2 way merge. The number of operations is about the same. For example, in the typical inner loop, 2 way merge does 1 compare and 1 move for each element moved, while 4 way merge does 3 compares and 1 move for each element moved, but only does 1/2 the number of moves, so 4 way uses 1/2 times the number of moves, but 3/2 times the number of compares, about the same number of operations. If using a heap for a k-way merge, the overhead of the heap slows it down.
Your technique of searching for the smallest element is very inefficient. You're doing a linear search for each element. The result is that your algorithm is O(k * n * k), whereas the OP's algorithm is O(k * n * log(k)). If you want to merge all of the lists at the same time, then you need to use a more efficient priority queue. But as @rcgldr pointed out in his comment, the heap overhead slows things down. For internal merges, the only time the heap-based k-way merge outperforms pairwise merging is when some lists are much (orders of magnitude) longer than others.
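
For reference, the heap-based k-way merge mentioned in these comments might look something like the sketch below. It is not code from either commenter: the class name HeapMerge and the {value, row, position} entry layout are assumptions made for illustration, and it uses java.util.PriorityQueue as the min-heap.

```java
import java.util.PriorityQueue;

final class HeapMerge {
    static int[] mergeArrays(int[][] arr) {
        int total = 0;
        for (int[] row : arr) total += row.length;

        // Each heap entry is {value, row index, position in row}, ordered by value.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        for (int r = 0; r < arr.length; r++) {
            if (arr[r].length > 0) heap.add(new int[] {arr[r][0], r, 0});
        }

        int[] out = new int[total];
        int p = 0;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();          // smallest remaining element
            out[p++] = top[0];
            int r = top[1], next = top[2] + 1;
            if (next < arr[r].length) {       // refill from the same row
                heap.add(new int[] {arr[r][next], r, next});
            }
        }
        return out;
    }
}
```

The heap never holds more than k entries, so each of the k * n elements costs O(log k) to extract and replace. The per-element allocation and comparator call are part of the heap overhead the comments refer to.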
0

Probably the easiest way to handle this is to use a queue of arrays. Initially, add all the arrays to the queue. Then, remove the first two arrays from the queue, merge them, and add the resulting array to the queue. Continue doing that until there is only one array in the queue. Something like:

for each array in list of arrays
    queue.push(array)

while queue.length > 1
    a1 = queue.pop()
    a2 = queue.pop()
    a3 = merge(a1, a2)
    queue.push(a3)

result = queue.pop()

That simplifies things quite a bit, and the problem of "halving" goes away.
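
In Java, the pseudocode above could be sketched along these lines. The class name QueueMerge is hypothetical, and the choice of java.util.ArrayDeque as the queue is an assumption. One thing to watch: on a Java Deque, push and pop act on the same end (a stack), which would turn this back into merging the running result with one array at a time, so the sketch uses addLast and pollFirst to get first-in, first-out behavior.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class QueueMerge {
    // Standard two-way merge of two sorted arrays.
    static int[] mergeTwo(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) c[k++] = a[i++];
        while (j < b.length) c[k++] = b[j++];
        return c;
    }

    static int[] mergeArrays(int[][] arr) {
        if (arr.length == 0) return new int[0];
        Deque<int[]> queue = new ArrayDeque<>();
        for (int[] row : arr) queue.addLast(row);

        // Take two arrays from the front, put their merge at the back.
        while (queue.size() > 1) {
            int[] a1 = queue.pollFirst();
            int[] a2 = queue.pollFirst();
            queue.addLast(mergeTwo(a1, a2));
        }
        return queue.pollFirst();
    }
}
```

With the sample input, the first two rows are merged first, and the third row is then merged with that result, so an odd number of arrays needs no special case.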
