Merge K sorted arrays of size n using merge algorithm from mergeSort

Question

Problem: Given K sorted arrays of size N each, merge them and print the sorted output.

Sample Input-1:

K = 3, N =  4

arr[][] = { {1, 3, 5, 7},

            {2, 4, 6, 8},

            {0, 9, 10, 11}} ;


Sample Output-1: 

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

I know there is a way to do this problem using a priority queue/min heap, but I want to do it using the merge procedure from mergeSort. The idea seems straightforward enough...at each iteration, merge the remaining arrays in groups of two, such that the number of arrays gets halved at each iteration.

However, whenever halving leads to an odd number, this becomes problematic. My idea is that whenever halving leads to an odd number, we take care of the extra array by merging it with the array formed from the last merge.

The code I have so far is below. This only works on one out of 30 test cases, however:

static int[] mergeArrays(int[][] arr) {
        int k = arr.length;
        int n = arr[0].length;
        if(k < 2){
            return arr[0];
        }

        boolean odd_k;
        if(k%2){
            odd_k = false;
        }
        else{
            odd_k = true;
        }

        while(k > 1){
            int o;
            if(odd_k){
                o = (k/2) + 1;
            }
            else{
                o = k/2;
            }
            int[][] out = new int[o][];

            for(int i=0; i < k; i = i + 2){
                int[] a;
                int[] b;
                if(odd_k && i == (k-1)){
                    b = arr[i];
                    b = out[i-1];
                }
                else{
                    a = arr[i];
                    b = arr[i+1];
                }
                out[i] = mergeTwo(a, b);
            }
            k = k/2;
            if(k % 2 == 0){
                odd_k = false;
            }
            else{
                odd_k = true;
            }

            arr = out;
        }
        return arr[0];

    }

    static int[] mergeTwo(int[] a, int[] b){
        int[] c = new int[a.length + b.length];
        int i, j, k;
        i = j = k = 0;
       while(i < a.length && j < b.length){
           if(a[i] < b[j]){
               c[k] = a[i];
               i++;
               k++;
           }
           else{
               c[k] = b[j];
               j++; k++;
            }
       }
       if(i < a.length){
           while(i < a.length){
               c[k] = a[i];
               i++; k++;
           }
       }
       if(j < b.length){
           while(j < b.length){
               c[k] = b[j];
               j++; k++;
           }
       }
       return c;
    }

if(k%2) and b = out[i-1] are probably what you did wrong.
– Nelfeal, Commented Dec 23, 2018 at 7:28
You can merge more than two arrays at a time if you want. Of course, it will require an index into each array and a loop to find the next element to put into the merged result.
– Anonymous, Commented Dec 23, 2018 at 7:36
@OleV.V. - For a non-external sort, there is little benefit from doing more than a 2 way merge. See my comment on Tharaka Ratnayake's answer.
– rcgldr, Commented Dec 23, 2018 at 20:52

Elliott Frisch · Accepted Answer · 2018-12-23 07:33:48Z

1

We can shorten your mergeTwo implementation,

static int[] mergeTwo(int[] a, int[] b) {
    int[] c = new int[a.length + b.length];
    int i = 0, j = 0, k = 0; // declare and initialize on one line
    while (i < a.length && j < b.length) {
        if (a[i] <= b[j]) {
            c[k++] = a[i++]; // increment and assign
        } else {
            c[k++] = b[j++]; // increment and assign
        }
    }
    // No need for extra if(s)
    while (i < a.length) {
        c[k++] = a[i++];
    }
    while (j < b.length) {
        c[k++] = b[j++];
    }
    return c;
}

And we can then fix your mergeArrays and shorten it by starting with the first row from the int[][] and then using mergeTwo to concatenate the arrays iteratively. Like,

static int[] mergeArrays(int[][] arr) {
    int[] t = arr[0];
    for (int i = 1; i < arr.length; i++) {
        t = mergeTwo(t, arr[i]);
    }
    return t;
}

I then tested it with

// requires: import java.util.Arrays;
public static void main(String[] args) {
    int[][] arr = { { 1, 3, 5, 7 }, { 2, 4, 6, 8 }, { 0, 9, 10, 11 } };
    System.out.println(Arrays.toString(mergeArrays(arr)));
}

And I get (as expected)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]


2 Comments

ohbrobig Over a year ago

That was my initial implementation, but if you look at it closely, it is not efficient. It merges the output of each merge with the next array, so it would be O(k^2 n). I am trying to merge in groups of two, so that it's O(kn log k).

rcgldr Over a year ago

@ohbrobig - despite the better time complexity, ignoring loop overhead, either approach is about the same number of compares and moves. For your suggested approach, for each pass, you could create a matrix[(arr.length+1)/2][arr[0].length*2], merge even and odd rows (or copy a row if there is no row to merge it with (the odd number of rows case)) into that matrix, then repeat using that matrix as input for the next pass.
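
The pass-by-pass scheme described in the comment above could be sketched roughly as follows. This is only an illustration, not code from the thread: the class name PairwiseMerge is made up, and it uses jagged arrays (each row sized to its merged length) instead of the fixed-width matrix the comment suggests. An unpaired last row is simply carried into the next pass, and the output index is i / 2 rather than i.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this sketch.
class PairwiseMerge {
    // Standard two-way merge of two sorted arrays.
    static int[] mergeTwo(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) c[k++] = a[i++];
        while (j < b.length) c[k++] = b[j++];
        return c;
    }

    // Each pass merges rows in pairs, so the row count is halved (rounded up).
    static int[] mergeArrays(int[][] arr) {
        if (arr.length == 0) return new int[0];
        int[][] cur = arr;
        while (cur.length > 1) {
            int[][] next = new int[(cur.length + 1) / 2][];
            for (int i = 0; i < cur.length; i += 2) {
                if (i + 1 < cur.length) {
                    next[i / 2] = mergeTwo(cur[i], cur[i + 1]);
                } else {
                    next[i / 2] = cur[i]; // odd row out: carry it to the next pass
                }
            }
            cur = next;
        }
        return cur[0];
    }

    public static void main(String[] args) {
        int[][] arr = { { 1, 3, 5, 7 }, { 2, 4, 6, 8 }, { 0, 9, 10, 11 } };
        System.out.println(Arrays.toString(mergeArrays(arr)));
    }
}
```

With K = 3, the first pass merges rows 0 and 1 and carries row 2 over unchanged; the second pass merges the two remaining rows.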

Tharaka Ratnayake · Accepted Answer · 2018-12-23 07:34:39Z

0

As you say, you are merging two arrays at a time. Instead, you can merge all the subarrays at the same time. At each step, find the minimum among the current front elements of the subarrays and remember which subarray it came from.

To do that, we can use another array (say curPos) to remember the current position in each subarray:

private int[] merge(int[][] arr)
{
    int K = arr.length;
    int N = arr[0].length;

    /** next unconsumed position in each subarray **/
    int[] curPos = new int[K];

    /** final merged array **/
    int[] mergedArray = new int[K * N];
    int p = 0;

    while (p < K * N)
    {
        int min = Integer.MAX_VALUE;
        int minPos = -1;
        /** linear search for the least front element **/
        for (int i = 0; i < K; i++)
        {
            if (curPos[i] < N)
            {
                /** minPos == -1 keeps this correct if a value equals Integer.MAX_VALUE **/
                if (minPos == -1 || arr[i][curPos[i]] < min)
                {
                    min = arr[i][curPos[i]];
                    minPos = i;
                }
            }
        }
        curPos[minPos]++;
        mergedArray[p++] = min;
    }
    return mergedArray;
}


2 Comments

rcgldr Over a year ago

For a non-external sort, there is little benefit from doing more than a 2 way merge. The number of operations is about the same. For example, in the typical inner loop, 2 way merge does 1 compare and 1 move for each element moved, while 4 way merge does 3 compares and 1 move for each element moved, but only does 1/2 the number of moves, so 4 way uses 1/2 times the number of moves, but 3/2 times the number of compares, about the same number of operations. If using a heap for a k-way merge, the overhead of the heap slows it down.

Jim Mischel Over a year ago

Your technique of searching for the smallest element is very inefficient. You're doing a linear search for each element. The result is that your algorithm is O(k * n * k), whereas the OP's algorithm is O(k * n * log(k)). If you want to merge all of the lists at the same time, then you need to use a more efficient priority queue. But as @rcgldr pointed out in his comment, the heap overhead slows things down. For internal merges, the only time the heap-based k-way merge outperforms pairwise merging is when some lists are much (orders of magnitude) longer than others.
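
For comparison, the heap-based k-way merge mentioned in these comments might look something like the sketch below. It is an illustration only: the class name HeapMerge and the {value, row, position} entry layout are assumptions, not code from the thread. It uses java.util.PriorityQueue, so each output element costs O(log k) instead of the O(k) linear scan.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical class name, used only for this sketch.
class HeapMerge {
    static int[] mergeArrays(int[][] arr) {
        int total = 0;
        for (int[] row : arr) total += row.length;

        // Each heap entry is {value, row index, position within that row},
        // ordered by value.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        for (int r = 0; r < arr.length; r++) {
            if (arr[r].length > 0) {
                heap.add(new int[] { arr[r][0], r, 0 });
            }
        }

        int[] out = new int[total];
        int p = 0;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();          // smallest front element
            out[p++] = top[0];
            int r = top[1];
            int next = top[2] + 1;
            if (next < arr[r].length) {       // refill from the same row
                heap.add(new int[] { arr[r][next], r, next });
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] arr = { { 1, 3, 5, 7 }, { 2, 4, 6, 8 }, { 0, 9, 10, 11 } };
        System.out.println(Arrays.toString(mergeArrays(arr)));
    }
}
```

Unlike the curPos version, this sketch does not require the rows to have equal length. As the comments note, the heap bookkeeping (here, one small array allocated per element) can make it slower in practice than pairwise merging.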

Jim Mischel · Accepted Answer · 2018-12-24 17:36:39Z

0

Probably the easiest way to handle this is to use a queue of arrays. Initially, add all the arrays to the queue. Then, remove the first two arrays from the queue, merge them, and add the resulting array to the queue. Continue doing that until there is only one array in the queue. Something like:

for each array in list of arrays
    queue.push(array)

while queue.length > 1
    a1 = queue.pop()
    a2 = queue.pop()
    a3 = merge(a1, a2)
    queue.push(a3)

result = queue.pop()

That simplifies things quite a bit, and the problem of "halving" goes away.
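
In Java, the pseudocode above could be written along these lines. This is a minimal sketch: the class name QueueMerge is made up, java.util.ArrayDeque is one possible queue choice, and mergeTwo is the usual two-way merge. Arrays are removed from the front and merged results are added at the back, so the short arrays are merged with each other before the long ones.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical class name, used only for this sketch.
class QueueMerge {
    // Standard two-way merge of two sorted arrays.
    static int[] mergeTwo(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) c[k++] = a[i++];
        while (j < b.length) c[k++] = b[j++];
        return c;
    }

    static int[] mergeArrays(int[][] arr) {
        if (arr.length == 0) return new int[0];
        // Start with every input array in the queue (rows must be non-null).
        Deque<int[]> queue = new ArrayDeque<>(Arrays.asList(arr));
        while (queue.size() > 1) {
            int[] a1 = queue.pollFirst();     // take two arrays from the front
            int[] a2 = queue.pollFirst();
            queue.addLast(mergeTwo(a1, a2));  // put the merged array at the back
        }
        return queue.pollFirst();
    }

    public static void main(String[] args) {
        int[][] arr = { { 1, 3, 5, 7 }, { 2, 4, 6, 8 }, { 0, 9, 10, 11 } };
        System.out.println(Arrays.toString(mergeArrays(arr)));
    }
}
```

With an odd number of arrays, no special case is needed: the leftover array just waits at the front of the queue until a merged array is available to pair with it.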

