2

Suppose I have the following setup:

int[] vectorUsedForSorting = new int[] { 1,0,2,6,3,4,5 };
int[] vectorToBeSorted = new int[] {1,2,3,4,5,6,7};

What is the most efficient/fast way to sort vectorToBeSorted by using vectorUsedForSorting? For example, I would want vectorToBeSorted[0] to become vectorToBeSorted[1], since the first element of vectorUsedForSorting is 1 (i.e., vectorToBeSorted[0] should become vectorToBeSorted[vectorUsedForSorting[0]], etc.).

I am aiming for vectorToBeSorted to be [2,1,3,5,6,7,4] after the sorting algorithm is complete.

I am hoping to achieve something very fast. Note that computational complexity should be the main focus, since I will be sorting arrays of size 1,000,000 and more.

I am aiming for sub-linear time complexity if this is possible.

10
  • Is vectorUsedForSorting already a listing of the future array positions as in your example? Or can there be arbitrary numbers in there like { 0, 3, 4, 7}? Commented Feb 6, 2014 at 16:40
  • My gut feeling is that you won't be able to make something very fast, since you can't rely on the comparison operators as you are not doing a classic sort... The way I see it, you could create an object that contains the value plus the final index to be assigned to it. This means that the object that has the value 0 would have the index 1. Commented Feb 6, 2014 at 16:40
  • @Johnride Hmm could you please refactor the sentence The way I see you could do this is create an object that contains the final index to be assigned to this object plus the value. , can't parse it. Commented Feb 6, 2014 at 16:42
  • Oh I understand better with your edit. A simple for loop would do the trick then. Commented Feb 6, 2014 at 16:42
  • 1
    @user2763361 Well, to get sub-linear my first idea could work. I'll try to make it more clear here: your vectorUsedForSorting and vectorToBeSorted would have their values paired in a simple object (let's say sortableObj) that contains the index field and the value field. That would create an array of sortableObj looking like this: [{index : 1, value : 1}, {index : 0, value : 2}, {index : 2, value : 3}...] where each pair of curly braces represents an object. After that you can perform a quickSort on the index value. Commented Feb 6, 2014 at 16:50

4 Answers 4

5

There are two ways to attack this. The first one is to copy a fast sort algorithm and change the access and swap-values parts with something that can handle the indirection you have:

int valueAt(int index) { return vectorUsedForSorting[index]; }
void swap(int i1, int i2) {
    int tmp = vectorUsedForSorting[i1];
    vectorUsedForSorting[i1] = vectorUsedForSorting[i2];
    vectorUsedForSorting[i2] = tmp;

    tmp = vectorToBeSorted[i1];
    vectorToBeSorted[i1] = vectorToBeSorted[i2];
    vectorToBeSorted[i2] = tmp;
}

The second approach is to copy the values into a new object:

public class Item {
    int index;
    int value;
}

Create an array of those and populate it with Items created with the values from both arrays. You can then create a Comparator<Item> which compares them by index.

When you have this, you can sort the array with Arrays.sort(items, comparator).

If that's not fast enough, then you can create N threads and have each thread sort 1/N-th of the original array. When that's done, you use the merge step from merge sort to join the results.
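As a minimal sketch of the second approach (the class name PairSortSketch, the method sortByKeys, and the Item constructor are hypothetical names chosen for illustration, and Java 8 or later is assumed for Comparator.comparingInt), the pairing and sorting could look like this:

```java
import java.util.Arrays;
import java.util.Comparator;

public class PairSortSketch {
    // Hypothetical pair type, mirroring the Item class above.
    static final class Item {
        final int index;
        final int value;

        Item(int index, int value) {
            this.index = index;
            this.value = value;
        }
    }

    // Returns the values reordered by ascending key; both inputs are left untouched.
    static int[] sortByKeys(int[] keys, int[] values) {
        Item[] items = new Item[values.length];
        for (int i = 0; i < items.length; i++) {
            items[i] = new Item(keys[i], values[i]);
        }
        // O(n log n) comparison sort on the index field.
        Arrays.sort(items, Comparator.comparingInt((Item it) -> it.index));
        int[] result = new int[items.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = items[i].value;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] keys = { 1, 0, 2, 6, 3, 4, 5 };
        int[] values = { 1, 2, 3, 4, 5, 6, 7 };
        // prints [2, 1, 3, 5, 6, 7, 4]
        System.out.println(Arrays.toString(sortByKeys(keys, values)));
    }
}
```

On Java 8 and later, Arrays.parallelSort(items, comparator) accepts the same arguments and may be worth trying before hand-rolling the threaded merge described above. Whether it is actually faster for a given input is not measured here.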


8 Comments

Using Arrays.sort won't be linear though. Since OP already knows the final positions of the elements, it's faster to just make one pass through and put them in their final location (in a new array). That will be linear.
@JoshuaTaylor: OP doesn't know the final positions of the elements as per the answer of my question. My feeling is that most answers so far (including the accepted one) are wrong.
@AaronDigulla I'm not sure I follow. It appears that the elements of vectorUsedForSorting (i.e., { 1,0,2,6,3,4,5 }) are integers from 0 to n-1 (where there are n elements). If those are used as the sorting keys, then the element whose index is i is going to end up at position i in the sorted array. If vectorUsedForSorting were something else (e.g., { 0, 3, 4, 7}, as @Sirko asked in a comment), this would no longer be the case. However, OP explicitly mentions that “vectorToBeSorted[0] should become vectorToBeSorted[vectorUsedForSorting[0]], etc.” If the elements of…
vectorUsedForSorting can't be used as indices, then I entirely agree with using Arrays.sort and a comparator. If they can be used as indices, though, then the simple solution is linear, and Arrays.sort is unnecessary.
@JoshuaTaylor: I asked the same thing and got the opposite answer. My current impression is that OP doesn't really know what the problem is.
3

When performance is an issue, and the arrays are large, you at least have to consider a parallel implementation (especially since this problem is embarrassingly parallel: it's not much effort and should yield a nice, near-linear speedup with an increasing number of cores):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ArrayReordering
{
    public static void main(String[] args)
    {
        basicTest();
        performanceTest();
    }

    private static void basicTest()
    {
        int[] vectorUsedForSorting = new int[] { 1,0,2,6,3,4,5 };
        int[] vectorToBeSorted = new int[] {1,2,3,4,5,6,7};      
        int[] sortedVectorLinear = new int[vectorToBeSorted.length];
        int[] sortedVectorParallel = new int[vectorToBeSorted.length];

        sortLinear(vectorUsedForSorting, vectorToBeSorted, sortedVectorLinear);
        sortParallel(vectorUsedForSorting, vectorToBeSorted, sortedVectorParallel);

        System.out.println("Result Linear   "+Arrays.toString(sortedVectorLinear));
        System.out.println("Result Parallel "+Arrays.toString(sortedVectorParallel));
    }

    private static void performanceTest()
    {
        for (int n=1000000; n<=50000000; n*=2)
        {
            System.out.println("Run with "+n+" elements");

            System.out.println("Creating input data");
            int vectorUsedForSorting[] = createVectorUsedForSorting(n);
            int vectorToBeSorted[] = new int[n];
            for (int i=0; i<n; i++)
            {
                vectorToBeSorted[i] = i;
            }
            int[] sortedVectorLinear = new int[vectorToBeSorted.length];
            int[] sortedVectorParallel = new int[vectorToBeSorted.length];

            long before = 0;
            long after = 0;

            System.out.println("Running linear");
            before = System.nanoTime();
            sortLinear(vectorUsedForSorting, vectorToBeSorted, sortedVectorLinear);
            after = System.nanoTime();
            System.out.println("Duration linear   "+(after-before)/1e6+" ms");

            System.out.println("Running parallel");
            before = System.nanoTime();
            sortParallel(vectorUsedForSorting, vectorToBeSorted, sortedVectorParallel);
            after = System.nanoTime();
            System.out.println("Duration parallel "+(after-before)/1e6+" ms");

            //System.out.println("Result Linear   "+Arrays.toString(sortedVectorLinear));
            //System.out.println("Result Parallel "+Arrays.toString(sortedVectorParallel));
            System.out.println("Passed linear?   "+
                Arrays.equals(vectorUsedForSorting, sortedVectorLinear));
            System.out.println("Passed parallel? "+
                Arrays.equals(vectorUsedForSorting, sortedVectorParallel));
        }
    }

    private static int[] createVectorUsedForSorting(int n)
    {
        // Not very elegant, just for a quick test...
        List<Integer> indices = new ArrayList<Integer>();
        for (int i=0; i<n; i++)
        {
            indices.add(i);
        }
        Collections.shuffle(indices);
        int vectorUsedForSorting[] = new int[n];
        for (int i=0; i<n; i++)
        {
            vectorUsedForSorting[i] = indices.get(i);
        }
        return vectorUsedForSorting;
    }

    private static void sortLinear(
        int vectorUsedForSorting[], int vectorToBeSorted[], 
        int sortedVector[])
    {
        sortLinear(vectorUsedForSorting, vectorToBeSorted, 
            sortedVector, 0, vectorToBeSorted.length);
    }

    static void sortParallel(
        final int vectorUsedForSorting[], final int vectorToBeSorted[], 
        final int sortedVector[])
    {
        int numProcessors = Runtime.getRuntime().availableProcessors();
        int chunkSize = (int)Math.ceil((double)vectorToBeSorted.length / numProcessors);
        List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
        ExecutorService executor = Executors.newFixedThreadPool(numProcessors);
        for (int i=0; i<numProcessors; i++)
        {
            final int min = i * chunkSize;
            final int max = Math.min(vectorToBeSorted.length, min + chunkSize);
            Runnable task = new Runnable()
            {
                @Override
                public void run()
                {
                    sortLinear(vectorUsedForSorting, vectorToBeSorted, 
                        sortedVector, min, max);
                }
            };
            tasks.add(Executors.callable(task));
        }
        try
        {
            executor.invokeAll(tasks);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        executor.shutdown();
    }

    private static void sortLinear(
        int vectorUsedForSorting[], int vectorToBeSorted[], 
        int sortedVector[], int min, int max)
    {
        for (int i = min; i < max; i++)
        {
            sortedVector[i] = vectorToBeSorted[vectorUsedForSorting[i]];
        }          
    }

}
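For comparison, the same chunked gather can be sketched with the JDK's own Arrays.parallelSetAll, assuming Java 8 or later. The class and method names below are hypothetical, and no timing claim is made: the loop is dominated by memory access, so the speedup over the single-threaded version should be measured rather than assumed.

```java
import java.util.Arrays;

public class ParallelGatherSketch {
    // Fills position i of a new array with values[order[i]].
    // Arrays.parallelSetAll splits the index range across the common fork-join pool.
    static int[] gatherParallel(int[] order, int[] values) {
        int[] result = new int[order.length];
        Arrays.parallelSetAll(result, i -> values[order[i]]);
        return result;
    }

    public static void main(String[] args) {
        int[] order = { 1, 0, 2, 6, 3, 4, 5 };
        int[] values = { 1, 2, 3, 4, 5, 6, 7 };
        // prints [2, 1, 3, 7, 4, 5, 6], the same result as sortLinear above
        System.out.println(Arrays.toString(gatherParallel(order, values)));
    }
}
```

Each index writes only its own slot in result and only reads from the two inputs, which is why no synchronization is needed.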


3

You could create a new array, perform the sorting on that array and set vectorToBeSorted to be the new array.

int size = vectorToBeSorted.length;
int[] array = new int[size];
for (int i = 0; i < size; ++i)
    array[vectorUsedForSorting[i]] = vectorToBeSorted[i];
vectorToBeSorted = array;

EDIT

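If you wanted to be able to sort in place, you would need to loop through, swapping the appropriate values until each position holds its final element. Note that this also rewrites vectorUsedForSorting into the identity permutation.

int size = vectorToBeSorted.length;
for (int i = 0; i < size; ++i) {
    // A single swap per position is not enough for longer cycles such as
    // { 1, 2, 3, 0 }, so keep swapping until position i is settled.
    // Every swap settles one position, so the total work stays linear.
    while (vectorUsedForSorting[i] != i) {
        int index = vectorUsedForSorting[i];
        int value = vectorToBeSorted[index];

        vectorUsedForSorting[i] = vectorUsedForSorting[index];
        vectorToBeSorted[index] = vectorToBeSorted[i];

        vectorUsedForSorting[index] = index;
        vectorToBeSorted[i] = value;
    }
}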

If you are able to create a pair structure that compares on indexes, you could use a sort; however, sorts are O(n log n) and therefore slower than the linear solution for large arrays.

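Note that the following two statements are equivalent only when vectorUsedForSorting is its own inverse permutation (for example { 1, 0, 2 }). In general, the first moves the element at i to position vectorUsedForSorting[i], while the second fills position i from position vectorUsedForSorting[i]. For the arrays in the question, only the first yields [2,1,3,5,6,7,4].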

array[vectorUsedForSorting[i]] = vectorToBeSorted[i];
array[i] = vectorToBeSorted[vectorUsedForSorting[i]];
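To check how these two statements relate, the sketch below runs both on the arrays from the question. The names scatter and gather are labels chosen here for illustration and are not from the original answer.

```java
import java.util.Arrays;

public class ScatterGatherSketch {
    // Scatter: the element at i moves to position order[i].
    static int[] scatter(int[] order, int[] values) {
        int[] result = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            result[order[i]] = values[i];
        }
        return result;
    }

    // Gather: position i receives the element at order[i].
    static int[] gather(int[] order, int[] values) {
        int[] result = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = values[order[i]];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] order = { 1, 0, 2, 6, 3, 4, 5 };
        int[] values = { 1, 2, 3, 4, 5, 6, 7 };
        // prints [2, 1, 3, 5, 6, 7, 4]
        System.out.println(Arrays.toString(scatter(order, values)));
        // prints [2, 1, 3, 7, 4, 5, 6]
        System.out.println(Arrays.toString(gather(order, values)));
    }
}
```

The two results agree only when the index array is its own inverse, as with { 1, 0, 2 }. Both variants are a single pass over the data, so either one is linear.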


2

How about:

int size = vectorUsedForSorting.length;
int [] sortedVector = new int[size];
for (int i = 0; i < size; ++i)
{
    sortedVector[i] = vectorToBeSorted[vectorUsedForSorting[i]];
}  

Or does it have to be in place sorting?

5 Comments

What do you mean by in place sorting?
I meant without creating a third array.
The solution of hsun324 seems pretty good. But if memory does not matter, the solution with a 3rd array will be faster.
OP said that computational complexity was the important thing. The memory allocation will take a bit of time, sure, but the runtime here is just one pass through the array. It's linear.
So you would agree with my solution?
