4

My problem is the following: I need to shuffle an array and then get just the first N elements.

I am currently shuffling the whole array, that has 50+ elements, but this gives me performance problems, since the shuffling routine is called 1e+9 times.

I am currently implementing the Fisher-Yates algorithm to shuffle:

public static void shuffle(int[] array) {
    Random gen = new Random();
    for (int i = array.length - 1; i > 0; i--) {
        int index = gen.nextInt(i + 1);
        int a = array[index];
        array[index] = array[i];
        array[i] = a;
    }
}

Then I select just the first N elements.

I have also tried using Reservoir sampling, but it just saved me 1 second. That's not enough, since my program runs for 30 secs. Also, I might have implemented it incorrectly, because I am not getting the same results when compared to the Fisher-Yates algorithm. This is my implementation:

public static int[] shuffle(int[] array, int N) {
    int[] ret = new int[N];
    for (int i = 0; i < N; i++) {
        ret[i] = array[i];
    }
    Random gen = new Random();
    int j;
    for (int i = N; i < array.length; i++) {
        j = gen.nextInt(i+1);
        if (j <= N - 1)
            ret[j] = array[i];
    }
    return ret;
}

To conclude, what I need is a a shuffling algorithm that would pick N random elements using a search of length N, instead 50+. If not possible, something better then Fisher-Yates and Reservoir sampling.

Note-1: Altering the original "int[] array" is not a problem.

Note-2: N is usually around 10.

4
  • Possible duplicate of stackoverflow.com/questions/1519736/… Commented Jan 30, 2014 at 15:39
  • I don't think your question is actually about shuffling at all - it looks like what you want is a fast method to pick N items at random from a set of M items where N < M. Shuffling and picking the first N is just one way to accomplish this. Commented Jan 30, 2014 at 16:01
  • @Jayasagar I think the question is not a duplicate, since my point is not exactly shuffling. Instead, as Jackson pointed out, what I need is to pick N random elements from an array of length M > N. Commented Jan 31, 2014 at 0:30
  • see here for a strictly O(N) (both in time and space) algorithm Commented Aug 26, 2015 at 14:19

3 Answers 3

3

A simple way to get N shuffled elements from the array is as follows:

  1. Pick a random element r.
  2. Add element r to the output.
  3. Move the last element of the array to the position of r and shrink the array size by 1.
  4. Repeat N times.

In code:

public static int[] shuffle(int[] array, int N) {
    int[] result = new int[N];
    int length = array.length;

    Random gen = new Random();

    for (int i = 0; i < N; i++) {
        int r = gen.nextInt(length);

        result[i] = array[r];

        array[r] = array[length-1];
        length--;
    }

    return result;
}

This algorithm has the advantage over FY that it only computes the first N elements of the shuffled array, rather than shuffling the whole array.

Your optimized algorithm is not optimal for two reasons:

  1. The first N elements are never shuffled. For instance, element 0 can never appear at position 1 in the shuffled array.
  2. You're still doing a lot of work. If N=10 and the total array length is 1000000, you're still computing about 1000000 random values, while you only need 10.
Sign up to request clarification or add additional context in comments.

5 Comments

While I agree this approach would work, I think a good answer to this question should present an explanation of why this is a better approach that the ones suggested by the OP.
@Duncan I agree. I had some hard time understanding the OP's optimized algorithm. I verified it's correct (except for copying the first N elements without shuffling), but it's not efficient.
Why are you reinventing the wheel?
@hd1 The question is not to shuffle the array, but to get a random sublist of length N. Collections.shuffle does not do that.
@Heuster Your solution worked perfectly. The idea of placing the element at the end of the array is really good. @hd1 Reinforcing what Heuster said, this is not reinventing the wheel since Collections.shuffle does not work, and neither do the usual shuffle methods.
1

You can modify FY by only doing N iterations, and then taking the last N elements of the array. Alternatively, build the shuffled area at the start of the array, rather than the end, and take the first N elements. However, that will only save iterations of the actual shuffle, and the gain there will be a factor of N/50. That may not be enough.

Depending on the situation, you may be able to generate e.g. a million shuffled decks, and then pick one of them at random for each of the 1e9 outer iterations. That would do one random number generation per outer iteration, plus one thousandth of the current shuffle calls.

1 Comment

Your solutions seems nice. Although I have no tried neither of them, I think they would work. I have not chosen your answer because the one presented by Heuster fitted nicely in my code.
-1

Do try not to reinvent the wheel, when Collection.shuffle exists:

public static int[] shuffle(int[] array, int N) { 
    List<Integer> list = new ArrayList<Integer>();
    for (int r : array) {
       list.add(r);
    }
    Collections.shuffle(list);
    Integer[] ret = list.toArray(); // Integer is int, except autoboxed 
    return ret;
}

1 Comment

if the length of the array >> N, then your solution is time consuming and that's something OP doesn't want.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.