
I want to generate n different numbers between 1 and N (of course n <= N). N could be very large. If n is very small, one efficient way is to generate a number and compare it against the set we have so far, to make sure it's new. That takes O(n^2) time and O(n) memory. If n is quite large, we can use the Fisher–Yates shuffle algorithm to generate a random permutation, stopping after n steps. That takes O(n) time, but it also needs O(N) memory.
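For concreteness, here is a minimal sketch of that partial-shuffle approach (Python, purely illustrative; the pool list is where the O(N) memory goes):

```python
import random

def sample_fisher_yates(n, N):
    """Draw n distinct values from 1..N via a partial Fisher-Yates shuffle."""
    pool = list(range(1, N + 1))        # the O(N) memory cost lives here
    for i in range(n):                  # only n swap steps: O(n) time
        j = random.randint(i, N - 1)    # uniform index into the unshuffled suffix
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:n]                     # the first n entries are the sample
```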

Here is the question: what can we do if we do not know how large n is? I hope for an algorithm that uses only O(n) memory and stops after O(n) time. Is that possible?

  • That's a pretty poor duplicate -- N there is 1000, here it could be "very large". Commented Oct 31, 2013 at 15:14
  • @j_random_hacker That uses O(N) memory (not O(n)). Commented Oct 31, 2013 at 15:14
  • @jrok: Fair enough, close vote retracted. I do note however an O(1)-space solution on that page: stackoverflow.com/a/202225/47984. Commented Oct 31, 2013 at 15:18
  • 1
    @Floris: The way I interpret it is that they want an online algorithm -- i.e. one where it's always possible to cheaply add a new, distinct sample later on. Commented Oct 31, 2013 at 15:28
  • 1
    @j_random_hacker: If you implement the set as a hash table you can get O(1) (at least for the expected time). Commented Oct 31, 2013 at 15:45

1 Answer


You can essentially do the same as for very small n, but make the duplicate check more efficient. The naïve way to check whether you've already generated a number is a linear search through the list of previously generated values. For an unknown n, you can instead keep the set of previously generated values sorted, so that a binary search identifies duplicates. With the naïve approach the algorithm takes O(n^2) time; the smarter search through previous results reduces that to O(n log n).
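A minimal sketch of that idea, assuming Python (bisect supplies the O(log n) duplicate search; note that inserting into a flat list is still O(n) per insert, a point the comments below pick up):

```python
import bisect
import random

def sample_sorted(n, N):
    """Draw n distinct values from 1..N, detecting duplicates by binary search."""
    chosen = []                               # kept sorted at all times
    while len(chosen) < n:
        x = random.randint(1, N)
        i = bisect.bisect_left(chosen, x)     # O(log n) duplicate check
        if i < len(chosen) and chosen[i] == x:
            continue                          # already drawn: redraw
        chosen.insert(i, x)                   # caveat: O(n) list insert
    return chosen
```

A balanced search tree (or, as discussed below, a hash table) would make the insertion step cheap as well.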


Comments

Inserting a value into a sorted array is O(n) time, though. And if n is a large enough fraction of N that duplicates turn up often, the expected number of redraws grows and comes to dominate the running time.
@j_random_hacker: The array doesn't need to be sorted. It could just as easily be a hash table.
@j_random_hacker so don't use an array. A tree can have O(log n) searching and insertion and is easy to keep sorted.
@bames53: I would suggest changing your answer to say that, but I don't think it addresses the main problem I identified -- the blow-up in running time that results when already-selected numbers become sufficiently dense.
@j_random_hacker: You can grow a hash table geometrically (rehashing on expansion), which gives amortized O(1) insertion, just like a vector. Random numbers are ideal hash-table keys, too.
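Putting the comment thread together, here is a sketch of the hash-set variant (assuming Python's built-in set; membership tests and inserts are expected O(1), so this achieves the hoped-for O(n) memory and expected O(n) time as long as n stays well below N -- once n approaches N the redraws dominate, as noted above, and the partial shuffle wins):

```python
import random

def sample_hashed(n, N):
    """Draw n distinct values from 1..N using a hash set to reject duplicates."""
    assert n <= N
    seen = set()
    while len(seen) < n:
        seen.add(random.randint(1, N))   # re-adding a duplicate is a no-op
    return list(seen)
```

In practice Python's standard library already embodies roughly this trade-off: random.sample(range(1, N + 1), n) switches between a selection-set strategy and a partial pool copy depending on the relative sizes of n and N.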
