
I'm trying to select a certain number of random rows. Let's say there are only 4 rows in the table and I need 6 random rows: I want the result to be able to contain duplicate rows, even if there are more than 6 rows in the table.

Is this easily achieved in MySQL?

My current query is like this:

SELECT * FROM winners ORDER BY RAND() LIMIT 6

The idea is a user can win more than once. :)

Hope you can help!


4 Answers

3

Any solution involving ORDER BY RAND() is frowned upon, because it can't use an index and it basically sorts the whole table (which may grow very large) just to pick one row.

The better solutions involve generating a random number between MIN(id) and MAX(id) and selecting the row with that id. As your table gets larger, the advantage over sorting the whole table grows.

Picking a random ID is so much more efficient that I'd recommend just picking six random IDs one at a time, and then looking up those rows one at a time. Because each pick is independent, you have a chance of picking a given row more than once.

If you aren't guaranteed that all your IDs are consecutive, you can pick the first ID that is greater than or equal to the random pick. So in pseudocode:

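# Fetch the ID range once
$MIN, $MAX = SELECT MIN(id), MAX(id) FROM winners
FOR LOOP FROM 1 to 6
    # random integer in [$MIN, $MAX], both ends inclusive
    $R = $MIN + RANDOM(0 .. $MAX-$MIN)
    # ORDER BY id ensures we get the nearest ID at or above $R
    $WINNER[] = SELECT * FROM winners WHERE id >= $R ORDER BY id LIMIT 1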
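
For anyone who wants to try the idea end to end, here is a minimal runnable sketch of the pseudocode above. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the connection handling differs from a real MySQL setup. The function name pick_winners, the name column, and the sample rows are made up for illustration.

```python
import random
import sqlite3


def pick_winners(conn, draws=6, rng=random):
    """Draw `draws` rows from winners; the same row may come back more than once."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM winners").fetchone()
    if lo is None:
        return []  # empty table: nothing to draw from
    picked = []
    for _ in range(draws):
        r = rng.randint(lo, hi)  # inclusive on both ends
        # ORDER BY id makes the lookup return the nearest id at or above r
        row = conn.execute(
            "SELECT id, name FROM winners WHERE id >= ? ORDER BY id LIMIT 1",
            (r,),
        ).fetchone()
        picked.append(row)
    return picked


# Hypothetical sample data with gaps in the ids
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE winners (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO winners (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (5, "cy"), (9, "dee")],
)
winners = pick_winners(conn)
print(winners)
```

With gaps in the ids, rows that follow a gap are picked more often than others, so this trades uniform odds for speed.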

5 Comments

If we have IDs [1,2,3,4,5,1000] then 1000 has a much greater chance of being picked.
If perfect stats are absolutely necessary then internal rowid of the sql server could be used.
@MichLH, MySQL doesn't have an internal rowid, at least not one you can query. One could use another table and fill it with consecutive integers, mapped to ID values in the original table.
So there's no way to say, for example, "pick the 5th row" rather than the row with id 5?
You could use LIMIT 1 OFFSET 4 to choose the fifth row by position, but you'll find that choosing a row by value (aided by an index) is far better for performance.
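
To illustrate the trade-off discussed in these comments, here is a rough sketch of picking by position instead of by id value, which gives every row the same odds even with ids like [1,2,3,4,5,1000]. It uses Python's sqlite3 module in place of MySQL, and pick_by_position is a hypothetical helper name. As noted above, OFFSET has to walk past the skipped rows, so this is likely to be slower on a large table.

```python
import random
import sqlite3
from collections import Counter


def pick_by_position(conn, draws=6, rng=random):
    """Pick rows by random position, so gaps in the ids do not skew the odds."""
    (count,) = conn.execute("SELECT COUNT(*) FROM winners").fetchone()
    if count == 0:
        return []
    picked = []
    for _ in range(draws):
        offset = rng.randrange(count)  # 0 .. count-1
        row = conn.execute(
            "SELECT id FROM winners ORDER BY id LIMIT 1 OFFSET ?", (offset,)
        ).fetchone()
        picked.append(row[0])
    return picked


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE winners (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO winners (id) VALUES (?)", [(i,) for i in (1, 2, 3, 4, 5, 1000)]
)

# Many draws, to see how often each id comes up
rng = random.Random(42)
counts = Counter(pick_by_position(conn, draws=6000, rng=rng))
print(counts)
```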
2
SELECT * FROM winners ORDER BY RAND() LIMIT 1
UNION ALL
SELECT * FROM winners ORDER BY RAND() LIMIT 1
UNION ALL
SELECT * FROM winners ORDER BY RAND() LIMIT 1
UNION ALL
SELECT * FROM winners ORDER BY RAND() LIMIT 1
UNION ALL
SELECT * FROM winners ORDER BY RAND() LIMIT 1
UNION ALL
SELECT * FROM winners ORDER BY RAND() LIMIT 1
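
If the number of draws might change, the repeated query can be generated rather than typed out. The following is only a sketch: it runs against Python's sqlite3 module rather than MySQL, so it uses RANDOM() instead of RAND() and wraps each single-row pick in a subquery, which is how SQLite accepts ORDER BY and LIMIT inside a UNION ALL. The names ONE_DRAW and build_draw_query, and the sample rows, are made up for this example.

```python
import sqlite3

# One independent single-row draw (SQLite syntax; MySQL would use RAND())
ONE_DRAW = "SELECT * FROM (SELECT * FROM winners ORDER BY RANDOM() LIMIT 1)"


def build_draw_query(draws):
    """Chain `draws` independent single-row picks with UNION ALL."""
    return " UNION ALL ".join([ONE_DRAW] * draws)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE winners (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO winners (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy"), (4, "dee")],
)

rows = conn.execute(build_draw_query(6)).fetchall()
print(rows)  # six rows, even though the table only has four
```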

4 Comments

. . the downvotes appear to be random maliciousness.
I blocked one user who got rather irritated with me in chat. I suspect he retaliated that way, but I can't be sure.
@Dave: You can be sure. Go to his profile, and then go to his Votes tab.
I down voted because the question started in chat with more specific requirements that this answer does not fit. I changed my vote to an up-vote when I noticed the question here lacks some details mentioned in chat.
1

Assuming you have at least one row, you can multiply the number of rows and then return randomly from that enlarged set:

SELECT w.*
FROM winners w CROSS JOIN
     (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
      SELECT 5 UNION ALL SELECT 6
     ) nums
ORDER BY RAND()
LIMIT 6;
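
Here is a small runnable sketch of the cross join above, using Python's sqlite3 module in place of MySQL (so RANDOM() replaces RAND()). The four sample rows and the name column are invented for illustration. It shows that six rows come back even when the table has only four.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE winners (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO winners (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy"), (4, "dee")],
)

# Each winners row is repeated once per nums row (6 copies), then the
# expanded set is shuffled and the first six rows are kept.
rows = conn.execute(
    """
    SELECT w.id, w.name
    FROM winners AS w
    CROSS JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3
                UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6) AS nums
    ORDER BY RANDOM()
    LIMIT 6
    """
).fetchall()
print(rows)
```

Note that this samples without replacement from the enlarged set, so the odds are close to, but not exactly the same as, six fully independent draws.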

4 Comments

Can you explain what this is doing? Is it temporarily doubling all the rows in the table and selecting from that instead?
@Dave . . . The cross join is multiplying the number of rows, by six in this case (because there are six rows in the nums table). It then randomly selects from those enlarged results.
Assuming ORDER BY RAND() is O(n log n), this solution is O(6n log 6n), while my solution is O(6(n log n)), right?
@BarMar . . . It is actually hard to compare in terms of performance. This version reads the original table once, whereas yours reads it once per subquery (which could be significant if winners were a complex query). This is sorting six times the original data, whereas yours is sorting the original data six times (these have the same complexity because constant factors are ignored in complexity calculations).
0

This question sounds like it could be the XY Problem. It sounds like you might be asking about a solution to your problem rather than your problem. See: https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem

I think it might be better to turn four rows into six in your application rather than selecting duplicate rows.
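
One way the application-side idea could look, as a rough sketch: read the candidate rows once and sample with replacement in code. This uses Python with its built-in sqlite3 module purely for illustration; draw_in_app, the name column, and the sample rows are hypothetical. It draws from all rows rather than from a pre-selected six, so every row has the same chance in every draw, at the cost of loading the whole table.

```python
import random
import sqlite3


def draw_in_app(conn, draws=6, rng=random):
    """Load every candidate row once, then sample with replacement in the app."""
    rows = conn.execute("SELECT id, name FROM winners").fetchall()
    if not rows:
        return []
    # choices() samples with replacement, so a row can win more than once
    return rng.choices(rows, k=draws)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE winners (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO winners (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy"), (4, "dee")],
)

result = draw_in_app(conn)
print(result)
```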

13 Comments

If only 4 people enter but there have to be 6 wins, then how can I turn four rows into six? It's data duplication in the database, which I am trying to avoid.
I would think it would be better to solve this problem in your application rather than selecting duplicate rows from the database. No? Create duplicates in your application, instead of modifying the results in the select query.
Creating duplicates in the database is the complete opposite of an efficient database structure ;)
@ajb32x: Your suggested solution of duplicating app-side if there are fewer than 6 results won't work. OP said he would like to allow the possibility of dupes even when there are 6 or more records in the DB. I think OP is correct that this is a DB-level problem.
@ajb32x If it only uses the selected 6 rows for the 6 draws, then any other rows in the database would not have an equal chance of being selected in the 5 other draws, right?