2

I have a project through which I'm creating a game powered by a database.

The database has data entered like this:

(ID, Name): (1, PhotoID), (1, PhotoID), (1, PhotoID), (2, PhotoID), (2, PhotoID), and so on. There are thousands of entries.

This is my current SQL statement:

$sql = "SELECT TOP 8 * FROM Image WHERE Hidden = '0' ORDER BY NEWID()";

But this can also return several rows with the same ID, and I need every result to have a unique ID (that is, I need one result from each group).

How can I change my query to grab one result from each group?

Thanks!

1

5 Answers

4

Since ORDER BY NEWID() will result in a table scan anyway, you might use row_number() to isolate the first row in each group:

; with randomizer as (
  select id,
         name,
         row_number() over (partition by id
                            order by newid()) rn
    from Image
   where hidden = 0
)
select top 8
       id,
       name
  from randomizer
 where rn = 1
-- Added by mellamokb's suggestion to allow groups to be randomized
order by newid()

Sql Fiddle playground thanks to mellamokb.
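
To try the same idea without a SQL Server instance, here is a minimal sketch in Python against an in-memory SQLite database. It is a stand-in under stated assumptions, not the T-SQL itself: RANDOM() replaces NEWID(), LIMIT replaces TOP, and the sample rows plus the helper names make_demo_db and pick_one_per_group are made up for illustration. It assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3


def make_demo_db(groups=20, per_group=5):
    """Build a hypothetical Image table: `groups` IDs with `per_group` photos each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Image (id INTEGER, name TEXT, hidden INTEGER)")
    rows = [
        (g, f"photo_{g}_{p}", 0)
        for g in range(1, groups + 1)
        for p in range(per_group)
    ]
    conn.executemany("INSERT INTO Image VALUES (?, ?, ?)", rows)
    return conn


def pick_one_per_group(conn, limit=8):
    """Return up to `limit` (id, name) rows, one random row per id."""
    sql = """
        WITH randomizer AS (
            SELECT id,
                   name,
                   -- number the rows of each id in random order
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY RANDOM()) AS rn
              FROM Image
             WHERE hidden = 0
        )
        SELECT id, name
          FROM randomizer
         WHERE rn = 1          -- keep one row per id
         ORDER BY RANDOM()     -- shuffle the groups themselves
         LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()


conn = make_demo_db()
picked = pick_one_per_group(conn)
for image_id, name in picked:
    print(image_id, name)
```

Each run should print eight rows with eight different IDs; which IDs and which photo per ID will vary from run to run.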


2 Comments

Hmm. Doesn't seem to randomize the overall groups. Wonder if throwing another order by newid() on the final query would fix it? EDIT: Ya that seems to do it: sqlfiddle.com/#!3/657ad/14
@mellamokb I was rather under the impression that the number 8 in top corresponds to the number of groups, because it is a game and I would expect a slot for each image group. I know that this is a figment of my imagination though ;-)
2

Looks like this may work, but I can't vouch for performance:

SELECT TOP 8 ID,
  (select top 1 name from image i2
   where i2.id = i1.id
     and i2.hidden = '0' -- keep hidden rows out of the per-group pick
   order by newid()) AS Name
FROM Image i1
WHERE hidden = '0'
group by ID
ORDER BY NEWID();

Demo: http://www.sqlfiddle.com/#!3/657ad/6


2

If you have an index on the ID column and want to take advantage of the index and avoid a full table scan, do your randomization on the key values first:

WITH IDs AS
(
  SELECT DISTINCT ID
  FROM Image
  WHERE Hidden = '0'
),
SequencedIDs AS
(
  SELECT ID, ROW_NUMBER() OVER (ORDER BY NEWID()) AS Seq
  FROM IDs
),
ImageGroups AS
(
  SELECT i.*, ROW_NUMBER() OVER (PARTITION BY i.ID ORDER BY NEWID()) Seq
  FROM SequencedIDs s
  INNER JOIN Image i
    ON i.ID = s.ID
  WHERE s.Seq <= 8 -- ROW_NUMBER() starts at 1, so <= 8 keeps 8 random groups
  AND i.Hidden = '0'
)
SELECT *
FROM ImageGroups
WHERE Seq = 1

This should drastically reduce the cost over the table scan approach, although I don't have a schema big enough that I can test with - so try running some statistics in SSMS and make sure ID is actually indexed for this to be effective.
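
The "randomize the keys first" idea can be sketched in Python against an in-memory SQLite database. This is an illustration under assumptions, not a benchmark and not the T-SQL above: RANDOM() stands in for NEWID(), LIMIT stands in for the Seq filter, and the sample data, the index name ix_image_id, and the constant KEYS_FIRST_SQL are hypothetical. It says nothing about how SQL Server would plan the original query, and it assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

# Step 1: collect distinct visible ids. Step 2: pick n of them at random.
# Step 3: join back to Image and keep one random visible row per chosen id.
KEYS_FIRST_SQL = """
WITH ids AS (
    SELECT DISTINCT id FROM Image WHERE hidden = 0
),
chosen AS (
    SELECT id FROM ids ORDER BY RANDOM() LIMIT :n
),
image_groups AS (
    SELECT i.id,
           i.name,
           ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY RANDOM()) AS seq
      FROM chosen AS c
      JOIN Image AS i ON i.id = c.id
     WHERE i.hidden = 0
)
SELECT id, name FROM image_groups WHERE seq = 1
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Image (id INTEGER, name TEXT, hidden INTEGER)")
conn.execute("CREATE INDEX ix_image_id ON Image (id)")

# Hypothetical data: 30 ids, 4 photos each; photo 0 of every id is hidden.
rows = [
    (g, f"photo_{g}_{p}", 1 if p == 0 else 0)
    for g in range(1, 31)
    for p in range(4)
]
conn.executemany("INSERT INTO Image VALUES (?, ?, ?)", rows)

result = conn.execute(KEYS_FIRST_SQL, {"n": 8}).fetchall()
for image_id, name in result:
    print(image_id, name)
```

Only the rows belonging to the eight chosen IDs reach the window function, which is the point of selecting keys before touching the full rows.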

1 Comment

Note: in the sqlfiddle sandbox this is significantly cheaper than mellamokb's answer and only slightly more expensive than Nikola's. However, the sample size is extremely small, and I believe this would perform better on a very large number of rows because it does not need to scan them all: it reads only 1 row per initial group, plus all rows for the much smaller set of top N random groups.
1
select * from (select * from photos order by rand()) as _SUB group by _SUB.id;

2 Comments

The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions: sqlfiddle.com/#!3/657ad/9
@mellamokb everything is allowed in the world of mysql, yet i just noticed, sql-server is the tag... my apologies.
0
select top 8 ID, Name
from (select ID, Name,
             -- partition by ID only, so each ID yields exactly one ranker = 1 row
             row_number() over (partition by ID order by newid()) as ranker
      from Image
      where Hidden = 0) Z
where ranker = 1
order by newid()
