
I would like to know the impact on performance if I run this query under the following conditions.

Query:

select   `players`.*, count(`clicks`.`id`) as `clicks_count` 
from     `players` left join `clicks` on `clicks`.`player_id` = `players`.`id`
group by `players`.`id`
order by `clicks_count` desc 
limit    1

Conditions:

  1. The clicks table will receive about 1,000 inserts per minute
  2. The clicks table will contain more than 1,000,000 rows
  3. The players table will contain 10,000 rows
  4. The players table will receive new rows every 5 minutes

I would like to know what to expect performance-wise if I run the query 1000 times in 1 minute.

Thanks

  • Impossible to tell without knowing a lot of things about your server and setup. Why not simply try it out? Commented May 22, 2011 at 20:08
  • @Yonathan, the query as such looks fine; don't worry about performance until you actually hit slowness, then come back and ask a question about it with some details. "Premature optimization is the root of all evil" -- Donald Knuth. Commented May 22, 2011 at 20:12
  • OK, thanks. Same as always: trial and error! Commented May 22, 2011 at 20:17
  • If things get slow, EXPLAIN can sometimes give you clues as to how your query is being executed. Here's a friendly tree-based version: xaprb.com/blog/2007/07/29/introducing-mysql-visual-explain (see the sketch after these comments) Commented May 22, 2011 at 20:19
  • Make sure to use transactions. There is a big difference between 1000 INSERTS per second and 1000 COMMITS per second (good luck!). Also decide which is more important -- inserts or queries. Indexes will speed up queries (if they cover the query correctly) but require more work to maintain. Extra indexes may actually hurt both query performance (if they muck up the plan) and insert performance. Commented May 22, 2011 at 20:32
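
A quick way to see how MySQL actually plans to execute the query, as the EXPLAIN comment above suggests, is simply to prefix it with EXPLAIN (this is plain MySQL, nothing specific to this schema):

explain
select   `players`.*, count(`clicks`.`id`) as `clicks_count`
from     `players` left join `clicks` on `clicks`.`player_id` = `players`.`id`
group by `players`.`id`
order by `clicks_count` desc
limit    1

In the output, a type of ALL with a NULL key means a full table scan on that table, which is the first thing to look for here.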

2 Answers


That query will never run in milliseconds with any meaningful amount of data in your tables. It'll run two full table scans, join the two together, aggregate the mess, and fetch the top row from that.

Use a trigger to maintain the total in the players table, and index that field. You'll then be able to avoid the join altogether:

select p.* from players p order by clicks_count desc limit 1
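
As a minimal sketch of what that could look like, assuming a new clicks_count column is added to players (the column, index, and trigger names below are illustrative, not from the original schema):

-- denormalized counter, kept up to date by the trigger below
alter table `players` add column `clicks_count` int not null default 0;
create index `idx_players_clicks_count` on `players` (`clicks_count`);

-- bump the counter every time a click row is inserted
create trigger `clicks_after_insert`
after insert on `clicks`
for each row
  update `players`
  set    `clicks_count` = `clicks_count` + 1
  where  `id` = NEW.`player_id`;

With an index on clicks_count, the order by clicks_count desc limit 1 only has to read the top of that index instead of aggregating a million clicks rows on every request.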

7 Comments

I like the idea with the trigger. Can you please show me how to declare one for my situation? I am not familiar with triggers.
And what about just adding a clicks_count column to the players table and adding 1 on every click? Would that be better than a trigger?
Yes, that's exactly what Denis wrote: add that column and update it by adding 1 for every click, either with a trigger or with a separate update query (see the sketch after these comments). If you don't need to store additional info about a click, like a date or who clicked, you can even drop the clicks table and just update clicks_count in players.
@Denis, when you write that the query will never run in milliseconds, how much worse could it get?
@yonathan: with millions and billions of rows, it will run into the minutes, hours and days.
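
For completeness, the non-trigger variant discussed in these comments would just be an explicit statement issued by the application on each click, assuming the same clicks_count column as in the sketch above:

-- run by the application whenever a click is recorded;
-- the placeholder is the clicked player's id
update `players`
set    `clicks_count` = `clicks_count` + 1
where  `id` = ?;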

First & foremost, you should worry about your schema if you want decent performance with that number of records and frequent writes; i.e. proper indexes and constraints must be created if not already in place.
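
As an illustration of the kind of indexes and constraints meant here, using the column names from the original schema (the index and constraint names are illustrative):

-- primary key on players.id (usually already in place)
ALTER TABLE players ADD PRIMARY KEY (id);

-- index plus foreign key on the join column, so the join/GROUP BY
-- does not have to scan the whole clicks table
-- (the foreign key is only enforced on InnoDB tables)
ALTER TABLE clicks
  ADD INDEX idx_clicks_player_id (player_id),
  ADD CONSTRAINT fk_clicks_player
      FOREIGN KEY (player_id) REFERENCES players (id);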

Next, in the query itself, select the minimum number of fields needed (so if you do not need ALL of the players fields, avoid using "players.*").

Personal preference: I'd restructure the tables (e.g. playerID in place of id) and write the query like so:

SELECT p.*, COUNT(c.id) as clicks_count
FROM players p
JOIN clicks c USING(playerID)
GROUP BY p.playerID
ORDER BY clicks_count desc 
LIMIT 1

Again, see if you really need ALL player table fields; if not, omit "p.*" and replace with p.foo, p.bar, etc.

3 Comments

Thanks for the tip, but I'd like to know if this situation is normal, whether it can be handled, and how I should handle it.
Well, you are dealing with a large record set and frequent writes -- "just see how it goes (in production)", lol, nice suggestion ;-) Plan ahead, I say: create a PK on playerID in the players table and add an FK on playerID in the clicks table; select only the fields necessary; set your read queries to read-only; and allocate sufficient CPU cycles and memory to handle the load, as determined by load testing prior to putting anything in production.
Fair enough; of course, we're all relatively in the dark as to his situation. @Denis has it nailed performance-wise by querying against a single table.
