I have a table screenshot with 3 fields:

CREATE TABLE `screenshot` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `UserID` int(11) NOT NULL,
  `DateTaken` date NOT NULL,
  PRIMARY KEY (`ID`),
  KEY `DateTaken` (`DateTaken`),
  KEY `UserID` (`UserID`) USING BTREE,
  CONSTRAINT `userID_foreign_key` FOREIGN KEY (`UserID`) REFERENCES `users` (`UserID`)
) ENGINE=InnoDB AUTO_INCREMENT=22514871 DEFAULT CHARSET=latin1

And

SELECT @@innodb_buffer_pool_size

Result: 16777216

Query:

SELECT COUNT(ID) total
        FROM screenshot WHERE DateTaken BETWEEN '2000-05-01' AND '2000-06-10'

Result : 2828844

Explain output:

ID | select_type | table      | type  | possible_keys | key       | key_len | rows    | Extra
1  | SIMPLE      | screenshot | range | DateTaken     | DateTaken | 3       | 5730138 | Using where; Using index

Here is my problem: I have added an index on the DateTaken column, yet the number of rows scanned (per the EXPLAIN output) is larger than the result. It looks like it does a full table scan, and the query takes 15 seconds to run. How can I make the query above faster?

3 Answers

1

There is no problem. Your index is fine. To explain...

The 5730138 in EXPLAIN is an estimate. It can be larger or smaller than the actual value, sometimes by a large amount. Do not be bothered by it.
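To see how far off the estimate is, you can put the EXPLAIN figure next to the real count. This is only a sketch built on the query from the question. The ANALYZE TABLE step is my addition, not something the OP ran; it refreshes the index statistics and may or may not move the estimate.

```sql
-- Estimated rows: read the `rows` column of the output.
EXPLAIN
SELECT COUNT(*) AS total
FROM screenshot
WHERE DateTaken BETWEEN '2000-05-01' AND '2000-06-10';

-- Actual rows matched by the same predicate.
SELECT COUNT(*) AS total
FROM screenshot
WHERE DateTaken BETWEEN '2000-05-01' AND '2000-06-10';

-- Optional (assumption, not from the question): refresh the index
-- statistics, then rerun the EXPLAIN above and compare.
ANALYZE TABLE screenshot;
```

Either way, the estimate only guides the optimizer's choice of plan; it does not change how many index entries the query really reads.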

You have 2.8M screenshots in that date range, correct? Well, it could take 15 seconds to scan the index to count that many rows.

If you would like further analysis, please provide:
RAM size
innodb_buffer_pool_size
SHOW CREATE TABLE screenshot; (this will show the Engine)
How big the table is (GB)
What type of disk you have (spinning versus SSD)

With those, we can discuss further the impact of caching and I/O and engine. And it may help explain the "15 seconds" versus "20".
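Most of the items in that list can be collected from the MySQL client. The statements below are a rough sketch that assumes the current default database is the one holding screenshot. RAM size and disk type have to come from the operating system.

```sql
-- Table definition, including the storage engine.
SHOW CREATE TABLE screenshot;

-- Buffer pool size in bytes.
SELECT @@innodb_buffer_pool_size;

-- Approximate data and index size in GB (for InnoDB these are estimates).
SELECT table_name,
       ROUND(data_length  / 1024 / 1024 / 1024, 2) AS data_gb,
       ROUND(index_length / 1024 / 1024 / 1024, 2) AS index_gb
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name = 'screenshot';
```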

(And, yes, use COUNT(*), not COUNT(x) unless you need to test x for NULL.)
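A small illustration of the difference, using a hypothetical scratch table (count_demo and its column x are made up for this example):

```sql
-- Hypothetical scratch table, used only for this illustration.
CREATE TEMPORARY TABLE count_demo (x INT NULL);
INSERT INTO count_demo (x) VALUES (1), (NULL), (3);

SELECT COUNT(*) AS all_rows,    -- counts every row: 3
       COUNT(x) AS non_null_x   -- skips the row where x IS NULL: 2
FROM count_demo;

DROP TEMPORARY TABLE count_demo;
```

In the screenshot table, ID is declared NOT NULL, so COUNT(ID) and COUNT(*) return the same number there; the only question is the extra per-row check.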

If you are using InnoDB, then INDEX(DateTaken, id) is effectively identical to INDEX(DateTaken), because every InnoDB secondary index implicitly includes the primary key. So I suggest you were hasty in accepting that answer.

Buffer pool

innodb_buffer_pool_size should be set to about 70% of RAM. What you have is so tiny (the old 16M default) that not even the suggested index can fit in cache. Hence the query will always hit the disk, at least some of the time. Increasing the buffer pool should significantly improve the speed, perhaps down to 2 seconds.
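As a rough sketch of the change, assume a dedicated server with 8 GB of RAM (a made-up figure, since the OP has not said how much RAM there is). Resizing at runtime works on MySQL 5.7.5 and later; on older versions the setting goes in the server's configuration file and needs a restart.

```sql
-- Current size in MB (the OP's value, 16777216 bytes, is 16 MB).
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;

-- Assumption: dedicated server with 8 GB of RAM; about 70% is roughly 5.5 GB.
-- Works at runtime on MySQL 5.7.5+; otherwise set it in the config file.
SET GLOBAL innodb_buffer_pool_size = 5632 * 1024 * 1024;

-- Rough check on cache misses: Innodb_buffer_pool_reads counts reads that
-- had to go to disk, Innodb_buffer_pool_read_requests counts logical reads.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

If Innodb_buffer_pool_reads keeps climbing while the query runs repeatedly, the working set still does not fit in the buffer pool.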

2 Comments

What is the difference between COUNT(*) and COUNT(x)? I thought COUNT(x) was faster, and I would like to know why. Your help is greatly appreciated. :)
COUNT(x) has the extra overhead of checking x IS NOT NULL before incrementing the counter by 1. So, in almost all applications, COUNT(*) is better. Your timings to the contrary could be a fluke.
0

You could try adding a composite index

  create index test on screenshot (DateTaken, id)

2 Comments

Thanks for the help. The speed has improved, but the number of rows scanned is still the same. Do you know the reason why? :)
InnoDB indexes always include the primary key. Notice that the OP's EXPLAIN output shows "Using index" even with the single-column index on just DateTaken, which means that index already covered the query. If it was marginally faster after following your suggestion, I suggest that was due to the index being fully loaded in the buffer pool.
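One way to check this yourself is to force each index in turn and compare the EXPLAIN output. This sketch assumes the composite index from the suggestion above was created under the name test; both plans would be expected to show "Using index".

```sql
-- Single-column index from the original table definition.
EXPLAIN
SELECT COUNT(*) AS total
FROM screenshot FORCE INDEX (DateTaken)
WHERE DateTaken BETWEEN '2000-05-01' AND '2000-06-10';

-- Composite index from the suggestion above (assumes it is named `test`).
EXPLAIN
SELECT COUNT(*) AS total
FROM screenshot FORCE INDEX (test)
WHERE DateTaken BETWEEN '2000-05-01' AND '2000-06-10';
```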
0

Try running this query:

SELECT COUNT(*) as total
FROM screenshot
WHERE DateTaken BETWEEN '2000-05-01' AND '2000-06-10';

The reference to ID in the SELECT could be affecting the use of the index.

1 Comment

I got a runtime of 20 seconds.
