The execution plans below seem disappointing and suboptimal, even though the queries are straightforward.
I'm using MySQL 5.7. Here is the fiddle (although it only offers 5.6).
CREATE TABLE `event` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(63) CHARSET ASCII COLLATE ASCII_BIN NOT NULL,
`is_sequenced` TINYINT(3) UNSIGNED NOT NULL,
`sequence_number` BIGINT(20) UNSIGNED DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `Name-SequenceNumber` (`name`,`sequence_number`),
KEY `Name-IsSequenced` (`name`,`is_sequenced`,`id`)
) ENGINE=INNODB
;
INSERT INTO `event`
(id, `name`, is_sequenced, sequence_number)
VALUES
(NULL, 'OrderCreated', 0, NULL),
(NULL, 'OrderCreated', 0, NULL),
(NULL, 'OrderCreated', 0, NULL),
(NULL, 'OrderCreated', 0, NULL),
(NULL, 'OrderCreated', 0, NULL),
(NULL, 'OrderCreated', 0, NULL),
(NULL, 'OrderCreated', 0, NULL),
(NULL, 'OrderCreated', 0, NULL),
(NULL, 'OrderCreated', 0, NULL)
;
We will be using the Name-IsSequenced secondary index. Let's try the following EXPLAIN. (The queries are in the Fiddle. Open "View execution plan" to see their EXPLAIN result.)
EXPLAIN
SELECT * -- This part needs the PK
FROM `event` e
WHERE e.name = 'OrderCreated'
AND e.is_sequenced = 0
AND e.id <= 3
;
So far, so good. Using index condition makes sense: the entire condition can be resolved on the expected index Name-IsSequenced, and then the PK is needed to get the remaining data for SELECT *.
We should be able to improve that to Using index if we only select something that is part of our secondary index, right? (Note that the PK is always part of any secondary index, but we can even ensure this by including id at the end of our secondary index. The result is the same, as it should be.)
EXPLAIN
SELECT id
FROM `event` e
WHERE e.name = 'OrderCreated'
AND e.is_sequenced = 0
AND e.id <= 3
;
Now, the result is Using where; Using index. Wait, that's... worse?! We made it do less work, and the plan indicates that it is working harder.
Using index should be achievable. Find the range where name=OrderCreated, then inside it find the subrange where is_sequenced=0, then inside that find the subrange where id<=3.
Curiously, I have other experiments (with some more data) where I can get Using index by changing id<=3 to id=3 (combined with FORCE INDEX to prevent it from prefering to the PK). I see no reason for the difference. (If we try this with the Fiddle, it stays the same - perhaps because of the small data set.)
Can anybody explain why the execution plan does not indicate the expected efficient use of the secondary index? Is there are a way to straighten it out?
SELECT id ...query run slower than the other one?SQLyogIDE crashed when I had the (more complex) queries on the larger data sets open. I'll need to set up a new test on more data to answer that.Using where; Using indexis better thanUsing index condition.Using index(without theUsing where;before it) is better. That's true, but that's not what we're seeing here. It's what I'd expected to see, though!