Need an explanation about a slow query with MariaDB

Question

I have a "simple" query how takes 0.7678 seconds or more to be executed with MariaDB.

Here is the query:

select  `referenceNumber`
    from  `invoice`
    where  `groupId` = 3550
      and  `referenceNumber` >= 301
    order by  `referenceNumber` desc
    limit  1;

These columns have an index: "referenceNumber", "groupId"

Here is the result of an EXPLAIN:

I found a solution by creating a subquery like that:

select  `referenceNumber`
    from  (
        SELECT  id
            from  `invoice`
            where  `groupId` = 3550
              and  `referenceNumber` >= 301
          ) as subquery
    JOIN  invoice as invoice  ON invoice.id = subquery.id
    order by  `referenceNumber` desc
    limit  1;

This query takes like 0.0011 seconds.

Here is the result of an EXPLAIN:

Do you have an explanation about the poor performance of the first query?

Two surprising findings:

The query without the where `groupId` = 3550 takes only 0.0005 seconds like that:

select  `referenceNumber`
    from  `invoice`
    where  `referenceNumber` >= 301
    order by  `referenceNumber` desc
    limit  1;

The query without the order by `referenceNumber` desc takes only 0.0011 seconds like that:

select  `referenceNumber`
    from  `invoice`
    where  `groupId` = 3550
      and  `referenceNumber` >= 301
    limit  1;

Here is the schema of this table:

CREATE TABLE `invoice` (
  `id` int(10) UNSIGNED NOT NULL,
  `groupId` int(11) NOT NULL,
  `referenceNumber` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
            COLLATE=utf8mb4_unicode_ci;

ALTER TABLE `invoice`
  ADD PRIMARY KEY (`id`),
  ADD KEY `invoice_groupid_index` (`groupId`),
  ADD KEY `invoice_referencenumber_index` (`referenceNumber`);

ALTER TABLE `invoice`
  MODIFY `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT;
COMMIT;

Thank you really much for your help!

It helps a lot if you paste in the results as plain text instead of smudgy little screenshots, especially ones with truncation. — tadman
– tadman, Commented Sep 30, 2022 at 9:15
Have you ever tuned your server? Some people run stock configurations that are absolutely starved for memory. — tadman
– tadman, Commented Sep 30, 2022 at 9:16
You also need an index on groupId, referenceNumber for the type of query you're doing. Having the groupId index only gets you so far, the rest of the rows have to be matched with a scan. — tadman
– tadman, Commented Sep 30, 2022 at 9:17
If there's one thing to learn first about server performance it's that innodb_buffer_pool_size is absolutely critical. If it's set too small, you're sacrificing a ton of performance for no reason. — tadman
– tadman, Commented Sep 30, 2022 at 9:24
@tadman you have completely right!! I created an index with groupId, referenceNumber and now the first query takes only 0.0005 seconds. Amazing, thank you really much!! — Clément B.
– Clément B., Commented Sep 30, 2022 at 9:27

tadman · Accepted Answer · 2022-09-30 09:33:44Z

2

When it comes to indexes, having an index on column A and column B will not help if there's a query involving both of them.

Adding an index on A creates a look-up table to records with various A values, and can provide support for ORDER, BETWEEN and other operations involving ordered values. Importantly it does not account for the order of anything B related.

Likewise, an index on B does much the same thing, ignoring the order of A.

In general, if you want to query WHERE A=? ORDER BY B then you need an index on A,B. This creates an index with data sorted on A, then sub-sorted (for equal values of A) on B. This makes comparisons very quick, they can often happen entirely within the index.

answered Sep 30, 2022 at 9:33

tadman

212k23 gold badges237 silver badges266 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Clément B. Over a year ago

Thank you really much for your detailed explanation. I followed your suggestion and added an index on A,B. The request is really fast now. You solved my problem in 2 minutes :)

Rick James · Accepted Answer · 2022-09-30 16:22:47Z

The subqueries are overkill. They may involve building a temp table, which adds cost. (There are other situations where this is actually optimal, but not here.)
The LIMIT without an ORDER BY gives you a random item; don't do it that way. This is, in my opinion, not valid, so it does not matter how fast it is.
For performance, the rest need you to replace KEY(groupId) with
```
  INDEX(groupId, referenceNumber)  -- in this order
```
Some of your formulations lead the optimizer into a quandary of whether to use KEY(groupId) or KEY(referenceNumber). Each has its pros and cons. But the Optimizer does not have adequate information to choose between them. The composite index above will home in on the single row that is desired.
Note: My suggested index is both "composite" and "covering". See those terms for further reference.

Collectives™ on Stack Overflow

Need an explanation about a slow query with MariaDB

2 Answers 2

1 Comment

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Related