How does MySQL handle SELECT queries on partitioned tables?

Question

When querying a non-partitioned table, the query optimizer can use an index to satisfy the sort and, with a LIMIT clause, stop reading early. For example, take a non-partitioned table my_table with a single-column index idx_a on column a, and the following query:

SELECT *  
FROM my_table  
ORDER BY a DESC  
LIMIT 100;

This query can scan the idx_a index from the end and stop after reading 100 rows, regardless of the total number of rows in my_table.

Now, consider a partitioned table my_partition_table partitioned by the primary key. Suppose the same query is run:

SELECT *  
FROM my_partition_table  
ORDER BY a DESC  
LIMIT 100;

In this case, the query does not use filesort, as confirmed by the EXPLAIN plan. However, since the table is partitioned by the primary key, the values of column a are spread across all partitions.

How does MySQL handle sorting in this situation? Specifically, how does it retrieve and merge the data from all partitions to produce the sorted result for column a efficiently?

I don't know about MySQL, but if you have N partitions and each of them has its own index on a, then a merge join algorithm would be able to produce a globally sorted result. What does the EXPLAIN say?
– Martin Smith, Commented Nov 30, 2024 at 8:01
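
The merge described in this comment can be sketched in Python. This is only an illustration of the idea, not a statement about what MySQL does internally; the partition contents and the helper name `top_n_desc` are hypothetical, and each sorted list stands in for one partition's index on a.

```python
import heapq
from itertools import islice


def top_n_desc(partition_indexes, n):
    """Return the n largest (a, pk) entries across all partitions.

    Each element of partition_indexes is a list of (a, pk) tuples sorted
    ascending by a, standing in for a per-partition index on column a.
    """
    # Read every per-partition index from its high end (descending order).
    streams = [reversed(index) for index in partition_indexes]
    # heapq.merge lazily merges inputs that are already sorted;
    # reverse=True says the inputs run from largest to smallest.
    merged = heapq.merge(*streams, reverse=True)
    # Stop after n entries instead of sorting everything.
    return list(islice(merged, n))


# Hypothetical data: three partitions, each with its own sorted index on a.
partitions = [
    [(1, 10), (7, 11), (9, 12)],
    [(2, 20), (8, 21)],
    [(3, 30), (5, 31), (10, 32)],
]
print(top_n_desc(partitions, 4))
# [(10, 32), (9, 12), (8, 21), (7, 11)]
```

Because the merge is lazy, the work grows with the LIMIT and the number of partitions rather than with the total number of rows.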

Rick James · Accepted Answer · 2024-11-30 22:43:49Z

You have found one of many ways that PARTITIONing (in MySQL) is no better, and possibly worse, than non-partitioning.

- Partitioning by the PRIMARY KEY is never(?) better than the equivalent non-partitioned table.
- Partitioning by the PK, then looking up by the PK (either point query or range query) will do "partition pruning". But such pruning is no faster than using the BTree.
- Partitioning by the PK, then looking up by a secondary index -- all partitions need to be looked at. Then, depending on other things, it will probably still have to gather and sort the found rows. (I doubt it is smart enough to merge; I have not heard that it will do the partitions in parallel.)
- If the WHERE clause needs a 2-dimensional index, partitioning can sort of provide one -- pruning for one of the dimensions, then one or more index lookups for the other dimension.
- In general, the only way to hope to get any benefit is by having the "partition key" be something other than the PK or any other index.
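
The difference between a pruned lookup and a lookup that must visit every partition can be sketched in Python. This is a toy model, not MySQL code: `NUM_PARTITIONS`, `partition_for`, `build`, `lookup_by_pk`, and `lookup_by_a` are hypothetical names, and a modulo on the key is used as a stand-in for hash partitioning on the PK.

```python
NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(pk):
    # Stand-in for hash partitioning on the PK: the key alone picks the partition.
    return pk % NUM_PARTITIONS


def build(rows):
    """Distribute (pk, a) pairs into per-partition dicts of pk -> a."""
    parts = [dict() for _ in range(NUM_PARTITIONS)]
    for pk, a in rows:
        parts[partition_for(pk)][pk] = a
    return parts


def lookup_by_pk(parts, pk):
    # Pruning: the PK value tells us the only partition that can hold the row.
    touched = [partition_for(pk)]
    return parts[touched[0]].get(pk), touched


def lookup_by_a(parts, a):
    # No pruning: a says nothing about the partition, so every one is visited.
    touched = list(range(len(parts)))
    hits = sorted(pk for part in parts for pk, value in part.items() if value == a)
    return hits, touched


parts = build([(1, 5), (2, 7), (5, 5), (8, 9)])
print(lookup_by_pk(parts, 5))  # value of a for pk 5, plus the one partition touched
print(lookup_by_a(parts, 5))   # matching pks, plus all partitions touched
```
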
I see no case in which partitioning will speed up your query (ORDER BY a LIMIT 100 with INDEX(a)). An equivalent non-partitioned table will:
```
find the first `a` in the index
reach into the data for the corresponding row
move on to the next `a` and get its row
stop at 100.
```
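
The steps above can be sketched in Python, with a sorted list standing in for the BTree index on a and a dict standing in for the table data. The names `first_n_by_index`, `index_a`, and `rows` are hypothetical, and the sketch ignores everything a real storage engine does beyond the walk-and-stop pattern.

```python
def first_n_by_index(index_a, rows, n, descending=False):
    """Walk a sorted index and fetch rows until n have been collected.

    index_a: list of (a, pk) tuples sorted ascending by a (stand-in for INDEX(a)).
    rows: dict mapping pk -> full row (stand-in for the table data).
    """
    if n <= 0:
        return []
    # Start at whichever end of the index matches the requested order.
    entries = reversed(index_a) if descending else iter(index_a)
    result = []
    for _a, pk in entries:       # move on to the next `a` in index order
        result.append(rows[pk])  # reach into the data for the corresponding row
        if len(result) == n:     # stop once n rows have been read
            break
    return result


rows = {pk: {"id": pk, "a": a} for pk, a in [(1, 50), (2, 10), (3, 40), (4, 20), (5, 30)]}
index_a = sorted((row["a"], pk) for pk, row in rows.items())
print(first_n_by_index(index_a, rows, 2, descending=True))
# [{'id': 1, 'a': 50}, {'id': 3, 'a': 40}]
```

The loop touches only n index entries, no matter how many rows the table holds.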

More discussion: Partition

edited Nov 30, 2024 at 22:43

answered Nov 30, 2024 at 22:38

Bill Karwin Dec 1, 2024 at 16:17

If the table has a primary key, you can't make the partition key be independent of that primary key. The partition key column(s) must be included in all primary or unique keys of the table.
