1

I have a question about optimizing SQL queries with multiple-column indexes.

Imagine I have a table "TEST" with fields "A, B, C, D, E, F".

In my code (PHP), I run queries with the following "WHERE" clauses:

  • SELECT (..) FROM TEST WHERE A = 'x' AND B = 'y'
  • SELECT (..) FROM TEST WHERE A = 'x' AND B = 'y' AND F = 'z'
  • SELECT (..) FROM TEST WHERE A = 'x' AND B = 'y' AND (D = 'w' OR F = 'z')

What is the best approach to get the fastest execution for these queries?

Three composite indexes such as (A, B), (A, B, F) and (A, B, D, F)? Or a single composite index (A, B, D, F)?

I would tend to say that the three indexes would be best, even though they take more space in the database. In my case I am optimizing for execution time, not space; the database is of a reasonable size.

5 Answers

3

Multiple-column indexes:

MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on. If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table.

In other words, it is a waste of space and computing power to define an index whose columns are just the first N columns of another index, in the same order.
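
As a minimal sketch of what this means for the table in the question (the index names are made up for illustration):

```sql
-- One composite index. Its leftmost prefixes (A), (A, B) and (A, B, D)
-- can be used by the optimizer without being declared separately.
CREATE INDEX idx_test_a_b_d_f ON TEST (A, B, D, F);

-- Redundant next to the index above, because (A, B) is a leftmost prefix of it:
-- CREATE INDEX idx_test_a_b ON TEST (A, B);
```

Note that (A, B, F) is not a leftmost prefix of (A, B, D, F), so this rule alone does not make it redundant.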


1

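The best way to evaluate an index is to try it. Use EXPLAIN in MySQL: it shows the query plan, including which indexes are candidates (possible_keys), which one the optimizer chose (key), and an estimate of how many rows it expects to examine (rows). It does not report an execution time, so time the query itself if you need that. Here is an example: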

explain select * from TEST WHERE a = 'x' and B = 'y'
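
To compare the candidate indexes from the question, the same check could be run on all three queries. This is only a sketch: SELECT * stands in for the elided column list.

```sql
-- Run each statement once per index layout you want to compare.
EXPLAIN SELECT * FROM TEST WHERE A = 'x' AND B = 'y';
EXPLAIN SELECT * FROM TEST WHERE A = 'x' AND B = 'y' AND F = 'z';
EXPLAIN SELECT * FROM TEST WHERE A = 'x' AND B = 'y' AND (D = 'w' OR F = 'z');
```

Rerunning these after adding or dropping an index shows whether the chosen key and the row estimate actually change.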


0

It is hard to give definitive answers without experiments.

BUT: ordinarily an index like (A,B,D) is considered to be superfluous if you have an index on (A,B,D,F). So, in my opinion you only need the one multicolumn index.

There is one other consideration. If your table has a lot of columns and a lot of rows and your SELECT list has a small subset of those columns, you might consider including those columns in your index. For example, if your query says SELECT D,F,G,H FROM ... you should try creating an index on

(A,B,D,F,G,H)

as it will allow the query to be satisfied from the index without having to refer back to the rows of the table. This can sometimes help performance a great deal.
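
A sketch of such a covering index follows. The index name is hypothetical, G and H are the extra columns from the example SELECT list (they are not in the question's table), and the WHERE clause is assumed to be the first one from the question.

```sql
-- Filter columns first, then the columns that are only selected.
CREATE INDEX idx_test_covering ON TEST (A, B, D, F, G, H);

-- When the query is answered from the index alone, the Extra column
-- of the EXPLAIN output contains "Using index".
EXPLAIN SELECT D, F, G, H FROM TEST WHERE A = 'x' AND B = 'y';
```

The trade-off is a wider index, which costs more space and more work on every write.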


0

It's hard to explain well, but generally you should use as few indexes as you can get away with, using as many columns of the common queries as you can, with the most commonly queried columns first.

In your example WHERE clauses, A and B are always included, so they should be part of the index. Put whichever of the two is searched on more often first. MySQL can use part of an index as long as the WHERE clause constrains its columns from the left without skipping one. So if you have an index (A, B, C), then WHERE (A = .. AND B = .. AND Z = ..) can still use that index to narrow down the search. With WHERE (B = .. AND Z = ..), A isn't part of the search condition, so that index can't be used.
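
The two cases above can be checked with EXPLAIN. This sketch assumes an index on (A, B, C) already exists and that Z is some other, non-indexed column, as in the paragraph above; neither is part of the question's table.

```sql
-- A and B form a leftmost prefix of (A, B, C), so the index can narrow
-- the search; the Z condition is applied to the rows that remain.
EXPLAIN SELECT * FROM TEST WHERE A = 'x' AND B = 'y' AND Z = 'q';

-- The leading column A is not constrained, so the index on (A, B, C)
-- is not expected to be used for a lookup here.
EXPLAIN SELECT * FROM TEST WHERE B = 'y' AND Z = 'q';
```

In the output, key_len gives the number of index bytes used, which hints at how many leading columns took part in the lookup.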

You want a single multiple-column index, either (A, B, D, F) or (A, B, F, D), since only one of them can be used for a given query. Which one depends mostly on how often D and F are queried and on the distribution of the data. For example, if most of the values in D are 0 and only one in a hundred is 1, that column has a poor key distribution (low selectivity), so putting it early in the index wouldn't be all that useful.
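
One rough way to look at the key distribution before choosing the column order is to count distinct values and inspect the skew. This is a sketch against the question's table, not a definitive method:

```sql
-- Distinct values per column compared with the total row count.
-- A column with very few distinct values is a weak candidate for an early position.
SELECT COUNT(*)          AS total_rows,
       COUNT(DISTINCT D) AS distinct_d,
       COUNT(DISTINCT F) AS distinct_f
FROM TEST;

-- Value distribution of D, to spot skew such as "almost every row is 0".
SELECT D, COUNT(*) AS n
FROM TEST
GROUP BY D
ORDER BY n DESC;
```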


0

The optimiser can use a composite index for where conditions that follow the order of the index with no gaps:

An index on (A,B,F) will cover the first two queries.

The last query is a bit trickier because of the OR. I think only the A and B conditions will be covered by (A,B,F), but a separate index on (D) or on (F) may speed up the query, depending on the cardinality of those columns.

I think an index on (A,B,D,F) can only be used for the A and B conditions in all three queries. It cannot be used for the F condition in query two, because D sits between B and F in the index and is unconstrained there, and it cannot be used for the D and F conditions in query three because of the OR.
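
A common workaround for the OR, which is not what this answer proposes but a different technique, is to split the query into two branches that can each use their own composite index. The sketch below assumes hypothetical indexes on (A, B, D) and (A, B, F), and uses SELECT * in place of the elided column list.

```sql
-- Branch 1 can use an index on (A, B, D).
SELECT * FROM TEST
WHERE A = 'x' AND B = 'y' AND D = 'w'
UNION ALL
-- Branch 2 can use an index on (A, B, F). The NOT (D <=> 'w') part excludes
-- rows already returned by branch 1; <=> is MySQL's NULL-safe equality,
-- so rows where D is NULL are kept.
SELECT * FROM TEST
WHERE A = 'x' AND B = 'y' AND F = 'z' AND NOT (D <=> 'w');
```

Whether this beats the single OR query depends on the data, so it is worth comparing both with EXPLAIN and with real timings.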

You may have to add hints to the query to get the optimiser to use the best index and you can see which indexes are being used by running an EXPLAIN ... on the query.
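
For reference, an index hint in MySQL goes right after the table name. In this sketch idx_test_a_b_f is a hypothetical name for an index on (A, B, F):

```sql
-- USE INDEX limits the optimizer to the listed indexes (a table scan is still allowed);
-- FORCE INDEX additionally treats a table scan as a last resort.
EXPLAIN SELECT * FROM TEST FORCE INDEX (idx_test_a_b_f)
WHERE A = 'x' AND B = 'y' AND F = 'z';
```

Hints pin the plan to today's data distribution, so they are usually best kept for cases where EXPLAIN shows the optimizer making a clearly worse choice.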

Also, adding indexes slows down DML statements and can cause locking issues, so it's best to avoid over-indexing where possible.
