14

My textbook says that indexing enumerated datatypes is not recommended, but it doesn't say why. Should I index an ENUM column? The book also says to index the columns used in WHERE clauses. I always use an ENUM column in the WHERE part of my queries, so by that rule it should be indexed, yet the book also says not to index enumerated datatypes. What should I do?

Edit:

I think I made a mistake when asking. I reread the book and realized I had misunderstood it: it doesn't explicitly say that ENUM columns should not be indexed, only that columns with a very limited range of values (yes/no, 0/1, etc.) should not be. I had assumed that such columns are ENUM types.

7
  • ENUMS can also have NULL values hence they should not be indexed .. dev.mysql.com/doc/refman/5.7/en/enum.html Commented Oct 4, 2017 at 13:50
  • I don't think I have a single column on my system that uses an ENUM type. Just sayin'. Commented Oct 4, 2017 at 13:51
  • @Strawberry Whether to use enums in a database is a personal preference, but they do provide some database-level protection against users who modify a form field and submit values that are not supposed to be in the database. For example, if a gender column is not an enum, somebody could enter fem into the database, and you would then have issues when selecting your data. Commented Oct 4, 2017 at 13:54
  • 1
    @DhavalChheda wrong on both accounts. 1. The fact that a field may contain nulls does not mean it should not be indexed. 2. You can use lookup tables with referential integrity to force a user to choose from a range of values. Although, most of these restrictions are enforced in the GUI of an application. Commented Oct 4, 2017 at 14:00
  • @user15 I'm not aware of any reasons against indexing an enum field. Aren't you mixing up enum's internal index numbering with the database level indexes? Commented Oct 4, 2017 at 14:02

4 Answers

20

I just want to share my personal experience with an index on enums. I had a really slow query and found this while googling, which kind of discouraged me. But eventually I tried adding an index to my enum column anyhow.

My query was this:

SELECT * FROM my_table
WHERE my_enum IN ('a', 'b')
ORDER BY id DESC
LIMIT 0, 100;

The id column is the primary key. I have 25,000 rows in my_table, and my_enum has 4 possible values.

Without an index on my_enum, the query took around 50 seconds to complete. With an index it takes 0.015 seconds.

This was on a 12 core Xeon Gold MySQL 8.0 server.


1 Comment

I don't suppose the ordering would influence the outcome..? No type of selecting or indexing or lack thereof should cause a query on just 25k rows to take 50 seconds, that's hard to believe.
18

The enum data type is simply stored as a number (the position of the list item value within the list):

The strings you specify as input values are automatically encoded as numbers.

Thus, an enum field can be indexed just like any other numeric field.
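To make the encoding concrete, here is a minimal Python sketch of the mapping described in the quote. It is only a model, not MySQL's actual implementation: the member list and the `encode`/`decode` helper names are made up for illustration, and the 1-based numbering follows the MySQL documentation's description of ENUM index values.

```python
# Hypothetical member list, standing in for
# ENUM('x-small', 'small', 'medium', 'large', 'x-large').
ENUM_MEMBERS = ("x-small", "small", "medium", "large", "x-large")


def encode(value):
    """Return the 1-based position of value in the member list (None stays None)."""
    if value is None:
        return None
    return ENUM_MEMBERS.index(value) + 1


def decode(index):
    """Return the string for a 1-based position (None stays None)."""
    if index is None:
        return None
    return ENUM_MEMBERS[index - 1]


# What the column would hold internally under this model.
stored = [encode(v) for v in ["medium", "small", None, "large"]]
print(stored)                       # [3, 2, None, 4]
print([decode(i) for i in stored])  # ['medium', 'small', None, 'large']
```

Under this model the index only ever sees small integers, which is why nothing about the ENUM type itself prevents indexing.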

2 Comments

Yes, I misunderstood the book. But why does it say not to index columns that have a very limited range of values, such as yes/no or 0/1?
Yep, that's a different story. Indexes on fields with such limited variety of values offer limited selectivity, therefore the optimisers tend to ignore them. You can still use them as part of a multi-column index.
4

The reason not to index a column with a small number of possible values lies in the nature of the index itself. The most common index data structure is a balanced tree whose leaf nodes are linked together in a list, which gives fast lookups only when the values vary widely. Otherwise, the many duplicate values sit side by side in that list, and walking through them is not very different from scanning the whole table; it can even be slower if each matching row has to be fetched from the table one by one.
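The selectivity argument can be illustrated with a bit of arithmetic. The following Python sketch uses made-up data (a hypothetical 1,000-row `flag` column and a hypothetical skewed `status` column) and hypothetical helper names; it only shows the fraction of rows an equality lookup would have to visit, not how any particular optimiser decides.

```python
from collections import Counter


def cardinality_ratio(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    return len(set(values)) / len(values)


def match_fraction(values, wanted):
    """Fraction of rows an equality lookup on `wanted` would have to visit."""
    return Counter(values)[wanted] / len(values)


# Hypothetical data: 1,000 rows with an evenly split yes/no flag.
flag = ["yes", "no"] * 500
# Hypothetical skewed column: 3 possible values, but 'failed' is rare.
status = ["done"] * 900 + ["pending"] * 90 + ["failed"] * 10

print(cardinality_ratio(flag))           # 0.002
print(match_fraction(flag, "yes"))       # 0.5
print(match_fraction(status, "failed"))  # 0.01
```

A lookup for 'yes' still touches half of the rows, so an index saves little there. A lookup for the rare 'failed' value touches 1% of them, which suggests that a column with few distinct values can still benefit from an index when the data is skewed.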

Comments

-1

I was unsure myself, so I did a little experiment, following the MySQL docs.

I created a dummy table, shirts, and ran queries with and without an index on the enum column size.

[screenshot]

The table has roughly 2 million records.

[screenshot]

Without Index

[screenshot]

With Index

[screenshot]

Conclusion

Query times didn't change significantly for me.

5 Comments

you mean, the EXPLAIN queries' times didn't change for you?
No @alexbusu, I am using EXPLAIN statement to check the actual run time of the query.
The execution time is for "EXPLAIN" queries, not for actual queries. And what you have in the result of an EXPLAIN query are some data you should use to improve the DB performance.
In the "no-index" screenshot the key is NULL, so more than 2.3 million rows are scanned, and after the WHERE condition is applied the end result contains 20% of them. In the "with-index" screenshot the key used is shirts_size_idx, and the number of rows depends on the filter input: over 580K for medium and over 37K for small. The filtered value is 100% because no rows were filtered out by additional conditions; only the index condition was used, which is much faster than filtering with WHERE. In general you want filtered to be as close to 100 as possible.
Thanks @alexbusu, I understand now. Because of the index, we are scanning fewer rows than without it.
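The scan-versus-index difference discussed in these comments can be reproduced in a small, self-contained way. The sketch below is an assumption-heavy stand-in, not the original experiment: it uses Python's built-in sqlite3 module instead of MySQL, emulates the ENUM column with a TEXT column plus a CHECK constraint (SQLite has no ENUM type), fills the table with 10,000 made-up rows, and reads the plan with SQLite's `EXPLAIN QUERY PLAN` rather than MySQL's `EXPLAIN`. The table and index names are borrowed from the thread; the `plan` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# TEXT + CHECK is only a stand-in for a MySQL ENUM column.
conn.execute(
    "CREATE TABLE shirts ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " size TEXT CHECK (size IN ('x-small', 'small', 'medium', 'large', 'x-large'))"
    ")"
)

sizes = ["x-small", "small", "medium", "large", "x-large"]
conn.executemany(
    "INSERT INTO shirts (name, size) VALUES (?, ?)",
    [(f"shirt {i}", sizes[i % 5]) for i in range(10_000)],
)


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT * FROM shirts WHERE size = 'medium'"

# Plan before the index exists: a full table scan is expected.
plan_without_index = plan(query)

conn.execute("CREATE INDEX shirts_size_idx ON shirts (size)")

# Plan after the index exists: a search using shirts_size_idx is expected.
plan_with_index = plan(query)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, so the output is not reproduced here. As the comments above point out, the plan shows how many rows the engine intends to visit, not how long the real query takes; timing the actual SELECT is a separate measurement.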
