Index usage explanation

Question

Consider this test setup:

CREATE TABLE dept (deptid integer PRIMARY KEY, deptname TEXT);
CREATE INDEX dept_name_idx on dept(deptname);

The dept table contains 1000 rows and the deptname column contains 10 unique values that are evenly distributed.

Which of the following two sample queries would use the index dept_name_idx?

1) SELECT deptid from dept where deptname ='SAPA';

2) SELECT deptid from dept where deptname <>'SAPA';

So do you have your answer?

– Erwin Brandstetter, Commented Mar 14, 2018 at 0:09

Erwin Brandstetter · Accepted Answer · 2018-03-08 17:27:54Z

With only 10 distinct values, evenly distributed, chances are that neither query will use the index: query 1 retrieves about 100 of the 1000 rows (10 %) and query 2 about 900 (90 %). A sequential scan of the table is typically faster than involving any indexes when retrieving more than roughly 5 % of all rows. Exact numbers depend on many details.

Also, 1000 small rows like in your example fit on a handful of data pages. A sequential scan is hard to beat with such a small table.
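To check this on your own installation, something like the following sketch could reproduce the setup from the question and show the chosen plans. It assumes PostgreSQL and a scratch database; the generated department names other than 'SAPA' are made up for illustration, and the plans you see may differ with version and settings.

```sql
-- Sketch only: assumes PostgreSQL and an empty scratch database.
CREATE TABLE dept (deptid integer PRIMARY KEY, deptname text);
CREATE INDEX dept_name_idx ON dept (deptname);

-- 1000 rows, 10 distinct names, 100 rows each.
-- 'SAPA' is from the question; the 'DEPTn' names are hypothetical.
INSERT INTO dept (deptid, deptname)
SELECT g,
       CASE WHEN g % 10 = 0 THEN 'SAPA' ELSE 'DEPT' || (g % 10) END
FROM   generate_series(1, 1000) AS g;

-- Refresh planner statistics before looking at plans.
ANALYZE dept;

-- Show the plan the planner picks for each query.
EXPLAIN SELECT deptid FROM dept WHERE deptname = 'SAPA';
EXPLAIN SELECT deptid FROM dept WHERE deptname <> 'SAPA';
```

EXPLAIN only shows the plan; EXPLAIN ANALYZE would also execute the query and report actual timings.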

With a much bigger table and/or substantially more distinct values in deptname, query 1 would be a candidate for using the index, but not query 2 (which retrieves most rows and will always use a sequential scan).
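A rough way to see the difference is to repeat the experiment with a larger table and more distinct values. The table name dept_big, its index name, the row counts, and the 'DEPTn' values are all hypothetical, chosen only to make the predicate on deptname highly selective; this again assumes PostgreSQL.

```sql
-- Sketch only: 1,000,000 rows, 10,000 distinct names (100 rows each).
CREATE TABLE dept_big (deptid integer PRIMARY KEY, deptname text);

INSERT INTO dept_big (deptid, deptname)
SELECT g, 'DEPT' || (g % 10000)
FROM   generate_series(1, 1000000) AS g;

CREATE INDEX dept_big_name_idx ON dept_big (deptname);
ANALYZE dept_big;

-- Matches about 0.01 % of the rows: a candidate for the index.
EXPLAIN SELECT deptid FROM dept_big WHERE deptname = 'DEPT42';

-- Matches about 99.99 % of the rows: a sequential scan is the likely plan.
EXPLAIN SELECT deptid FROM dept_big WHERE deptname <> 'DEPT42';
```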

To optimize read performance for query 1 you could then use a multicolumn index on (deptname, deptid) - if preconditions for index-only scans are met.
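As a minimal sketch of that multicolumn index, assuming PostgreSQL and the dept table from the question (the index name dept_name_id_idx is made up):

```sql
-- Both columns used by query 1 are in the index, so the table
-- itself may not need to be visited.
CREATE INDEX dept_name_id_idx ON dept (deptname, deptid);

-- Index-only scans depend on the visibility map, which VACUUM maintains.
VACUUM ANALYZE dept;

-- Look for "Index Only Scan" in the output; the planner may still
-- choose another plan, depending on table size and selectivity.
EXPLAIN SELECT deptid FROM dept WHERE deptname = 'SAPA';
```

With this index in place, the single-column index on deptname is largely redundant for query 1, since deptname is the leading column of the new index.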

edited Mar 8, 2018 at 17:27

answered Mar 8, 2018 at 17:12
