Database non-clustered index scan instead of seek

Question

Given

SELECT 
    a.Id, a.Col1, b.Col1
FROM 
    Table1 a 
INNER JOIN 
    Table2 b ON a.Id = b.A_Id
WHERE
    b.Col1 = 'Val1'
    AND a.Col3 = 'Val3'
    AND a.Col4 = 'Val4'

and non-clustered indices on Table1 and Table2

CREATE NONCLUSTERED INDEX IX_Table1 ON Table1(Id ASC, Col3 ASC, Col4 ASC)

CREATE NONCLUSTERED INDEX IX_Table2 ON Table2(Col1 ASC, A_Id ASC)

the above query does an index scan on IX_Table1.

a.Id is the PK with clustered index and A_Id is a FK to a.Id.

Advisor proposes to have an index on Table1 (Col1, Col2).

I guess that if the optimizer were able to do an index seek (not a scan) on IX_Table1, the advisor would have nothing to suggest.

Could somebody help figure out the details?

I see that in the plan we first filter by the predicate on Table1, with a Key Lookup for Id? And then Table2 is searched for Col1.

Can't it first use IX_Table1 for the JOIN and then filter by the remaining Col3 and Col4 in the index?

The CLUSTERED INDEX column(s) are implicitly included in every index, so putting ID in the index IX_Table1 isn't actually required here. In fact, the index you have won't help a seek to the right rows, as it's sorted by ID first, not Col1 or Col3. It's like asking for all the people whose first name is "Jane" in the phone book, when the names are listed in surname order, and then by first name.
– Thom A ♦, Commented Apr 6, 2021 at 13:56
Do I understand correctly that the Key Lookup (second node from the top) is done on its own? That is, the Index Scan locates all nodes having the specific values from the WHERE clause, and the included Id is fed into the nested loop? But the Key Lookup just gives all values of the PK, whether they match the first scan or not?
– Nickolodeon, Commented Apr 6, 2021 at 14:07
Can you paste your actual execution plan to brentozar.com/pastetheplan?
– marc_s, Commented Apr 6, 2021 at 14:09

Thom A · Accepted Answer · 2021-04-06 14:08:54Z

I'm going to use the same analogy I used in my comment here, as the phonebook idea works really well in my opinion.

Let's say, for this example, that the ID is the Surname (I realise there are dupes of this in a phonebook), that Col1 is the Firstname and Col3 is a Middle Initial. We're going to ignore the other table for the moment. The phonebook we have is ordered in that order too: first by the surname, then the first name and lastly the middle initial.

In your query you are asking for rows where the Firstname and Middle Initial have a specific value. As a result, your index isn't going to help you here; you won't be able to quickly find the people you want anyway. All your data is in the order of the surname, meaning that you would have to go (individually) to each surname, then seek to the firstname, and filter to that data. As a result, for the database engine, it's likely just as fast to check every single value.
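To make the phonebook point concrete, here is a minimal sketch against the question's IX_Table1 (Id, Col3, Col4). The literal 42 is a made-up value and assumes Id is an integer column; the comments describe what the optimizer is able to do with each predicate, not a guaranteed plan.

```sql
-- IX_Table1 is keyed on (Id, Col3, Col4), so its rows are ordered by Id first.

-- Predicate on the leading key column: the engine can navigate the B-tree
-- straight to Id = 42, so a seek is possible.
-- (42 is a hypothetical value; assumes Id is an integer column.)
SELECT Id, Col3, Col4
FROM Table1
WHERE Id = 42
  AND Col3 = 'Val3';

-- No predicate on the leading key column: matching rows can sit anywhere
-- in the index, so every entry has to be checked (a scan).
SELECT Id, Col3, Col4
FROM Table1
WHERE Col3 = 'Val3'
  AND Col4 = 'Val4';
```

Comparing the actual execution plans of these two statements on your own data should show the difference between the seek and the scan operators.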

Now, let's add the other table into this analogy. This is a list of surnames as well, which also has the person's gender identity. You are asked for just female people as well. This table, however, has an index which lists all the people by gender first, and then surname. Great! You can quickly grab all the female people's surnames from there and filter your phonebook's data down with that, as it is also in surname order.
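In terms of the question's tables, this step roughly corresponds to the query below, which IX_Table2 (Col1, A_Id) can answer with a seek because the filter is on its leading key column. It is only a sketch of that one step in isolation, not the plan the optimizer is guaranteed to pick for the full join.

```sql
-- IX_Table2 is keyed on (Col1, A_Id): seek to Col1 = 'Val1', then read the
-- matching A_Id values, which are stored in A_Id order within that Col1 value.
SELECT b.A_Id
FROM Table2 b
WHERE b.Col1 = 'Val1';
```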

But now you still have the same problem: all that data is still in surname order, so you can't find people whose firstname and middle initial meet your requirements. So, once again, you have to scan the whole index.

As a result, yes, the RDBMS is right: here an INDEX on Col1 and Col3, without ID (it's the CLUSTERED INDEX key, so it's already included), is exactly what you need to be able to sort your data in "first name" order.
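As a rough sketch, going by the query as posted (which filters Table1 on Col3 and Col4 and returns Col1), such an index might look like the following. The index name is made up, and the INCLUDE clause is an added assumption meant to avoid the Key Lookup; adjust the key columns to whatever your real query filters on.

```sql
-- Hypothetical index name. Key columns are the ones the WHERE clause filters on.
-- Id is not listed: as the clustered index key, it is carried in every
-- non-clustered index automatically.
-- INCLUDE (Col1) is an optional assumption: it makes the index cover the
-- SELECT list, so no Key Lookup is needed to fetch a.Col1.
CREATE NONCLUSTERED INDEX IX_Table1_Col3_Col4
    ON Table1 (Col3 ASC, Col4 ASC)
    INCLUDE (Col1);
```

After creating it, re-check the actual execution plan to confirm whether the optimizer now chooses a seek on the new index.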
