SQL Index Question: Why does SQL Server prefer this NONCLUSTERED index to a CLUSTERED one?

Question

I have the following query:

SELECT
    COUNT(*)
FROM
    FirstTable ft
        INNER JOIN SecondTable st ON ft.STID = st.STID

As you can guess, "STID" is the primary key on "SecondTable"... and "FirstTable" will have a pointer to that second table. Here are the indexes that I have:

FirstTable: NONCLUSTERED INDEX on column "STID"

SecondTable: CLUSTERED PRIMARY KEY INDEX on "STID"

The query above gives me a subtree cost of 19.90 and takes 2 seconds.

After running the Database Engine Tuning Advisor for that query, it suggested creating the very same index that I had on the second table... but non-clustered. So I tried it, with these results:

FirstTable: NONCLUSTERED INDEX on column "STID"

SecondTable: NONCLUSTERED INDEX on "STID"

Now, the query above gives me a subtree cost of 10.97 and takes <1 second!

This 100% shatters my brain... Why would a NONCLUSTERED index perform faster than a CLUSTERED index in this scenario?

Quassnoi · Accepted Answer · 2010-03-03 12:13:41Z

7

Because your query does not retrieve any actual records from the tables, it just counts.

With the non-clustered indexes, it just joins two indexes (which are smaller in size than the tables), most probably using a MERGE JOIN.

With a clustered index, it has to join the table and the non-clustered index. The table is larger and it takes more time to traverse it.
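
One way to check the size difference yourself is to compare how many pages each index on the table occupies. This is only a sketch: it assumes the table lives in the dbo schema and that you have permission to read the dynamic management views (VIEW DATABASE STATE).

```sql
-- Pages used by each index on SecondTable.
-- Assumption: the table is in the dbo schema; adjust the name as needed.
SELECT  i.name AS index_name,
        i.type_desc,                          -- CLUSTERED or NONCLUSTERED
        ps.row_count,
        ps.used_page_count,
        ps.used_page_count * 8 AS used_kb     -- SQL Server pages are 8 KB
FROM    sys.dm_db_partition_stats ps
JOIN    sys.indexes i
ON      i.object_id = ps.object_id
        AND i.index_id = ps.index_id
WHERE   ps.object_id = OBJECT_ID(N'dbo.SecondTable')
ORDER BY
        ps.used_page_count DESC;
```

If the table has more than a few narrow columns, the CLUSTERED row should show noticeably more pages than the NONCLUSTERED one for the same row count, which is the extra data the count has to read through.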

If you issue a query like this:

SELECT  SUM(first_table_field + second_table_field)
FROM    FirstTable ft
INNER JOIN
        SecondTable st
ON      ft.STID = st.STID

which retrieves actual values, you will see the benefits of clustering.
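
To compare the two index setups on this kind of query, you could turn on I/O statistics and look at the logical reads reported per table. This is a sketch only: first_table_field and second_table_field are the placeholder column names from the query above, not real columns.

```sql
-- Report logical/physical reads per table for each statement that follows.
SET STATISTICS IO ON;

-- first_table_field and second_table_field are placeholders;
-- substitute real columns from your tables.
SELECT  SUM(first_table_field + second_table_field)
FROM    FirstTable ft
INNER JOIN
        SecondTable st
ON      ft.STID = st.STID;

SET STATISTICS IO OFF;
```

Run it once under each index configuration and compare the "logical reads" figures in the Messages output rather than relying on the estimated subtree cost alone.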

8 Comments

Timothy Khouri Over a year ago

That was fast! - That is exactly the clear, logical answer I was looking for. - Of course this isn't my real query, but it does help my real query (which wasn't using columns from that table, but was just using it to inner join / filter) - Thx!

boggy Over a year ago

@Quassnoi: sorry, this answer doesn't make sense to me. You said: "with a clustered index, it has to join the table". A clustered index is an index. The original query doesn't really retrieve any columns, so all it has to do is match the key values, i.e. it needs to scan the index pages only. If you look at these pictures, technet.microsoft.com/en-us/library/… and technet.microsoft.com/en-us/library/…, the difference is that in the clustered index you have data in the leaf nodes, but the query doesn't need to go to the leaf level.

boggy Over a year ago

@quassnoi: To continue... unless using * does something to the query optimizer and it would be better to use count(1) instead. My experience with SQL Server, especially 2008 and 2008 R2, is that sometimes it does these brain-dead things that you'd never expect.

Quassnoi Over a year ago

@costa: the clustered index is the table itself. The query does need to go to the leaf level, in both cases. COUNT(*) and COUNT(1) don't make any difference in SQL Server.

boggy Over a year ago

@Quassnoi: ok, I don't know the internals of the clustered index implementation, but I know there are models of b-trees where the keys are also saved in the intermediate nodes, so to find a key you don't necessarily need to go to the lowest level in the index. My point was that SQL Server has to traverse only the index portion of the clustered index. But the clustered index is much bigger (it contains the data), so even when you scan the index, there might be a lot of jumps between pages, while the other index is more compact.
