I have the following query:

SELECT
    COUNT(*)
FROM
    FirstTable ft
        INNER JOIN SecondTable st ON ft.STID = st.STID

As you can guess, "STID" is the primary key on "SecondTable"... and "FirstTable" will have a pointer to that second table. Here are the indexes that I have:

FirstTable: NONCLUSTERED INDEX on column "STID"

SecondTable: CLUSTERED PRIMARY KEY INDEX on "STID"

The query above gives me a subtree cost of 19.90 and takes 2 seconds.

After running the Database Tuning Advisor on that query, it suggested creating the very same index that I already had on the second table, but non-clustered. So I tried it, with these results.

FirstTable: NONCLUSTERED INDEX on column "STID"

SecondTable: NONCLUSTERED INDEX on "STID"
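
For reference, the two configurations described above might be set up roughly as follows. This is only a sketch: the `dbo` schema and the index and constraint names (`IX_FirstTable_STID`, `PK_SecondTable`, `IX_SecondTable_STID`) are made up for illustration, and it assumes the non-clustered index was added alongside the existing clustered primary key rather than replacing it, which the question does not state.

```sql
-- Hypothetical names throughout; only the table and column names come from the question.

-- Original setup: non-clustered index on the referencing column
CREATE NONCLUSTERED INDEX IX_FirstTable_STID
    ON dbo.FirstTable (STID);

-- Original setup: clustered primary key on the referenced table
ALTER TABLE dbo.SecondTable
    ADD CONSTRAINT PK_SecondTable PRIMARY KEY CLUSTERED (STID);

-- Tuning advisor suggestion: same key column, but non-clustered
CREATE NONCLUSTERED INDEX IX_SecondTable_STID
    ON dbo.SecondTable (STID);
```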

Now, the query above gives me a subtree cost of 10.97 and takes <1 second!

This 100% shatters my brain... Why would a NONCLUSTERED index perform faster than a CLUSTERED index in this scenario?

1 Answer

Because your query does not retrieve any actual records from the tables, it just counts.

With the non-clustered indexes, it just joins two indexes (which are smaller than the tables), most probably using a MERGE JOIN.

With a clustered index, it has to join the table itself and the non-clustered index. The table is larger, so it takes more time to traverse.
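
One way to check the size argument on your own data is to compare how many pages each index on SecondTable occupies, and how many pages the query actually reads. The sketch below assumes the tables live in the `dbo` schema (not stated in the question); `sys.dm_db_partition_stats`, `sys.indexes` and `SET STATISTICS IO` are standard SQL Server features.

```sql
-- Pages used by each index on SecondTable (the clustered index is the table itself).
-- Assumes the dbo schema.
SELECT  i.name                  AS index_name,
        i.type_desc             AS index_type,
        SUM(ps.used_page_count) AS used_pages
FROM    sys.dm_db_partition_stats AS ps
INNER JOIN
        sys.indexes AS i
ON      i.object_id = ps.object_id
        AND i.index_id = ps.index_id
WHERE   ps.object_id = OBJECT_ID(N'dbo.SecondTable')
GROUP BY
        i.name, i.type_desc;

-- Report logical reads per table for the original query.
SET STATISTICS IO ON;

SELECT  COUNT(*)
FROM    FirstTable ft
INNER JOIN
        SecondTable st
ON      ft.STID = st.STID;

SET STATISTICS IO OFF;
```

If the explanation above holds, the non-clustered index should show noticeably fewer used pages than the clustered one, and the logical reads reported for SecondTable should drop once the optimizer picks it.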

If you issue a query like this:

SELECT  SUM(first_table_field + second_table_field)
FROM    FirstTable ft
INNER JOIN
        SecondTable st
ON      ft.STID = st.STID

which retrieves actual values, you will see the benefits of clustering.

8 Comments

That was fast! - That is exactly the clear, logical answer I was looking for. - Of course this isn't my real query, but it does help my real query (which wasn't using columns from that table, but was just using it to inner join / filter) - Thx!
@Quassnoi: sorry, this answer doesn't make sense to me. You said: "with a clustered index, it has to join the table". A clustered index is an index. The original query doesn't really retrieve any column, so all it has to do is match the key values, i.e. it needs to scan the index pages only. If you look at these pictures, technet.microsoft.com/en-us/library/… and technet.microsoft.com/en-us/library/…, the difference is that in the clustered index you have data in the leaf nodes, but the query doesn't need to go to the leaf level.
@quassnoi: To continue... unless using * does something to the query optimizer and it would be better to use count(1) instead. My experience with SQL Server, especially 2008 and 2008 R2, is that sometimes it does these brain-dead things that you'd never expect.
@costa: the clustered index is the table itself. The query does need to go to the leaf level, in both cases. COUNT(*) and COUNT(1) don't make any difference in SQL Server.
@Quassnoi: ok, I don't know the internals of the clustered index implementation, but I know there are models of b-trees where the keys are also saved in the intermediate nodes, so to find a key you don't necessarily need to go to the lowest level in the index. My point was that SQL Server has to traverse only the index portion of the clustered index. But the clustered index is much bigger (it contains the data), so even when you scan the index, there might be a lot of jumps between pages, while the other index is more compact.