1

Working in a datawarehouse environment (SQL Server 2008) and there are a few tables that have about 2 million rows and 20 columns. Each night, the tables are dropped and re-created. When they are made, indexes are also built. For some reason there is no clustered index on these tables. There are however, unique nonclustered indexes. Seems unlogical. Does anybody know any downsides to changing that all so that all these tables have a clustered index. Should save some space and perform better.

Any tips?

Thanks in advance.

7
  • AFAIK all tables in SQL Server have a clustered index, whether or not you declare one as such. Commented Nov 24, 2017 at 6:32
  • 2
    @TimBiegeleisen: I don't think that is true. If you create the PK as "nonclustered", SQL Server will not create a hidden clustered index. Are you maybe confusing that with MySQL? Commented Nov 24, 2017 at 7:00
  • @a_horse_with_no_name By default SQL Server will create a clustered index in the background (see here). Are you certain that this can be turned off, other than by creating another named index? Commented Nov 24, 2017 at 7:03
  • @TimBiegeleisen create table foo (id integer primary key nonclustered) Commented Nov 24, 2017 at 7:07
  • @a_horse_with_no_name OK but that still creates an index. I think I phrased my first comment wrongly. Commented Nov 24, 2017 at 7:08

1 Answer 1

2

Indeed clustered indexes have drawbacks too.

I think the most under-estimated drawback is what I refer to as clustered index penalty.

If you don't have any clustered index on a table, it means the table is stored as a heap table. All non-clustered indexes refer to that heap table.

The good thing about heap tables is that rows stored in them hardly ever move — unlike clustered indexes, in which every row can move at any time to a different physical location.

This difference affects non-clustered indexes as they refer to the rows in the heap or clustered index: in case of a heap, they can just store the physical location of that row in the non-clustered indexes (as they hardly ever change). If there is a clustered index, the non-clustered indexes store the clustering key.

If you are eventually using a non-clustered index, the effort to reach the actual table in either the heap or clustered index is very different: With a heap, you just need one physical IO, with a clustered index, you need to do a Clustered Index Seek, which is typically 3-5 logical IOs (depends on the table size).

If you have many non-clustered indexes and are using them without index-only scan (that means there follows a RID access), a clustered index may hurt performance considerably.

More details about that in this article I wrote:

http://use-the-index-luke.com/blog/2014-01/unreasonable-defaults-primary-key-clustering-key

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.