Clustered index vs. Non-clustered index including ALL columns (SQL Server)

Question

A brief overview of the scenario:

My database uses GUIDs as primary keys and, from what I've been reading, putting clustered indexes on GUIDs seems to be a bad idea (more fragmentation, slower inserts, etc.). My project uses Hibernate, so we usually work with JPQL and fetch full entities (a lot of queries end up as select p.* from person p [...]).

I would like to know if it would be a good approach to create non-clustered indexes covering all columns of a table (in order to avoid RID lookups, etc.).
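For concreteness, here is a minimal T-SQL sketch of the idea being asked about, using the Person columns the asker lists in a later comment. The data types, the dbo schema, and the index name are assumptions made for illustration; only the column names come from the thread.

```sql
-- Hypothetical table: column names follow the thread, types are assumed.
-- No clustered index is created, so the table is stored as a heap.
CREATE TABLE dbo.Person (
    person_uid UNIQUEIDENTIFIER NOT NULL,
    agency_uid UNIQUEIDENTIFIER NOT NULL,
    foo_uid    UNIQUEIDENTIFIER NULL,
    birth_date DATE             NULL,
    foo        NVARCHAR(100)    NULL,
    bar        NVARCHAR(100)    NULL
);

-- The proposal: a non-clustered index keyed on the GUID, with every
-- other column copied into the leaf level through INCLUDE.
-- A query such as SELECT p.* FROM dbo.Person p WHERE p.person_uid = @id
-- can then be answered from the index alone, without a RID lookup.
CREATE UNIQUE NONCLUSTERED INDEX IX_Person_person_uid_all
    ON dbo.Person (person_uid)
    INCLUDE (agency_uid, foo_uid, birth_date, foo, bar);
```

Note that this layout stores every row twice: once in the heap and once in the index leaf pages.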

Thanks in advance for the help!

There's no sense in adding a non-clustered index on all columns if you aren't going to search by them. If you are only going to search by GUID, just create a non-clustered index on the GUID column.
– mclaassen, Commented Jul 3, 2014 at 19:11
No, no, sorry. Maybe I didn't make myself clear. The point is not to create non-clustered indexes on every column, but to create a non-clustered index on a SINGLE column (one I would use a lot for searches or joins) and include every other column at the leaf level of that index. For instance: I have a table Person with columns (person_uid, agency_uid, foo_uid, birth date, foo, bar). The PK is person_uid. Instead of creating a clustered index on person_uid, I would like to create a non-clustered index and include (agency_uid, foo_uid, birth date, foo, bar).
– Diego Martins, Commented Jul 3, 2014 at 19:22
A non-clustered index (NCI) that includes all columns, on a heap, is worse than simply having a clustered index (CI). The NCI is bigger than the equivalent CI would be, as it also stores the additional RID, plus you have two copies of all the data. The NCI would be just as prone to fragmentation as the CI would be; this isn't an issue encountered only in clustered indexes. If you are only seeking single rows, fragmentation itself won't be much of an issue for you anyway, though you may want to look at fill factor to reduce page splits.
– Martin Smith, Commented Jul 3, 2014 at 19:46
Ooh... So even if disk space weren't an issue, it would still be better to have a clustered index. The thing is: the way the database was designed, the default SQL Server behavior (creating a clustered index on the primary key) was disabled. So the vast majority of our tables don't have a clustered index, and lots of queries do RID lookups. I was searching for a solution to that. I think I'll at least include the PK at the leaf level of every NCI.
– Diego Martins, Commented Jul 3, 2014 at 19:48
Just add an ID INT IDENTITY(1,1) column and make that the primary key and clustered index! That would probably make the most sense, since clustered tables are in general more efficient than heaps for every operation ...
– marc_s, Commented Jul 3, 2014 at 20:45

Dave.Gugg · Accepted Answer · 2014-07-03 19:23:11Z

2

No, it is not a good approach. It sounds like you've already read that putting the clustered index on a GUID is a bad idea. Instead, create an int (or bigint, if necessary) identity column and make that the clustered index, unless another column makes more sense. Then create a nonclustered index on the GUID column and let SQL Server do a lookup into the clustered index (a key lookup, the clustered-table counterpart of a RID lookup) for each query that uses it. This way you can avoid fragmentation and slow inserts/updates/deletes.
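A rough T-SQL sketch of this layout follows. The table reuses the Person columns from the question's comments; the data types, the person_id column name, and the constraint and index names are assumptions for illustration. Keeping the GUID as the (nonclustered) primary key is also an assumption, made here so that application code keyed on the GUID would not need to change.

```sql
-- Assumes dbo.Person does not already exist; types and names are illustrative.
CREATE TABLE dbo.Person (
    person_id  INT IDENTITY(1,1) NOT NULL,  -- narrow, ever-increasing clustering key
    person_uid UNIQUEIDENTIFIER  NOT NULL,
    agency_uid UNIQUEIDENTIFIER  NOT NULL,
    foo_uid    UNIQUEIDENTIFIER  NULL,
    birth_date DATE              NULL,
    foo        NVARCHAR(100)     NULL,
    bar        NVARCHAR(100)     NULL,
    -- The GUID stays the primary key, but is enforced by a nonclustered index.
    CONSTRAINT PK_Person PRIMARY KEY NONCLUSTERED (person_uid)
);

-- Cluster on the identity column so new rows are appended at the end
-- of the table instead of landing on random pages.
CREATE UNIQUE CLUSTERED INDEX CIX_Person_person_id
    ON dbo.Person (person_id);
```

With this layout, a search by person_uid seeks the nonclustered index and then follows the stored person_id into the clustered index to fetch the remaining columns.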


Diego Martins Over a year ago

Dave, imagine that I just can't change the architecture at this level (create another identifier). Let's say I have to stick with UIDs. What else could I do?

Dave.Gugg Over a year ago

Put the clustered index on a non-unique set of fields. Not the best solution, but that may be what you're stuck with.

Diego Martins Over a year ago

Why non-unique? Shouldn't indexes preferably be unique?

Dave.Gugg Over a year ago

Preferably yes, but if you don't have a field or set of fields that will be unique, you may be forced to have non-unique values.

Diego Martins Over a year ago

The problem is that, the way the database was designed, the default SQL Server behavior (creating a clustered index on the primary key) was disabled. So a lot (and I really mean a lot) of queries do RID lookups. The other indexes don't even include the primary key (which they would by default if the PK had a clustered index).


Vulcronos · Accepted Answer · 2014-07-03 19:27:53Z

0

Premature optimization is a bad idea. Are the extra storage and the overhead the index adds to inserts, updates, and deletes worth it? Unless you measure and test performance and the impact of your index, you won't know. Look at the queries that read the table and see which, if any, are unacceptably slow. Then tune those specific queries.
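One possible starting point for that measurement, offered as a sketch rather than a complete methodology, is the sys.dm_db_index_usage_stats dynamic management view, which counts seeks, scans, lookups, and updates per index. For a heap (index_id 0), the user_lookups column counts RID lookups. The counters are reset when the SQL Server instance restarts, so they only reflect activity since then.

```sql
-- Tables and indexes in the current database, ordered by lookup volume.
-- Heaps with a high user_lookups count are the ones paying for RID lookups.
SELECT  OBJECT_NAME(s.object_id) AS table_name,
        i.name                   AS index_name,   -- NULL for a heap
        i.type_desc,                              -- HEAP, CLUSTERED, NONCLUSTERED
        s.user_seeks,
        s.user_scans,
        s.user_lookups,
        s.user_updates
FROM    sys.dm_db_index_usage_stats AS s
JOIN    sys.indexes AS i
        ON  i.object_id = s.object_id
        AND i.index_id  = s.index_id
WHERE   s.database_id = DB_ID()
  AND   OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
ORDER BY s.user_lookups DESC;
```

Comparing these numbers before and after adding an index on a test copy of the data gives a first indication of whether the extra write cost (user_updates) is paid back by fewer lookups.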


Diego Martins Over a year ago

The problem is that, the way the database was designed, the default SQL Server behavior (creating a clustered index on the primary key) was disabled. So a lot (and I really mean a lot) of queries do RID lookups. The other indexes don't even include the primary key (which they would by default if the PK had a clustered index).

Vulcronos Over a year ago

@DiegoMartins I see. Adding an index like you suggested would still double the data stored.

Diego Martins Over a year ago

Yes, that's true. But I've been told that db size is not really an issue. I was thinking about taking this approach only on some of our tables.

Vulcronos Over a year ago

@DiegoMartins I can see that working. RID lookups are only expensive in bulk, so for queries that return a single row it won't be worth it. For queries that return large data sets, I can see a chance of it being helpful, but I would want to test first.

Diego Martins Over a year ago

Yes, sure! I'm not always bringing back lots of results, but the application mainly hits the database many times with the same queries (with different parameters), and these queries do somewhat "unnecessary" RID lookups. So I think it would still be good in this scenario?
