A brief overview of the scenario:

My database uses GUIDs as primary keys and, from what I've been reading, it seems somewhat bad to have clustered indexes on GUIDs (it increases fragmentation, slows down inserts, etc.). My project uses Hibernate, so we usually deal with JPQL and fetch full entities (a lot of queries end up turning into select p.* from person p [...]).

I would like to know if it would be a good approach to create non-clustered indexes covering all columns of a table (in order to avoid RID lookups, etc.).

Thanks in advance for the help!

  • No sense in adding a non-clustered index on all columns if you aren't going to be searching by them. If you are just going to be searching by GUID, then just make a non-clustered index on the GUID column. Commented Jul 3, 2014 at 19:11
  • No, no, sorry. Maybe I didn't make myself clear. The point is not to create non-clustered indexes on every column, but to create a non-clustered index on a SINGLE column (one I would use a lot for searches or joins) and include every other column at the leaf level of this index. For instance: I have a table Person with columns like (person_uid, agency_uid, foo_uid, birth date, foo, bar). The PK is person_uid. Instead of creating a clustered index on person_uid, I would like to create a non-clustered one and include (agency_uid, foo_uid, birth date, foo, bar). Commented Jul 3, 2014 at 19:22
  • A non-clustered index including all columns, on a heap, is worse than simply having a clustered index. The NCI is bigger than the equivalent CI would be, as it also stores the additional RID, plus you have two copies of all the data. The NCI would be just as prone to fragmentation as the CI would be; this isn't an issue encountered only in clustered indexes. If you are only seeking single rows, fragmentation itself won't be much of an issue for you anyway, though you may want to look at fill factor to reduce page splits. Commented Jul 3, 2014 at 19:46
  • Ooh... So even if disk space weren't an issue, it would still be better to have a clustered index. The thing is: the way the database was designed, the default SQL Server behavior (creating a clustered index on the primary key) was disabled. So the vast majority of our tables don't have a clustered index, and lots of queries do RID lookups. I was searching for a solution to that. I think I'll at least include the PK at the leaf level of every NCI. Commented Jul 3, 2014 at 19:48
  • Just add an ID INT IDENTITY(1,1) column and make that the primary key and clustered index! That would probably make the most sense, since clustered tables are generally more efficient than heaps for most operations ... Commented Jul 3, 2014 at 20:45
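
To make the two options from the comments concrete, here is a minimal T-SQL sketch. The column names come from the comment above; the data types, the index and constraint names, and the fill factor of 80 are assumptions chosen only for illustration, not values recommended by anyone in the thread.

```sql
-- Hypothetical table: column names are from the thread, data types are assumed.
-- The nonclustered primary key leaves the table stored as a heap,
-- which matches the design described in the question.
CREATE TABLE dbo.Person (
    person_uid UNIQUEIDENTIFIER NOT NULL,
    agency_uid UNIQUEIDENTIFIER NOT NULL,
    foo_uid    UNIQUEIDENTIFIER NULL,
    birth_date DATE             NULL,
    foo        NVARCHAR(100)    NULL,
    bar        NVARCHAR(100)    NULL,
    CONSTRAINT PK_Person PRIMARY KEY NONCLUSTERED (person_uid)
);

-- Option A (the idea in the question): a nonclustered index keyed on the
-- search column that carries every other column at the leaf level.
-- On a heap the primary key is not added automatically, so it is listed
-- explicitly. This stores a second copy of every row.
CREATE NONCLUSTERED INDEX IX_Person_agency_uid_covering
    ON dbo.Person (agency_uid)
    INCLUDE (person_uid, foo_uid, birth_date, foo, bar);

-- Option B (the commenters' suggestion): make the primary key clustered
-- and leave free space on each page to reduce page splits.
-- FILLFACTOR = 80 is an arbitrary illustrative value.
-- Dropping the constraint fails if foreign keys reference it.
ALTER TABLE dbo.Person DROP CONSTRAINT PK_Person;
ALTER TABLE dbo.Person
    ADD CONSTRAINT PK_Person PRIMARY KEY CLUSTERED (person_uid)
    WITH (FILLFACTOR = 80);
```

With option B in place, the covering index from option A is usually no longer needed for this purpose: every nonclustered index then carries the clustering key, and lookups go to the clustered index rather than to a heap.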

2 Answers

No, it is not a good approach. It sounds like you've already read that having the clustered index on a GUID is a bad idea. Instead, create an int (or bigint, if necessary) identity column and make that the clustered index, unless another column makes more sense. Then just create a nonclustered index on the GUID column, and let SQL Server do a key lookup for each query that uses it (with a clustered index in place, the lookup is a key lookup rather than an RID lookup). This way you can avoid fragmentation and slow inserts/updates/deletes.
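
A minimal sketch of that layout, assuming a hypothetical table name, surrogate column name (person_id), and data types; none of these come from the original schema:

```sql
-- Hypothetical table illustrating the answer's layout.
-- person_id is an assumed surrogate column used only for physical ordering;
-- person_uid stays the primary key the application sees.
CREATE TABLE dbo.PersonWithIdentity (
    person_id  INT IDENTITY(1,1) NOT NULL,
    person_uid UNIQUEIDENTIFIER  NOT NULL,
    agency_uid UNIQUEIDENTIFIER  NOT NULL,
    birth_date DATE              NULL,
    CONSTRAINT PK_PersonWithIdentity PRIMARY KEY NONCLUSTERED (person_uid)
);

-- Cluster on the ever-increasing identity value so new rows are
-- appended at the end of the index instead of landing on random pages.
CREATE UNIQUE CLUSTERED INDEX CIX_PersonWithIdentity_person_id
    ON dbo.PersonWithIdentity (person_id);

-- A search by GUID seeks the nonclustered primary key index, then uses
-- the stored person_id to fetch the remaining columns from the
-- clustered index.
DECLARE @uid UNIQUEIDENTIFIER = NEWID();  -- placeholder value
SELECT p.*
FROM dbo.PersonWithIdentity AS p
WHERE p.person_uid = @uid;
```

The primary key constraint stays on the GUID, so the mapping layer does not have to change; only the physical ordering of the table does.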

15 Comments

Dave, imagine that I just can't change the architecture at this level (create another identifier). Let's say I have to stick with GUIDs. What else could I do?
Put the clustered index on a non-unique set of fields. Not the best solution, but that may be what you're stuck with.
Why a non-unique set? Shouldn't indexes preferably be unique?
Preferably yes, but if you don't have a field or set of fields that will be unique, you may be forced to have non-unique values.
The problem is, the way the database was designed, the default SQL Server behavior (creating a clustered index on the primary key) was disabled. So a lot (and I really mean a lot) of queries do RID lookups. The other indexes don't even include the primary key (which they would by default if the PK had a clustered index).

Premature optimization is a bad idea. Is the index worth the extra data size and the extra work it adds to inserts, updates, and deletes? Unless you measure and test performance and the impact of your index, you won't know. Look at the queries that read the table and see which, if any, are unacceptably slow. Then tune those specific queries.
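
As one possible starting point for that measurement, the following sketch reads SQL Server's index statistics views for a single table. The table name dbo.Person is an assumption taken from the question's example, and the numbers only mean something under a representative workload.

```sql
-- How often each index (or the heap, index_id = 0) is used.
-- user_lookups on the heap or clustered index counts bookmark lookups,
-- i.e. the RID or key lookups discussed in this thread.
-- Counters reset when the SQL Server instance restarts.
SELECT i.name AS index_name,
       i.type_desc,
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates
FROM sys.dm_db_index_usage_stats AS us
JOIN sys.indexes AS i
  ON i.object_id = us.object_id
 AND i.index_id  = us.index_id
WHERE us.database_id = DB_ID()
  AND us.object_id   = OBJECT_ID(N'dbo.Person');

-- Size and fragmentation per index. SAMPLED mode is used because
-- forwarded_record_count is not populated in the default LIMITED mode.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.page_count,
       ps.avg_fragmentation_in_percent,
       ps.forwarded_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Person'), NULL, NULL, 'SAMPLED') AS ps
LEFT JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id;
```

Running both queries before and after adding an index gives a rough view of what it costs (user_updates, page_count) against what it saves (user_lookups).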

7 Comments

The problem is, the way the database was designed, the default SQL Server behavior (creating a clustered index on the primary key) was disabled. So a lot (and I really mean a lot) of queries do RID lookups. The other indexes don't even include the primary key (which they would by default if the PK had a clustered index).
@DiegoMartins I see. Adding an index like you suggested would still double the data stored.
Yes, that's true. But I've been told that db size is not really an issue. I was thinking about taking this approach only on some of our tables.
@DiegoMartins I can see that working. RID lookups are only expensive in bulk, so for queries that return a single row it won't be worth it. For queries that return large data sets, I can see a chance of it being helpful, but I would want to test first.
Yes, sure! I'm not always returning lots of results; mainly, the application hits the database many times with the same queries (only the parameters change). And these queries do somewhat "unnecessary" RID lookups. So I think it would still be good in this scenario?