Given the following SQL Server table:
- Employee (ssn, name, dept, manager, salary)
where ssn is the primary key.
Suppose there are 30 employee records per disk block. Each employee belongs to one of the departments. Explain why you should or shouldn't put a non-clustering index on dept to speed up this query in the following two cases:
SELECT ssn
FROM Employee
WHERE dept = 'IT'
- when there are 50 departments
- when there are 5000 departments
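To put rough numbers on the two cases, here is a minimal sketch that estimates what fraction of disk blocks would hold at least one 'IT' row. It rests on two assumptions that the exercise does not state: employees are spread uniformly across departments, and which block a row lands in is independent of its dept. The function name and the constant are made up for illustration.

```python
# Assumptions (not given in the exercise): rows are spread uniformly across
# departments, and a row's block placement is independent of its dept.
ROWS_PER_BLOCK = 30  # given in the exercise


def fraction_of_blocks_with_match(num_departments: int,
                                  rows_per_block: int = ROWS_PER_BLOCK) -> float:
    """Probability that one block holds at least one row of a given department."""
    p_row_matches = 1.0 / num_departments
    # A block has no match only if all of its rows belong to other departments.
    return 1.0 - (1.0 - p_row_matches) ** rows_per_block


for departments in (50, 5000):
    frac = fraction_of_blocks_with_match(departments)
    print(f"{departments} departments: about {frac:.1%} of blocks hold an 'IT' row")
```

Under these assumptions the estimate comes out to roughly 45% of blocks for 50 departments and about 0.6% for 5000 departments. This only counts the table blocks that contain matching rows; it says nothing about the cost of reading the index itself or about whether the index alone could answer the query.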
My basic understanding of clustered vs. non-clustered indexes in SQL Server is that a clustered index should be used when a query returns a large number of rows, because the table rows themselves are stored in the order of the index key. Therefore, I believe that in the second scenario, with 5000 departments, you should not put a non-clustering index on dept to speed up the query.
I am confused about the first scenario: with only 50 departments, does it really matter whether a non-clustering or a clustering index is used? The only reason I can think of is that a clustering index might take extra time to sort the data first, while a non-clustering index does not.
Which type of index, clustering or non-clustering, should I use in each of these two cases?