
I have a five-year data set partitioned into quarterly tables. I also have a master view that unions them all together. When users need more than one quarter of data, they often query the master view instead of combining the individual quarterly tables themselves.

My question is: would a table-valued function that accepts a date range and returns only the records from the necessary partitions be faster than querying the entire master view?

This is my current view definition:

ALTER VIEW [dbo].[loandetails_test]
AS
SELECT     *
FROM         loandetails05
where year(date) = 2005
UNION
SELECT     *
FROM         loandetails06
where year(date) = 2006
UNION
SELECT     *
FROM         loandetails07
where year(date) = 2007
UNION
SELECT     *
FROM         loandetails08
where year(date) = 2008
UNION
SELECT     *
FROM         loandetails1q09
where date >= '1/1/2009' and date < '4/1/2009'
UNION
SELECT     *
FROM         loandetails2q09
where date >= '4/1/2009' and date < '7/1/2009'
UNION
SELECT     *
FROM         loandetails3q09
where date >= '7/1/2009' and date < '10/1/2009'
UNION
SELECT     *
FROM         loandetails4q09
where date >= '10/1/2009' and date < '1/1/2010'
UNION
SELECT     *
FROM         loandetails1q10
where date >= '1/1/2010' and date < '4/1/2010'
UNION
SELECT     *
FROM         loandetails2q10
where date >= '4/1/2010' and date < '7/1/2010'
UNION
SELECT     *
FROM         loandetails3q10
where date >= '7/1/2010' and date < '10/1/2010'
UNION
SELECT     *
FROM         loandetails4q10
where date >= '10/1/2010' and date < '1/1/2011'
union
SELECT     *
FROM         loandetails_CURRENT
where date >= '1/1/2011' and date < '4/1/2011'


GO
  • Possibly. It would of course depend greatly on how the table-valued function was written. However, the only way to know is to measure performance both ways: once using the master view and once using the table-valued function. Commented Feb 10, 2011 at 18:47

1 Answer

The answer should be a solid no.

Partitions are set up with implicit criteria, so if you are partitioning by date (quarter), SQL Server already knows which partitions can satisfy the query (assuming the query has a date filter). Check the execution plan, which should confirm a stream-merge between only the two (or however many are involved) partitions.

I have a case where tables from N databases (yes, one per silo) are combined in a master view, like you have. The master view tags each one with a constant; specifically, it looks like this:

select source=1, col1, col2, col3
from db1.dbo.tbl
union all
select source=2, col1, col2, col3
from db2.dbo.tbl
etc

For any query that filters on where source in (2,3), the optimizer automatically recognizes that only two databases need to be searched, and the execution plan reveals as much.
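A self-contained sketch of the same idea, assuming three hypothetical tables in a single database (silo1, silo2, silo3) and a hypothetical view name AllSilos; none of these names come from the answer. Each branch carries a constant source value, so a filter on source can rule out whole branches without reading them.

```sql
-- Hypothetical stand-ins for db1.dbo.tbl, db2.dbo.tbl and db3.dbo.tbl.
create table dbo.silo1 (col1 int, col2 int, col3 int)
create table dbo.silo2 (col1 int, col2 int, col3 int)
create table dbo.silo3 (col1 int, col2 int, col3 int)
GO

-- Hypothetical master view: one branch per silo, tagged with a constant.
create view dbo.AllSilos
as
select source=1, col1, col2, col3 from dbo.silo1
union all
select source=2, col1, col2, col3 from dbo.silo2
union all
select source=3, col1, col2, col3 from dbo.silo3
GO

-- Only branches whose constant matches the filter can return rows,
-- so the plan is expected to touch silo2 and silo3 but not silo1.
select col1, col2, col3
from dbo.AllSilos
where source in (2,3)
```

Whether the unused branch disappears from the plan entirely or shows up at 0% cost may depend on the SQL Server version, so it is worth checking the plan on your own instance.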

If you created the date partitions manually, you can:

  1. have an index on the date range, within each table
  2. force the optimizer to recognize the partitioning
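For the manually partitioned tables in the question, those two points could look roughly like the sketch below. The index and constraint names are invented for illustration, and it assumes the tables live in the dbo schema and that the partitioning column is named date, as in the question's view. A trusted CHECK constraint on the partitioning column is one common way to tell SQL Server which date range each member table can hold; the example further down uses filters inside the view instead.

```sql
-- 1. Index on the date column of one member table (repeat for each table).
--    IX_loandetails1q09_date is a hypothetical name.
create index IX_loandetails1q09_date
on dbo.loandetails1q09 ([date])
GO

-- 2. Declare the range this table is allowed to contain.
--    WITH CHECK validates existing rows, so the statement fails if any
--    row falls outside the range, and the constraint is trusted afterwards.
--    CK_loandetails1q09_date is a hypothetical name.
alter table dbo.loandetails1q09 with check
add constraint CK_loandetails1q09_date
check ([date] >= '20090101' and [date] < '20090401')
GO
```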

Here is a working example (even without indexes). Notice that Q1 and Q4 do not even show up in the plan. Disclosure: tested on SQL Server 2008 R2 Express.

select dateadd(d, number, '20100101') TheDate, *
into Q1data
from master..spt_values
where type='p' and number between 1 and 370
and datepart(quarter, dateadd(d, number, '20100101')) =1

select dateadd(d, number, '20100101') TheDate, *
into Q2data
from master..spt_values
where type='p' and number between 1 and 370
and datepart(quarter, dateadd(d, number, '20100101')) =2

select dateadd(d, number, '20100101') TheDate, *
into Q3data
from master..spt_values
where type='p' and number between 1 and 370
and datepart(quarter, dateadd(d, number, '20100101')) =3

select dateadd(d, number, '20100101') TheDate, *
into Q4data
from master..spt_values
where type='p' and number between 1 and 370
and datepart(quarter, dateadd(d, number, '20100101')) =4
GO

create view Ydata
with schemabinding
as
select TheDate, name, number, TYPE, LOW, high, status
from dbo.Q1Data where TheDate >= '20100101' and TheDate < '20100401'
union all
select TheDate, name, number, TYPE, LOW, high, status
from dbo.Q2Data where TheDate >= '20100401' and TheDate < '20100701'
union all
select TheDate, name, number, TYPE, LOW, high, status
from dbo.Q3Data where TheDate >= '20100701' and TheDate < '20101001'
union all
select TheDate, name, number, TYPE, LOW, high, status
from dbo.Q4Data where TheDate >= '20101001' and TheDate < '20110101'
GO

select * from YData where TheDate between '20100601' and '20100831'

Query Plan

[Image: query plan screenshot]
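If the graphical plan is not at hand, the plan can also be inspected as text. A minimal sketch, reusing the YData view from the example; SET SHOWPLAN_TEXT has to be the only statement in its batch, hence the GO separators.

```sql
set showplan_text on
GO

-- While SHOWPLAN_TEXT is on, the statement is not executed;
-- SQL Server returns its estimated plan as rows of text instead.
select * from YData where TheDate between '20100601' and '20100831'
GO

set showplan_text off
GO
```

In the returned text, look for which of Q1data through Q4data are named in scan or seek operators.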

Re: Updated question

When dealing with date ranges, NEVER (with few exceptions) use a function on the date column. Doing so requires the function to be run against ALL records in the table before the result can be compared to the other side, so an index on the column cannot be used to seek.

where year(date) = 2005
===> means scan the table, for each row take the year, compare to 2005

Better to write as

where date >= '20050101' and date < '20060101'
===> means given a date range, use the index to seek the range
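Applied to the yearly tables in the question's view, the rewrite could look like the sketch below. The view name loandetails_yearly_test is hypothetical (chosen so the existing loandetails_test is left untouched), and the sketch assumes the tables are in the dbo schema and share the same column layout, as the question's SELECT * implies.

```sql
-- Hypothetical view covering only the four yearly tables.
create view dbo.loandetails_yearly_test
as
select * from dbo.loandetails05
where [date] >= '20050101' and [date] < '20060101'
union all
select * from dbo.loandetails06
where [date] >= '20060101' and [date] < '20070101'
union all
select * from dbo.loandetails07
where [date] >= '20070101' and [date] < '20080101'
union all
select * from dbo.loandetails08
where [date] >= '20080101' and [date] < '20090101'
GO

-- Example query: the range overlaps only the 2006 and 2007 branches.
select *
from dbo.loandetails_yearly_test
where [date] >= '20060601' and [date] < '20070301'
```

UNION ALL is used here, as in the answer's own examples. Unlike UNION, it does not remove duplicate rows, so it only returns the same result as the question's view if the member tables contain no duplicates.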

6 Comments

My query plan shows index seeks on partitions that are not necessary for the date range in the query. I should specify that these are not partitions created by SQL Server, as I do not have Enterprise Edition. I created them manually.
I may have misunderstood your response, but I tried building a view using the format you gave. I specified a date range for each select in the view definition, and the query plan still shows seeks on out of range partitions. The cost is 0%, but it's still looking at them. There are also date indexes on all partitions.
See my edit with the view definition. When I view the execution plan for select * from loandetails_test where date between '20090501' and '20101101' I still see all partitions 2005 - current being referenced. Could this be due to my (very old) SQL 2000 platform?
@Colin - see the end of the updated answer. It could be 2000 (which didn't support the concept of partitioning), but even if it does consider the other partitions, correctly set-up date filters will rule them out very quickly when an index is used.
Yes, I imagined the 0% cost meant it was barely looking at the other partitions. As for the note about not using functions, that's good to know. Fortunately this was just a test case I built, the production version still has no date filters, and also shows a 0% cost on out of range partitions. Is it still worth it to update the production view with date ranges?