1

I'm generating [Dim_Calendar] table for the data warehouse. I've developed a query below that takes 29 seconds to execute & inserts 27k rows. I would like to optimize it if it is possible. I do understand that while loop is not helping performance and I do not know how to replace it to achieve the same results.

I'm using SQL Server 2012 BI edition.

IF EXISTS(SELECT * FROM sys.indexes WHERE name='PK_Dim_Calendar_1' AND object_id = OBJECT_ID('Dim_Calendar'))
BEGIN
    ALTER TABLE [dbo].[Dim_Calendar] DROP CONSTRAINT [PK_Dim_Calendar_1]
END

SET DATEFIRST 1--Sets Monday as 1st day of the week.
DECLARE @today DATETIME = ( SELECT  GETDATE())
DECLARE @start DATETIME = DATEADD(dd, 1, (SELECT Max(date) FROM Dim_Calendar))

IF @start IS NULL
BEGIN
    INSERT INTO [dbo].[Dim_Calendar]
    VALUES (19000101, '1900-01-01', 'Monday',   1   ,'Unknown', 1, 'January', 1900, 1)
    SET @start = '1940-01-01'
END


DECLARE @end DATETIME   = (SELECT DATEFROMPARTS(YEAR(@today), 12, 31))


WHILE @start <= @end
    BEGIN

        INSERT INTO [dbo].[Dim_Calendar]
            SELECT
                YEAR(@start) * 10000 + MONTH(@start) * 100 + DAY(@start)
                ,@start
                ,DATENAME(dw, @start)
                ,DATEPART(wk, @start)
                ,'w/c ' + CONVERT(char(8), DATEADD(dd, 1 - DATEPART(dw, @start), @start), 3)
                ,DATEPART(mm, @start)
                ,DATENAME(MONTH, @start)
                ,YEAR(@start)
                ,DATEPART(QQ, @start)
        SET @start = DATEADD(dd, 1, @start)
    END


ALTER TABLE [dbo].[Dim_Calendar] ADD  CONSTRAINT [PK_Dim_Calendar_1] PRIMARY KEY CLUSTERED 
(
    [FullDateID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

3 Answers 3

2

The query is slow because you are using a loop to create values one at a time.

When dealing with value sequences it's useful to have a 'Numbers' table with numbers from 1 to however many items you want. By selecting or joining with that table you can generate sequences, identify gaps etc. Aaron Bertrand wrote a series of articles on how to create a Numbers table and use it to create a set of dates.

Assuming you already have such a Numbers(n), creating a Calendar table is as simple as:

DECLARE @start DATE = '2005-07-01'; 
DECLARE @end DATE = DATEADD(DAY, -1, DATEADD(YEAR, 30, @start));

DECLARE @days int = DATEDIFF(DAY, @start, @end) + 1

SELECT TOP (@days) 
  d = CONVERT(DATE, DATEADD(DAY, n-1, @start))
INTO dbo.Calendar
FROM dbo.Numbers ORDER BY n;

In your case the SELECT part would be something like:

;WITH Dates (d)
AS (
    SELECT TOP (@days) 
        d = CONVERT(DATE, DATEADD(DAY, n-1, @start))
    FROM dbo.Numbers 
    ORDER BY d)
 SELECT
    YEAR(d) * 10000 + MONTH(d) * 100 + DAY(d)
    ,d
    ,DATENAME(dw, d)    ,DATEPART(wk, d)
    ,'w/c ' + CONVERT(char(8), DATEADD(dd, 1 - DATEPART(dw, d), d), 3)
    ,DATEPART(mm, d)
    ,DATENAME(MONTH, d)
    ,YEAR(d)
    ,DATEPART(QQ, d)
from Dates

To generate the Numbers table you can use the following statement:

SELECT TOP (1000000) n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO dbo.Numbers
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);

CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers(n)
Sign up to request clarification or add additional context in comments.

Comments

2

This should work for you (quick too):

;WITH [dates] 
AS 
   (SELECT @start AS [date] 
    UNION ALL 
    SELECT DATEADD(d, 1, [date]) AS [date] 
    FROM [dates] 
    WHERE [date] < @end) 

SELECT 
      YEAR([date]) * 10000 + MONTH([date]) * 100 + DAY([date])
                ,[date]
                ,DATENAME(dw, [date])
                ,DATEPART(wk, [date])
                ,'w/c ' + CONVERT(char(8), DATEADD(dd, 1 - DATEPART(dw, [date]), [date]), 3)
                ,DATEPART(mm, [date])
                ,DATENAME(MONTH, [date])
                ,YEAR([date])
                ,DATEPART(QQ, [date]) 
FROM [dates] 
OPTION (MAXRECURSION 32747);

2 Comments

This is 14 times slower than the Numbers table for small numbers, 35 times slower for large numbers, according to Aaron Bertrand's benchmarks. You've replace the loop with recursion but you still produce one value at a time without benefit of an index
@PanagiotisKanavos The OP asked for a way to speed up his inserts, this may not be the fastest approach but on my dev box it still only took 643ms to complete which is significantly faster than his 29s.
0
Try this

IF EXISTS(SELECT *
          FROM   sys.indexes
          WHERE  name = 'PK_Dim_Calendar_1'
                 AND object_id = OBJECT_ID('Dim_Calendar'))
  BEGIN
      ALTER TABLE [dbo].[Dim_Calendar]
        DROP CONSTRAINT [PK_Dim_Calendar_1]
  END

SET DATEFIRST 1--Sets Monday as 1st day of the week.
DECLARE @today DATETIME = (SELECT GETDATE())
DECLARE @start DATETIME = DATEADD(dd, 1, (SELECT Max(date)
                  FROM   Dim_Calendar))

IF @start IS NULL
  BEGIN
      INSERT INTO [dbo].[Dim_Calendar]
      VALUES      (19000101,
                   '1900-01-01',
                   'Monday',
                   1,
                   'Unknown',
                   1,
                   'January',
                   1900,
                   1)

      SET @start = '1940-01-01'
  END

DECLARE @end DATETIME = (SELECT DATEFROMPARTS(YEAR(@today), 12, 31));

WITH T(start)
     AS (SELECT @start
         UNION ALL
         SELECT start + 1
         FROM   T
         WHERE  T.start < @end)
INSERT INTO [dbo].[Dim_Calendar]
SELECT YEAR(start) * 10000 + MONTH(start) * 100 + DAY(start),
       start,
       DATENAME(dw, start),
       DATEPART(wk, start),
       'w/c '
       + CONVERT(CHAR(8), DATEADD(dd, 1 - DATEPART(dw, start), start), 3),
       DATEPART(mm, start),
       DATENAME(MONTH, start),
       YEAR(start),
       DATEPART(QQ, start)
FROM   T
OPTION (MAXRECURSION 0);

ALTER TABLE [dbo].[Dim_Calendar]
  ADD CONSTRAINT [PK_Dim_Calendar_1] PRIMARY KEY CLUSTERED ( [FullDateID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

GO 

3 Comments

Why and how this answer helps the OP to achieve the goal? Please explain your answer not only for the OP, but for further reades too. Thanks. For more info please read How to Answer
Our ultimate aim is to remove the while loop. If we look at the actual code the @start value has been incremented even that is the crucial part in this code. We can do this with the help of a CTE instead of while loop to speed up the insert statement.
That's the same recursive CTE posted by Steve Ford.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.