
I have a table that only shows weekday values, because it is populated from a file that is imported only on weekdays. I also need to add the weekends (and holidays), carrying forward the value from the last known day. I asked this question before, when I needed it for MS Access. I'm now moving this database to SQL Server.

If you're wanting to see what worked for me in Access, you're more than welcome to check out the link.

I have attempted to adapt the MS Access SQL to SQL Server with:

SELECT a1.IDNbr, a1.Balance, CONVERT(int, DAY(a1.BalDate)) + 3

FROM tblID a1 INNER JOIN tblID a2 ON (CONVERT(int, DAY(a1.BalDate)) + 4 = a2.BalDate) AND (a1.IDNbr = a2.IDNbr)

WHERE NOT EXISTS (
    SELECT *
    FROM tblID a3
    WHERE a3.IDNbr = a1.IDNbr AND a3.BalDate = CONVERT(int, DAY(a1.BalDate)) + 3) AND (DATEPART(W, a1.BalDate) = 6
);

However, I'm getting the Error:

Msg 206, Level 16, State 2, Line 4

Operand type clash: date is incompatible with int

Question: How can I get this query (which I will be turning into an INSERT statement) to show all the missing days within my data and to assign the value of the last known day to the missing days?

Data that I have (starting on a Friday):

+-------------------------------------+
|ID |  IDNbr |   Balance  |  BalDate  |
+-------------------------------------+
|001|   91   |     529    | 1/5/2018  |
|002|   87   |     654    | 1/5/2018  |
|003|   45   |     258    | 1/5/2018  |

|004|   91   |     611    | 1/8/2018  |
|005|   87   |     753    | 1/8/2018  |
|006|   45   |     357    | 1/8/2018  |
|...|   ..   |     ...    | ........  | 
+-------------------------------------+
'BalDate then skips past 1/6/2018 and 1/7/2018 to 1/8/2018

Data that I'm needing:

+-------------------------------------+
|ID |  IDNbr |   Balance  |  BalDate  |
+-------------------------------------+
|001|   91   |     529    | 1/5/2018  |
|002|   87   |     654    | 1/5/2018  |
|003|   45   |     258    | 1/5/2018  |

|004|   91   |     529    | 1/6/2018  |
|005|   87   |     654    | 1/6/2018  |
|006|   45   |     258    | 1/6/2018  |

|007|   91   |     529    | 1/7/2018  |
|008|   87   |     654    | 1/7/2018  |
|009|   45   |     258    | 1/7/2018  |

|010|   91   |     611    | 1/8/2018  |
|011|   87   |     753    | 1/8/2018  |
|012|   45   |     357    | 1/8/2018  |
|...|   ..   |     ...    | ........  |
+-------------------------------------+
'I'm needing it to add the Saturday(1/6/2018) and Sunday(1/7/2018) before continuing on to 1/8/2018

Any help would be appreciated. Thank you in advance!

If there are downvotes, I ask that you please explain why you are downvoting so I may correct it!

  • It would be better if you provide some sample data to try with and expected output! Commented Oct 4, 2018 at 15:14
  • @PrashantPimpale , I've added the example data! Commented Oct 4, 2018 at 15:20
  • You are getting the error because of your join criteria: (CONVERT(int, DAY(a1.BalDate)) + 4 = a2.BalDate). The part on the left yields an int and can't be compared to a date. If you are trying to add 4 days to the date, try using DATEADD instead. You have the same problem in the WHERE clause of your subquery. Commented Oct 4, 2018 at 15:30
  • @Symon . . . I am unclear what the question is. Do you want your MS Access query updated for SQL Server or do you have specific data and desired results that you want to achieve in SQL Server? Commented Oct 4, 2018 at 15:39
  • @GordonLinoff , Sorry for the confusion! The link was to help provide context. I've updated my example data with the correct layout of what I have and what I need. I have specific data and desired results I wish to achieve in SQL Server. I was using the MS Access query to hopefully get me in the right direction. Commented Oct 4, 2018 at 15:53
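
The DATEADD suggestion from the comments can be illustrated with a small sketch. It only shows why Msg 206 appears and how the date arithmetic could be written instead; the variable @d is introduced here purely for illustration, and the snippet is not a gap-filling solution on its own.

```sql
-- @d is a hypothetical variable standing in for a1.BalDate.
DECLARE @d date = '2018-01-05';

SELECT
    DAY(@d) + 3         AS DayNumberPlus3   -- int: 8 (day-of-month plus 3, not a date)
   ,DATEADD(day, 3, @d) AS ThreeDaysLater;  -- date: 2018-01-08

-- Comparing the int expression to a date column raises
-- "Operand type clash: date is incompatible with int".
-- A date-to-date comparison would look like:
--     DATEADD(day, 4, a1.BalDate) = a2.BalDate
```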

2 Answers


Ok, you're going to need the CalTable() function from Bernd's answer. We're going to use it to create a list of all calendar dates between the MIN(BalDate) and the MAX(BalDate) in tblID. We're also going to CROSS JOIN that with the list of DISTINCT IDNbr values, which I assume is the PK of tblID.

Let's create some sample data.

CREATE TABLE #tblID (ID VARCHAR(3), IDNbr INT, Balance INT, BalDate DATE)

INSERT INTO #tblID
(
    ID
    ,IDNbr
    ,Balance
    ,BalDate
)
VALUES
('001',91,529,'1/5/2018'),
('002',87,654,'1/5/2018'),
('003',45,258,'1/5/2018'),

('004',91,611,'1/8/2018'),
('005',87,753,'1/8/2018'),
('006',45,357,'1/8/2018')

Next, we're going to INSERT new records into #tblID for the missing days. The magic here is in the LAG() function, which can look at a previous row's data. We give it an expression for the offset value, based on the number of days between the missing date and the last date with data. Because the calendar CTE has exactly one row per day for each IDNbr, that day difference is also the number of rows to look back.

;WITH IDs AS
(
    SELECT DISTINCT
        IDNbr 
    FROM #tblID
)
,IDDates AS
(
    SELECT 
        BalDate = c.[Date]
        ,i.IDNbr
    FROM [CalTable]((SELECT MIN(BalDate) FROM #tblID), (SELECT MAX(BalDate) FROM #tblID)) c
    CROSS JOIN IDs i
)
,FullResults AS
(
    SELECT 
        i.BalDate
        ,i.IDNbr 
        ,Balance = CASE WHEN t.Balance IS NOT NULL THEN t.Balance 
                    ELSE LAG(t.Balance,
                                            DATEDIFF(
                                                    DAY
                                                    ,(SELECT MAX(t1.BalDate) FROM #tblID t1 WHERE t1.IDNbr = i.IDNbr AND t1.BalDate <= i.BalDate GROUP BY t1.IDNbr)
                                                    ,i.BalDate
                                                )
                    ) OVER (PARTITION BY i.IDNbr ORDER BY i.BalDate ASC) 
                    END 
    FROM IDDates i
    LEFT JOIN #tblID t ON t.BalDate = i.BalDate AND t.IDNbr = i.IDNbr
)
INSERT INTO #tblID
(
    IDNbr
    ,Balance
    ,BalDate
)
SELECT 
    f.IDNbr
    ,f.Balance
    ,f.BalDate
FROM FullResults f
LEFT JOIN #tblID t ON t.IDNbr = f.IDNbr AND t.BalDate = f.BalDate
WHERE t.IDNbr IS NULL

At this point, if we didn't care about the ID field, which appears to be a 3-character string representation of the row number, we'd be good. However, while I don't think it's a good practice to use a string in this manner, I'm also not one to comment on someone else's business requirements that I am not privy to.

So let's assume we have to update the ID field to match the expected output. We can do that like this:

;WITH IDUpdate AS
(
    SELECT 
        ID = RIGHT('000' + CAST(ROW_NUMBER() OVER (ORDER BY BalDate ASC, IDNbr DESC) AS VARCHAR), 3)
       ,t.IDNbr
       ,t.Balance
       ,t.BalDate 
    FROM #tblID t
)
UPDATE t
SET t.ID = i.ID
FROM #tblID t
INNER JOIN IDUpdate i ON i.IDNbr = t.IDNbr AND i.BalDate = t.BalDate

Now if you query your updated table, you'll get the following:

SELECT 
    ID
    ,IDNbr
    ,Balance
    ,BalDate 
FROM #tblID
ORDER BY BalDate ASC, IDNbr DESC

Output:

ID  | IDNbr | Balance | BalDate
------------------------------
001 | 91    | 529     | 2018-01-05
002 | 87    | 654     | 2018-01-05
003 | 45    | 258     | 2018-01-05
004 | 91    | 529     | 2018-01-06
005 | 87    | 654     | 2018-01-06
006 | 45    | 258     | 2018-01-06
007 | 91    | 529     | 2018-01-07
008 | 87    | 654     | 2018-01-07
009 | 45    | 258     | 2018-01-07
010 | 91    | 611     | 2018-01-08
011 | 87    | 753     | 2018-01-08
012 | 45    | 357     | 2018-01-08

5 Comments

I probably should've noted (sorry I left this out too!) that the ID field was just the auto number field. It's not part of the table, it's just the row line. I have this running now! As it finishes and I can verify the output, I'll mark as answer! Next step for me will be trying to optimize, as it'll have to run at least once per Monday in order to get the last week! Thank you so much!
Then skip the IDUpdate part of the query. Also, you don't have to use MAX(BalDate) and MIN(BalDate) from tblID. You can use the start and end date of the previous week to insert records only for dates that are missing from the previous week. That would go a long way toward making sure the query runs in a timely manner.
@digital.aaron , how would I be able to make this only use the past two weeks (or past 14 days) instead of all days in my data? I understand I'm needing to change MAX(BalDate) and MIN(BalDate). However, when I attempt to use DATEADD(dw, -14, BalDate) , I get the error: Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression. Any advice for this? I understand the question is already closed, I would just appreciate any further help!
@Symon You're on the right track. You'd change the parameters you feed the [CalTable] function. To use the last 14 days, you'd replace the call to [CalTable] in the CTE IDDates. The changed line would look like this: FROM [CalTable](DATEADD(day, -14, GETDATE()), GETDATE()) c.
@digital.aaron , got it. Thank you so much for the additional help!
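
For reference, the 14-day variant discussed in these comments might look like the sketch below. It assumes the [CalTable] function from the other answer and the #tblID sample table already exist; the variables @startDate and @endDate are introduced here only for readability.

```sql
-- Sketch only: limits the calendar to the last 14 days instead of
-- MIN(BalDate)..MAX(BalDate). Assumes dbo.CalTable and #tblID exist.
DECLARE @endDate   date = CAST(GETDATE() AS date);
DECLARE @startDate date = DATEADD(day, -14, @endDate);

SELECT
    BalDate = c.[Date]
    ,i.IDNbr
FROM dbo.CalTable(@startDate, @endDate) c
CROSS JOIN (SELECT DISTINCT IDNbr FROM #tblID) i;
```

One thing to watch: LAG() only sees rows inside the calendar window, so if the window begins on a day with no data (for example a Saturday), the missing days at the very start of the window would get a NULL balance.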

Here is a sample of the linked function:

create FUNCTION [dbo].[CalTable]
(
@startDate date,
@endDate date
)
RETURNS
@calender TABLE
(
    [Date] date not null primary key CLUSTERED,
    isMondayToFriday bit not null
)
AS
BEGIN
    declare @currentday date = @startDate;
    declare @isMondayToFriday bit;

    while (@currentday<=@endDate)
    begin
    -- respect DATEFIRST depending on language settings
      if (DATEPART(dw, @currentday)+@@DATEFIRST-2)%7+1>5
        set @isMondayToFriday = 0
      else
        set @isMondayToFriday = 1

      insert into @calender values (@currentday, @isMondayToFriday);

      set @currentday = DATEADD(D, 1, @currentday);
    end

    RETURN
END

GO


select * from [CalTable]({d'2018-01-01'}, {d'2018-02-03'});

Use this to find the gaps.

6 Comments

This is not a bad direction but using a tally table instead of that loop would provide a massive improvement for performance. Here is a great article explaining how it can be used to replace loops. That would have the added benefit of turning this function into an inline table valued function which is crazy fast. These multi statement table functions are actually quite horrible for speed.
I'm not sure how I'd be able to apply this to my tblID and have it use the data within the entire table.
@Symon, you would call this to create the list of all days, then you would LEFT JOIN to tblID.BalDate. You'd then get all of the days, and the data from tblID if it exists. If there were no data for a particular day, then you'd just get NULLs for any columns you selected from tblID.
@digital.aaron , I need the values of the last known day to be assigned to the missing days as well, instead of them just being NULL. I thought I had put that into the question, my apologies. I updated the question! I also don't think I understand how I'm supposed to join the two (tblID and [CalTable]) since [CalTable] requires arguments? I apologize, I've never used functions in SQL before!
@SeanLange yes, you are right about the bad performance. The result of this function is also slow in joins. This function was only a sample of how to do the MS Access function in T-SQL; performance was never discussed. I solved the performance problem by creating a fully populated calendar table covering 50 years. My calendar table also contains other non-business days.
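
The tally-table idea raised in the comments could be sketched roughly as follows. This is not the function from the answer: the name dbo.CalTableInline and the row-generating CTEs E1, E2, E4 and Tally are assumptions made for illustration, and the weekday expression is copied from the loop version. The sketch generates at most 10,000 days (about 27 years) per call.

```sql
-- Hypothetical inline table-valued alternative to dbo.CalTable (no WHILE loop).
CREATE FUNCTION dbo.CalTableInline
(
    @startDate date,
    @endDate   date
)
RETURNS TABLE
AS
RETURN
    WITH E1(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(n)) -- 10 rows
        ,E2(n) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b)                                   -- 100 rows
        ,E4(n) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b)                                   -- 10,000 rows
        ,Tally(n) AS
        (
            -- One row per day in the range; zero rows if @endDate < @startDate.
            SELECT TOP (CASE WHEN @endDate >= @startDate
                             THEN DATEDIFF(day, @startDate, @endDate) + 1
                             ELSE 0 END)
                CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS int)
            FROM E4
        )
    SELECT
        [Date] = DATEADD(day, t.n, @startDate)
        -- Same DATEFIRST-independent weekday test as the loop version.
        ,isMondayToFriday = CAST(CASE WHEN (DATEPART(dw, DATEADD(day, t.n, @startDate)) + @@DATEFIRST - 2) % 7 + 1 > 5
                                      THEN 0 ELSE 1 END AS bit)
    FROM Tally t;
GO

-- Usage: every calendar day, with tblID data where it exists (NULLs otherwise).
SELECT c.[Date], t.IDNbr, t.Balance
FROM dbo.CalTableInline('2018-01-05', '2018-01-08') c
LEFT JOIN tblID t ON t.BalDate = c.[Date];
```

Because the function body is a single SELECT, SQL Server can expand it into the calling query, which is the main reason inline functions tend to perform better than multi-statement ones. A permanent calendar table, as mentioned in the last comment, remains a reasonable alternative.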