I have a table with transactions, timestamps and users.

CREATE TABLE [dbo].[Transactions]
(
    [transaction_ts] [datetime] NULL,
    [user_id] [bigint] NULL,
    [transaction_id] [bigint] NULL,
    [item] [varchar](50) NULL
)

For each user_id, I need to select all transactions they made between their first transaction and 72 hours later.

--get first and last timestamps for range
DROP TABLE IF EXISTS #first;

SELECT mt.transaction_ts as first_trans,mt.user_id 
INTO #first
FROM Transactions mt 
INNER JOIN
    (SELECT user_id, MIN(transaction_ts) MinDate
     FROM Transactions
     GROUP BY user_id) t ON mt.user_id = t.user_id AND mt.transaction_ts = t.MinDate;

ALTER TABLE #first
ADD first_trans_plus_72 datetime;

UPDATE #first 
SET first_trans_plus_72 = DATEADD(hour, 72, first_trans)

--loop through user_id and select ranges using variables
DECLARE @Table TABLE (user_id bigint, Id int identity(1,1));

INSERT INTO @Table 
    SELECT DISTINCT user_id 
    FROM #first;

DECLARE @max int;
DECLARE @SQL VARCHAR(MAX);
DECLARE @user_id VARCHAR(max);
DECLARE @first VARCHAR(max);
DECLARE @first_trans_plus_72 VARCHAR(max);
DECLARE @id int = 1;

SELECT @max = MAX(Id) FROM @Table;

WHILE (@id <= @max)
BEGIN
    SELECT @user_id = user_id FROM @Table WHERE Id = @id
    SELECT @first = first_trans FROM #first WHERE user_id = @user_id
    SELECT @first_trans_plus_72 = first_trans_plus_72 FROM #first WHERE user_id = @user_id
    SET @SQL = 'select * from Transactions 
                where transaction_ts between ' + @first + ' and ' + @first_trans_plus_72 + ' 
                and user_id = ' + @user_id + ';'
    PRINT(@SQL)
    EXEC(@SQL)
    SET @id = @id +1
END

This produces the right logical SQL, but the datetime variables are strings, so the query errors out. I tried declaring the datetime variables (@first and @first_trans_plus_72) as datetime, but this resulted in a conversion error.

Is there a simpler way to do this?

1 Answer

Why would you use a loop for this when you can use a simple query?

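-- the inner query tags every row with its user's earliest transaction time
select t.*
from (select t.*, min(transaction_ts) over (partition by user_id) as min_tts
      from transactions t
     ) t
-- keep only rows within 72 hours of that first transaction
where t.transaction_ts <= dateadd(hour, 72, t.min_tts);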

In general, it is better to write code using set-based operations. It is simpler and performs much, much better.
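If the window function is unfamiliar, the same set-based idea can probably be written with a plain join against a per-user aggregate, which is close to what the question's #first table was building. This is only a sketch against the dbo.Transactions table from the question; the alias f and the column name first_trans are illustrative choices.

```sql
-- Sketch: same result as the window-function query, using GROUP BY plus a join.
-- Assumes the dbo.Transactions table defined in the question.
SELECT t.*
FROM dbo.Transactions AS t
INNER JOIN (
    -- one row per user with that user's earliest transaction time
    SELECT user_id, MIN(transaction_ts) AS first_trans
    FROM dbo.Transactions
    GROUP BY user_id
) AS f
    ON f.user_id = t.user_id
-- no lower bound is needed: no row can be earlier than the user's minimum
WHERE t.transaction_ts <= DATEADD(hour, 72, f.first_trans);
```

Because the derived table groups by user_id, it has exactly one row per user, so the join does not duplicate transactions even when several rows share the earliest timestamp. Note that rows with a NULL user_id drop out of this version, since NULL never equals NULL in the join condition.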

You can incorporate this into an update, but I don't think that is necessary. The above selects the transactions. You can use group by user_id to summarize them -- say to count them or to sum the values.
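As a rough illustration of the summarizing step, the query below counts each user's transactions in the first 72 hours. The table in the question has no amount column, so this sketch only counts rows and reports the window's first and last timestamps; the output column names are made up for the example.

```sql
-- Sketch: summarize the first-72-hours transactions per user.
-- Assumes the dbo.Transactions table defined in the question.
SELECT t.user_id,
       COUNT(*)              AS transactions_in_first_72h,
       MIN(t.transaction_ts) AS first_trans,
       MAX(t.transaction_ts) AS last_trans_in_window
FROM (SELECT t.*,
             MIN(transaction_ts) OVER (PARTITION BY user_id) AS min_tts
      FROM dbo.Transactions t
     ) t
WHERE t.transaction_ts <= DATEADD(hour, 72, t.min_tts)
GROUP BY t.user_id;
```

If the table had a numeric value column, a SUM over it could be added to the select list in the same way.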

1 Comment

The answer is that I am very bad at SQL. Also, I didn't know about that partition thingy. Very handy, thank you so much!
