I have data from users tracking time. The data is stored in segments, and each row represents one segment. Here is the sample data:

http://sqlfiddle.com/#!6/2fa61

How can I get the data on a daily basis? That is, given that a complete day has 1440 minutes, I want to know how many minutes the user was tracked on each day. I also want to show 0 for days that have no data.

I am expecting the following output

[Image: desired output]

  • Much like your other question on this topic you need to provide details. I also stated in that other question that to get rows that don't have data you need a table of dates or a tally table as the main table of your query. Commented Aug 26, 2015 at 19:13
  • I see you have deleted your other question on this topic. >.< Commented Aug 26, 2015 at 19:15
  • But I don't know exactly how I can create a table from the first date 2015-02-19 to the last update 2015-02-28 on the fly. Commented Aug 26, 2015 at 19:16
  • 1
    By now you should realize your questions are drawing a lot of negative votes, mainly because of a poor description of the problem, an incomplete schema, or no sample data. You also don't show any effort in telling us what you have tried. Commented Aug 26, 2015 at 20:36
  • 1
    Try this link. weblogs.sqlteam.com/jeffs/archive/2008/05/13/… Commented Aug 26, 2015 at 21:24

5 Answers

1
+50

Use a table of numbers. I personally keep a permanent table Numbers with 100K numbers in it.
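A table like this can be built in several ways. The following is only a sketch: the names dbo.Numbers and Number are chosen to match the query further down, while the construction itself (cross-joined row constructors numbered with ROW_NUMBER) is an assumption made for illustration.

```sql
-- Hypothetical setup: a permanent table holding the integers 1..100000.
CREATE TABLE dbo.Numbers
(
    Number int NOT NULL PRIMARY KEY
);

WITH
E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(n)), -- 10 rows
E2(N) AS (SELECT 1 FROM E1 AS a CROSS JOIN E1 AS b),                               -- 100 rows
E5(N) AS (SELECT 1 FROM E2 AS a CROSS JOIN E2 AS b CROSS JOIN E1 AS c)             -- 100,000 rows
INSERT INTO dbo.Numbers (Number)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM E5;
```

The table is created once and reused by any query that needs a gap-free sequence.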

Once you have a set of numbers, you can generate a set of dates for the range that you need. In this query I take the MIN and MAX dates from your data, but since you may not have data for some dates, it is better to have explicit parameters defining the range.

For each date I have the beginning and the end of the day: this is our grouping interval.

For each date we search among the track rows for those that intersect with this interval. Two intervals (DayStart, DayEnd) and (StartTime, EndTime) intersect if StartTime < DayEnd and EndTime > DayStart. This goes into WHERE.

For each intersecting interval we calculate the range that belongs to both intervals: from MAX(DayStart, StartTime) to MIN(DayEnd, EndTime).

Finally, we group by day and sum up durations of all ranges.
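To make the intersection and clipping steps concrete, here is a minimal sketch that applies them to a single segment from the sample data (2015-02-20 22:12:07 to 2015-02-21 07:00:59). The two day intervals are hard-coded and the variable names are chosen only for this illustration.

```sql
-- One segment from the sample data that crosses midnight.
DECLARE @StartTime datetime = '2015-02-20 22:12:07';
DECLARE @EndTime   datetime = '2015-02-21 07:00:59';

SELECT
    D.DayStart
    ,DATEDIFF(second,
        -- MAX(DayStart, StartTime)
        CASE WHEN D.DayStart > @StartTime THEN D.DayStart ELSE @StartTime END,
        -- MIN(DayEnd, EndTime)
        CASE WHEN D.DayEnd < @EndTime THEN D.DayEnd ELSE @EndTime END
     ) AS SecondsInDay
FROM
    (VALUES
        (CAST('2015-02-20' AS datetime), CAST('2015-02-21' AS datetime)),
        (CAST('2015-02-21' AS datetime), CAST('2015-02-22' AS datetime))
    ) AS D(DayStart, DayEnd)
WHERE
    -- the two intervals intersect
    @StartTime < D.DayEnd
    AND @EndTime > D.DayStart;
```

This should return 6473 seconds for 2015-02-20 and 25259 seconds for 2015-02-21, which add up to the 31732 seconds stored in DurationInSeconds for that row.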

I added a row to your sample data to test the case when an interval covers a whole day: from 2015-02-14 20:50:43 to 2015-02-16 19:49:59. I chose this interval to be well before the intervals in your sample, so that the results for the dates in your example are not affected. Here is SQL Fiddle.

DECLARE @track table
(
Email varchar(20),
StartTime datetime,
EndTime datetime,
DurationInSeconds int,
FirstDate datetime,
LastUpdate datetime
);

Insert into @track  values ( 'ABC', '2015-02-20 08:49:43.000', '2015-02-20 14:49:59.000', 21616, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-20 14:49:59.000', '2015-02-20 22:12:07.000', 26528, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-20 22:12:07.000', '2015-02-21 07:00:59.000', 31732, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-21 09:49:43.000', '2015-02-21 16:30:10.000', 24027, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-21 16:30:10.000', '2015-02-22 09:49:30.000', 62360, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-22 09:55:43.000', '2015-02-22 11:49:59.000', 5856, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-22 11:49:10.000', '2015-02-23 08:49:59.000', 75649, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-23 10:59:43.000', '2015-02-23 12:49:59.000', 6616, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-23 12:50:43.000', '2015-02-24 19:49:59.000', 111556, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')
Insert into @track  values ( 'ABC', '2015-02-28 08:49:43.000', '2015-02-28 14:49:59.000', 21616, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')

Insert into @track  values ( 'ABC', '2015-02-14 20:50:43.000', '2015-02-16 19:49:59.000', 0, '2015-02-19 00:00:00.000', '2015-02-28 11:45:27.000')


;WITH
CTE_Dates
AS
(
    SELECT
        Email
        ,CAST(MIN(StartTime) AS date) AS StartDate
        ,CAST(MAX(EndTime) AS date) AS EndDate
    FROM @track
    GROUP BY Email
)
SELECT
    CTE_Dates.Email
    ,DayStart AS xDate
    ,ISNULL(SUM(DATEDIFF(second, RangeStart, RangeEnd)) / 60, 0) AS TrackMinutes
FROM
    Numbers
    CROSS JOIN CTE_Dates -- this generates list of dates without gaps
    CROSS APPLY
    (
        SELECT
            DATEADD(day, Numbers.Number-1, CTE_Dates.StartDate) AS DayStart
            ,DATEADD(day, Numbers.Number, CTE_Dates.StartDate) AS DayEnd
    ) AS A_Date -- this is midnight of each current and next day
    OUTER APPLY
    (
        SELECT
          -- MAX(DayStart, StartTime)
          CASE WHEN DayStart > StartTime THEN DayStart ELSE StartTime END AS RangeStart

          -- MIN(DayEnd, EndTime)
          ,CASE WHEN DayEnd < EndTime THEN DayEnd ELSE EndTime END AS RangeEnd
        FROM @track AS T
        WHERE
            T.Email = CTE_Dates.Email
            AND T.StartTime < DayEnd
            AND T.EndTime > DayStart
    ) AS A_Track -- this is all tracks that intersect with the current day
WHERE
    Numbers.Number <= DATEDIFF(day, CTE_Dates.StartDate, CTE_Dates.EndDate)+1
GROUP BY DayStart, CTE_Dates.Email
ORDER BY DayStart;

Result

Email    xDate         TrackMinutes
ABC      2015-02-14    189
ABC      2015-02-15    1440
ABC      2015-02-16    1189
ABC      2015-02-17    0
ABC      2015-02-18    0
ABC      2015-02-19    0
ABC      2015-02-20    910
ABC      2015-02-21    1271
ABC      2015-02-22    1434
ABC      2015-02-23    1309
ABC      2015-02-24    1189
ABC      2015-02-25    0
ABC      2015-02-26    0
ABC      2015-02-27    0
ABC      2015-02-28    360

You can still get TrackMinutes greater than 1440 if two or more intervals in your data overlap.

Update

You said in the comments that you have a few rows in your data where intervals do overlap, so the result has values greater than 1440. You can wrap SUM in a CASE to hide these errors in the data, but ultimately it is better to find the problem rows and fix the data.

You saw only a few days with values above 1440, but there could be many more rows with the same problem that are simply less visible. So it is better to write a query that finds such overlapping rows, check how many there are, and then decide what to do with them. Writing that query is beyond the scope of this question.

To hide the problem, replace this line in the query above:

,ISNULL(SUM(DATEDIFF(second, RangeStart, RangeEnd)) / 60, 0) AS TrackMinutes

with this:

,CASE 
WHEN ISNULL(SUM(DATEDIFF(second, RangeStart, RangeEnd)) / 60, 0) > 1440
THEN 1440
ELSE ISNULL(SUM(DATEDIFF(second, RangeStart, RangeEnd)) / 60, 0) 
END AS TrackMinutes
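For the overlapping rows mentioned in the update, a query along the following lines might help to find them. It is only a sketch under stated assumptions: the table variable @t and its three rows are made up for the example (the rows are copied from the sample data), and the tie-breaking rule in the join is one possible choice.

```sql
-- Hypothetical test data: three rows copied from the sample;
-- the first two overlap by 49 seconds (11:49:10 to 11:49:59).
DECLARE @t table (Email varchar(20), StartTime datetime, EndTime datetime);

INSERT INTO @t (Email, StartTime, EndTime) VALUES
 ('ABC', '2015-02-22 09:55:43', '2015-02-22 11:49:59')
,('ABC', '2015-02-22 11:49:10', '2015-02-23 08:49:59')
,('ABC', '2015-02-23 10:59:43', '2015-02-23 12:49:59');

-- List pairs of rows for the same Email whose intervals intersect.
SELECT
    T1.Email
    ,T1.StartTime AS StartTime1
    ,T1.EndTime AS EndTime1
    ,T2.StartTime AS StartTime2
    ,T2.EndTime AS EndTime2
FROM
    @t AS T1
    INNER JOIN @t AS T2
        ON  T2.Email = T1.Email
        AND T2.StartTime < T1.EndTime
        AND T2.EndTime > T1.StartTime
        -- report each pair once and never match a row with itself;
        -- note that exact duplicates (same start and end) are not reported
        AND (T2.StartTime > T1.StartTime
             OR (T2.StartTime = T1.StartTime AND T2.EndTime > T1.EndTime))
ORDER BY T1.Email, T1.StartTime;
```

With these three rows the query should return a single pair, the first and second row. Running the same join against the real table gives an idea of how many overlaps there are before deciding how to clean them up.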

12 Comments

Why did you get data for 25 and 26 February? I am expecting 0 TrackMinutes on these dates.
@Asbat, because I added a row to your sample data to test the case when interval covers the whole day. I have now adjusted the row that I added. Now it is from 2015-02-14 20:50:43 to 2015-02-16 19:49:59 and it doesn't interfere with your original example. Sorry for confusion.
I tried your query and in most cases it is working fine but for few dates it is exceeding 1440 (E.g. in 930 days of data for a user, I got all correct but also got wrong data for 9 days 1472, 1473, 1512, 2039, 1717, 1497, 2702, 1544, 1735). I am trying to update your sql fiddle to see if I can generate that error.
sqlfiddle.com/#!3/b23feb/1/0 Please check one of the instances in which the value exceeds 1440 minutes. In the data, at one point the start time 2014-11-23 16:04:08.000 is repeated twice. I am wondering if that is causing the confusion?
@Asbat, the query sums up all intervals, so if you have overlapping intervals, then overlapped periods would be counted twice. In SQL Fiddle 2014-11-23 16:04:08 --- 2014-11-23 16:37:48 overlaps with 2014-11-23 16:04:08 --- 2014-11-24 11:12:41, so 2,020 seconds (16:04:08 --- 16:37:48) is added twice to the 2014-11-23. Since your data is obviously not 100% clean, it is likely that you have overlapping data in more than 9 days, just in other days these extra seconds are not large enough to tip the sum above 1440 minutes.
1

I am making some guesses on the date ranges but this should be pretty close.

On my system I keep a view named cteTally which is my version of a tally table. Here is the code to create it.

create View [dbo].[cteTally] as

WITH
    E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
    cteTally(N) AS 
    (
        SELECT  ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
    )
select N from cteTally

Now we can utilize this to build your results. We just need to put in a couple other CTEs to get the date ranges established.

with DateRange as
(
    select MIN(FirstDate) as StartDate
        , MAX(LastUpdate) as EndDate 
    from track
)
, AllDates as
(
    select DateAdd(DAY, t.N - 1, StartDate) BaseDate
    from DateRange dr
    cross join cteTally t
    where t.N <= DATEDIFF(day, StartDate, EndDate) + 1
)

select t.Email
    , ad.BaseDate as xDate
    , ISNULL(t.DurationInSeconds, 0) / 60 as TrackMinutes -- seconds converted to whole minutes; 0 when the day has no data
from AllDates ad
left join track t on cast(t.StartTime as date) = ad.BaseDate

2 Comments

Will it split the segment if the segment spans two days? E.g., if a segment is 3 hours long and starts at 23:00, then I want 1 hour on the starting day and 2 hours on the next day.
It would help immensely if you provided all the rules and explanations at the beginning. Are there more rules you haven't shared yet? What should the output look like for your sample data?
1
  1. Create a table variable for the dates
  2. Populate table in a WHILE loop
  3. Cross join to tracker data with the dates table variable
  4. Convert values in column [DurationInSeconds] into minutes
  5. Replace nulls with zero

Code:

DECLARE @dates TABLE ( ReportDates DATE )  
DECLARE @BeginDate AS DATE
  , @EndDate AS DATE
  , @RunDate AS DATE

SELECT @BeginDate = MIN(starttime) FROM dbo.track
SELECT @EndDate = MAX(starttime) FROM dbo.track

SET @RunDate = @BeginDate
WHILE @RunDate <= @EndDate
    BEGIN
        -- insert the current date first, then advance, so that the first day
        -- is included and no extra day is added after the last one
        INSERT  INTO @dates
        VALUES  ( @RunDate )
        SET @RunDate = DATEADD(DAY, 1, @RunDate)
    END;

SELECT e.Email 
     , e.ReportDates
     , ISNULL(SUM(DurationInSeconds / 60), 0) AS TotDurationInMinutes
FROM (  SELECT  d.ReportDates
               ,t.email
        FROM    @dates AS d
        cross JOIN track AS t  
        GROUP BY d.ReportDates, t.Email ) AS e
LEFT JOIN track AS t ON e.ReportDates = CAST(t.StartTime AS DATE)
                    AND e.Email = t.Email -- match each user to their own rows
GROUP BY e.ReportDates, e.Email
ORDER BY e.ReportDates

Results:

Email ReportDates TotDurationInMinutes
----- ----------- ----------------------
ABC   2015-02-20  1330
ABC   2015-02-21  1439
ABC   2015-02-22  1357
ABC   2015-02-23  1969
ABC   2015-02-24  0
ABC   2015-02-25  0
ABC   2015-02-26  0
ABC   2015-02-27  0
ABC   2015-02-28  360

3 Comments

Thanks. I was expecting the values in the TotDurationInMinutes column to be less than 1440 minutes, because a day consists of 1440 minutes. I think there is an error in my example data, and that is why the value on 23 Feb is more than 1440. I will apply your query to my data to see how it behaves. If all the values are below 1440 minutes, then your answer will be accepted.
I think you forgot to add a step that splits the time at midnight to put the time in the respective days.
TotDurationInMinutes is just a conversion of the values you provided in your example; whether it exceeds 1440 or not depends on how accurate your data is, not on the query. Your question does not state that you want to split the values in DurationInSeconds at midnight. That is a completely different problem, and it does not apply to the original question. Edit your question to include that requirement.
1

You should group by the day value. You can get the day with the DATEPART function, as in DATEPART(d,[StartTime]):

SELECT cast([StartTime] as date) as date ,sum(datediff(n,[StartTime],[EndTime])) as "min" 
FROM [test].[dbo].[track] 
group by DATEPART(d,[StartTime]),cast([StartTime]as date)

1 Comment

Hello and welcome to stackoverflow. While your contribution might answer the question, you should at least add some explanation, otherwise your answer will be less helpful for others.
0

Hope it helps.

SET NOCOUNT ON;

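-- drop the temp table only if it is left over from a previous run
IF OBJECT_ID('tempdb..#temp_table') IS NOT NULL
    DROP TABLE #temp_table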

CREATE TABLE #temp_table (
    Email VARCHAR(20)
    ,StartTime DATETIME
    ,DurationInSeconds INT
    )

DECLARE @Nextday DATETIME
    ,@Email VARCHAR(20)
    ,@StartTime DATETIME
    ,@DurationInSeconds INT
    ,@lastduration INT
    ,@currentduration INT
    ,@FirstDate DATETIME

SET @FirstDate = (
        SELECT TOP 1 LEFT(StartTime, 11)
        FROM track
        )

DECLARE vendor_cursor CURSOR
FOR
SELECT Email
    ,StartTime
    ,DurationInSeconds
FROM track

OPEN vendor_cursor

FETCH NEXT
FROM vendor_cursor
INTO @Email
    ,@StartTime
    ,@DurationInSeconds

WHILE @@FETCH_STATUS = 0
BEGIN
    IF EXISTS (
            SELECT 1
            FROM #temp_table
            WHERE LEFT(StartTime, 11) = LEFT(@StartTime, 11)
            )
    BEGIN
        SELECT @lastduration = DurationInSeconds
        FROM #temp_table
        WHERE LEFT(StartTime, 11) = LEFT(@StartTime, 11)

        SET @currentduration = @lastduration + @DurationInSeconds

        UPDATE #temp_table
        SET DurationInSeconds = @currentduration
        WHERE LEFT(StartTime, 11) = LEFT(@StartTime, 11)
    END
    ELSE
    BEGIN
        INSERT INTO #temp_table
        SELECT @Email
            ,@StartTime
            ,@DurationInSeconds

        SET @FirstDate = DATEADD(day, 1, @FirstDate)
    END

    IF NOT EXISTS (
            SELECT 1
            FROM track
            WHERE LEFT(StartTime, 11) = @FirstDate
            )
    BEGIN
        INSERT INTO #temp_table
        SELECT @Email
            ,@FirstDate
            ,0

        SET @FirstDate = DATEADD(day, 1, @FirstDate)
    END

    -- Get the next track row.
    FETCH NEXT
    FROM vendor_cursor
    INTO @Email
        ,@StartTime
        ,@DurationInSeconds
END

CLOSE vendor_cursor;

DEALLOCATE vendor_cursor;

SELECT *
FROM #temp_table
ORDER BY StartTime
