5

Sorry for bad topic but I wasn't sure what to call it..

I have a table looking like this:

+-----++-----+
| Id  ||Count|
+-----++-----+
| 1   || 1   |
+-----++-----+
| 2   || 5   |
+-----++-----+
| 3   || 8   |
+-----++-----+
| 4   || 3   |
+-----++-----+
| 5   || 6   |
+-----++-----+
| 6   || 8   |
+-----++-----+
| 7   || 3   |
+-----++-----+
| 8   || 1   |
+-----++-----+

I'm trying to make a select from this table where every time the SUM of row1 + row2 + row3 (etc) reaches 10, then it's a "HIT", and the count starts over again.

Requested output:

+-----++-----++-----+
| Id  ||Count|| HIT |
+-----++-----++-----+
| 1   || 1   ||  N  | Count = 1
+-----++-----++-----+
| 2   || 5   ||  N  | Count = 6
+-----++-----++-----+
| 3   || 8   ||  Y  | Count = 14 (over 10)
+-----++-----++-----+
| 4   || 3   ||  N  | Count = 3
+-----++-----++-----+
| 5   || 6   ||  N  | Count = 9
+-----++-----++-----+
| 6   || 8   ||  Y  | Count = 17 (over 10..)
+-----++-----++-----+
| 7   || 3   ||  N  | Count = 3
+-----++-----++-----+
| 8   || 1   ||  N  | Count = 4
+-----++-----++-----+

How do I do this, and with best performance? I have no idea..

6
  • look here Commented Oct 8, 2015 at 13:22
  • Interesting thought about using dense_rank(). Perhaps this would work. Otherwise I fear a stored proc with a cursor is probably what is needed. Commented Oct 8, 2015 at 13:27
  • If dense_rank() doesn't do the trick, you could do this with a recursive view as well. I believe that dense_rank() route would be better optimized though. Commented Oct 8, 2015 at 13:28
  • The table is quite large..I will try and see If I can get it working with dense_rank Commented Oct 8, 2015 at 13:32
  • Agreed 100%. If the table is in the thousands, then no biggie, but if it's larger then recursive would be a poor performer here. Commented Oct 8, 2015 at 13:33

4 Answers 4

5

You can't do this using window/analytic functions, because the breakpoints are not known in advance. Sometimes, it is possible to calculate the breakpoints. However, in this case, the breakpoints depend on a non-linear function of the previous values (I can't think of a better word than "non-linear" right now). That is, sometimes adding "1" to an earlier value has zero effect on the calculation for the current row. Sometimes it has a big effect. The implication is that the calculation has to start at the beginning and iterate through the data.

A minor modification to the problem would be solvable using such functions. If the problem were, instead, to carry over the excess amount for each group (instead of restarting the sum), the problem would be solvable using cumulative sums (and some other trickery).

Recursive queries (which others have provided) or a sequential operation is the best way to approach this problem. Unfortunately, it doesn't have a set-based method for solving it.

Sign up to request clarification or add additional context in comments.

Comments

3

You could use Recursive Queries

Please note the following query assuming the id value are all in sequence, otherwise, please use ROW_NUMBER() to create a new id

WITH cte AS (
  SELECT id, [Count], [Count] AS total_count
  FROM Table1 
  WHERE id = 1
  UNION ALL
  SELECT t2.id,t2.[Count], CASE WHEN t1.total_count >= 10 THEN t2.[Count] ELSE t1.total_count + t2.[Count] END
  FROM Table1 t2
  INNER JOIN cte t1 
    ON t2.id = t1.id + 1
  )
SELECT *
FROM cte
ORDER BY id

SQL Fiddle

Comments

1

I'm really hoping someone can show us if it's possible to do this using straight-forward window functions. That's the real challenge.

In the meantime, here is how I would do it using recursion. This handles gaps in the sequence, and handles the edge case of the first row already being >= 10.

I also added the maxrecursion hint to remove the default recursion limit. But I honestly don't know how well it will run with larger amounts of data.

with NumberedRows as (
  select Id, Cnt,
         row_number() over(order by id) as rn
    from CountTable
), RecursiveCTE as (
  select Id, Cnt, rn, 
         case when Cnt >= 10 then 0 else Cnt end as CumulativeSum,
         case when Cnt >= 10 then 'Y' else 'N' end as hit
    from NumberedRows
   where rn = 1
  union all
  select n.Id, n.Cnt, n.rn,
         case when (n.Cnt + r.CumulativeSum) >= 10 then 0 else n.Cnt + r.CumulativeSum end as CumulativeSum,
         case when (n.Cnt + r.CumulativeSum) >= 10 then 'Y' else 'N' end as hit
    from RecursiveCTE r
    join NumberedRows n
      on n.rn = r.rn + 1
)
select Id, Cnt, hit
from RecursiveCTE
order by Id
option (maxrecursion 0)

SQLFiddle Demo

Comments

1

How about this using Running Totals:

DECLARE @Data TABLE(
    Id INT
    ,SubTotal INT
)


INSERT INTO @Data
    VALUES(1, 5)

INSERT INTO @Data
    VALUES(2, 3)

INSERT INTO @Data
    VALUES(3, 4)

INSERT INTO @Data
    VALUES(4, 4)

INSERT INTO @Data
    VALUES(5, 7)

DECLARE @RunningTotal INT = 0
DECLARE @HitCount INT = 0    

SELECT  
        @RunningTotal = CASE WHEN @RunningTotal < 10 THEN @RunningTotal + SubTotal ELSE SubTotal END
        ,@HitCount = @HitCount + CASE WHEN @RunningTotal >= 10 THEN 1 ELSE 0 END
        FROM @Data ORDER BY Id

SELECT @HitCount -- Outputs 2

Having re-read the question I see this does not meet the required output - I'll leave the answer as it may be of some use to someone looking for an example of a running total solution to this type of problem that doesn't need each row tagged with a Y or an N.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.