Sequence Numbering in SQL

Question

I have a list of activities(A-start to H-end) for certain events. These can occur in any order, any number of times and can restart as well. I need to identify the blocks of activities within an event.

Eg: A B C D E F G H B C D H C D E F H E F G H

It starts only once (A) but ends multiple times

Need to number these activities to identify sets (How many times it ended)

output: 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3 3 4 4 4 4 5

This helps me identify that the event ended (H) 5-1 = 4 times

What is the database your are working on (MSSQL, Oracle, MYSQL, ...)? — etsa
– etsa, Commented Mar 16, 2017 at 10:22
Input data you wrote as example are different records or is a single string column? Similarly, the output can be different records or a single string column? — etsa
– etsa, Commented Mar 16, 2017 at 10:24
You need to have a column that specifies the ordering. SQL tables represent unordered sets. There is no inherent ordering of the values. I also assume that the values are actually on separate rows. — Gordon Linoff
– Gordon Linoff, Commented Mar 16, 2017 at 10:24
The activities (A - H) are records and output needs to be corresponding to the records — Neha
– Neha, Commented Mar 16, 2017 at 10:32

marcothesane · Accepted Answer · 2017-03-16 23:33:27Z

2

If your events are series of events in time - try to play with the MATCH() clause and its dependent functions event_name(), pattern_id() and match_id() .

I just created a timeseries out of your input letters, spaced by one-hour intervals, and applied a MATCH() clause. If the PATTERN pat AS () clause uncannily reminds you of a grep expression, that's the way it works.

Just look at the query's output - and imagine how many interesting things you could do with the pattern_id-s and the match_id-s that you get- grouping by them, for example, in subsequent SELECT-s ...

WITH 
s(tm,event) AS (
          SELECT TIME '00:00:00','A'
UNION ALL SELECT TIME '01:00:00','B'
UNION ALL SELECT TIME '02:00:00','C'
UNION ALL SELECT TIME '03:00:00','D'
UNION ALL SELECT TIME '04:00:00','E'
UNION ALL SELECT TIME '05:00:00','F'
UNION ALL SELECT TIME '06:00:00','G'
UNION ALL SELECT TIME '07:00:00','H'
UNION ALL SELECT TIME '08:00:00','B'
UNION ALL SELECT TIME '09:00:00','C'
UNION ALL SELECT TIME '10:00:00','D'
UNION ALL SELECT TIME '11:00:00','H'
UNION ALL SELECT TIME '12:00:00','C'
UNION ALL SELECT TIME '13:00:00','D'
UNION ALL SELECT TIME '14:00:00','E'
UNION ALL SELECT TIME '15:00:00','F'
UNION ALL SELECT TIME '16:00:00','H'
UNION ALL SELECT TIME '17:00:00','E'
UNION ALL SELECT TIME '18:00:00','F'
UNION ALL SELECT TIME '19:00:00','G'
UNION ALL SELECT TIME '20:00:00','H'
)
SELECT
  *
, event_name()
, pattern_id()
, match_id()
FROM s
MATCH(
  PARTITION BY 1 -- nothing to partition by
  ORDER BY tm
  DEFINE  
    START_ev AS (event='A')
  , any_ev   AS (event NOT IN ('A','H'))
  , END_ev   AS (event='H')
  PATTERN pat AS (start_ev* any_ev+ end_ev)
  ROWS MATCH FIRST EVENT
);

tm      |event|event_name|pattern_id|match_id
00:00:00|A    |START_ev  |         1|       1
01:00:00|B    |any_ev    |         1|       2
02:00:00|C    |any_ev    |         1|       3
03:00:00|D    |any_ev    |         1|       4
04:00:00|E    |any_ev    |         1|       5
05:00:00|F    |any_ev    |         1|       6
06:00:00|G    |any_ev    |         1|       7
07:00:00|H    |END_ev    |         1|       8
08:00:00|B    |any_ev    |         2|       1
09:00:00|C    |any_ev    |         2|       2
10:00:00|D    |any_ev    |         2|       3
11:00:00|H    |END_ev    |         2|       4
12:00:00|C    |any_ev    |         3|       1
13:00:00|D    |any_ev    |         3|       2
14:00:00|E    |any_ev    |         3|       3
15:00:00|F    |any_ev    |         3|       4
16:00:00|H    |END_ev    |         3|       5
17:00:00|E    |any_ev    |         4|       1
18:00:00|F    |any_ev    |         4|       2
19:00:00|G    |any_ev    |         4|       3
20:00:00|H    |END_ev    |         4|       4

answered Mar 16, 2017 at 23:33

marcothesane

6,8421 gold badge13 silver badges24 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Gordon Linoff Over a year ago

I have to admit this is a better solution for Vertica, assuming the MATCH() clause is optimized appropriately.

marcothesane Over a year ago

Hi Gordon - I recently began to use the MATCH() clause in car sensor data to identify suspicious patterns in the measure curves of air mass, temperature and fuel throughput. The plan created is just like any plan including OLAP functions ( LEAD() , LAG(), SUM() OVER() and the like ), and is therefore absolutely within the parameters of an OLAP query.

Gordon Linoff Over a year ago

. . That sounds like an interesting application.

Gordon Linoff · Accepted Answer · 2017-03-16 10:29:44Z

2

It looks like you want to count the number of "H"s and "A"s before a given value. This requires having a column that specifies the ordering. Let me assume this column is called id.

Then, you can do this with window functions:

select t.*,
       sum(case when col = 'H' then 1 else 0 end) over (partition by grp order by id) + 1 as output
from (select t.*,
             sum(case when col = 'A' then 1 else 0 end) over (order by id) as grp
      from t
     ) t;

The subquery defines the "activity" groups, by doing a cumulative sum of "A"s. The outer query then defines the "event" groups by doing a cumulative sum of "E"s.

To be honest, I cannot tell if the "H" is part of the preceding value or the next value. If the next value then the query can use a window clause or a slight tweak to the logic:

       (sum(case when col = 'H' then 1 else 0 end) over (partition by grp order by id) +
        (case when col = 'H' then 0 else 1 end)
       ) as output

edited Mar 16, 2017 at 10:29

answered Mar 16, 2017 at 10:24

Gordon Linoff

1.3m62 gold badges706 silver badges857 bronze badges

1 Comment

Neha Over a year ago

This helped. Thank you

Collectives™ on Stack Overflow

Sequence Numbering in SQL

2 Answers 2

3 Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related