I have this data set in PostgreSQL:

    User_id    |            date         | is_cleared |
---------------+-------------------------+------------+
             C | 2016-08-03 18:28:38.37  |          1 |
             A | 2016-08-01 19:59:36.548 |          0 |
             A | 2016-08-01 23:38:29.912 |          0 |
             A | 2016-08-04 19:03:30.585 |          0 |
             A | 2016-08-08 19:53:20.251 |          0 |
             A | 2016-08-11 21:00:16.075 |          0 |
             A | 2016-08-13 21:07:01.336 |          0 |
             A | 2016-08-15 10:17:32.746 |          0 |
             A | 2016-08-17 11:54:16.75  |          0 |
             A | 2016-08-18 21:12:22.717 |          0 |
             A | 2016-08-21 19:59:26.441 |          0 |
             A | 2016-08-22 17:19:42.215 |          0 |
             A | 2016-08-24 20:18:45.516 |          1 |
             A | 2016-08-27 13:22:33.98  |          1 |
             A | 2016-08-31 16:09:25.756 |          0 |
             B | 2016-08-03 21:10:32.945 |          1 |
             B | 2016-08-05 20:40:55.644 |          1 |
             B | 2016-08-06 10:59:00.796 |          1 |
             B | 2016-08-09 22:36:50.674 |          0 |
             B | 2016-08-11 09:37:21.29  |          1 |
             B | 2016-08-13 10:15:15.858 |          1 |
             B | 2016-08-26 09:56:48.937 |          1 |
             B | 2016-08-29 09:34:13.96  |          1 |

And I need an output like this:

    User_id    |            date         | is_cleared |  custom_rank   
---------------+-------------------------+------------+---------
             C | 2016-08-03 18:28:38.37  |          1 |  1
             A | 2016-08-01 19:59:36.548 |          0 |  1
             A | 2016-08-01 23:38:29.912 |          0 |  2
             A | 2016-08-04 19:03:30.585 |          0 |  3
             A | 2016-08-08 19:53:20.251 |          0 |  4
             A | 2016-08-11 21:00:16.075 |          0 |  5
             A | 2016-08-13 21:07:01.336 |          0 |  6
             A | 2016-08-15 10:17:32.746 |          0 |  7
             A | 2016-08-17 11:54:16.75  |          0 |  8
             A | 2016-08-18 21:12:22.717 |          0 |  9
             A | 2016-08-21 19:59:26.441 |          0 |  10
             A | 2016-08-22 17:19:42.215 |          0 |  11
             A | 2016-08-24 20:18:45.516 |          1 |  12
             A | 2016-08-27 13:22:33.98  |          1 |  1
             A | 2016-08-31 16:09:25.756 |          0 |  1
             B | 2016-08-03 21:10:32.945 |          1 |  1
             B | 2016-08-05 20:40:55.644 |          1 |  1
             B | 2016-08-06 10:59:00.796 |          1 |  1
             B | 2016-08-09 22:36:50.674 |          0 |  1
             B | 2016-08-11 09:37:21.29  |          1 |  2
             B | 2016-08-13 10:15:15.858 |          1 |  1
             B | 2016-08-26 09:56:48.937 |          1 |  1
             B | 2016-08-29 09:34:13.96  |          1 |  1

Basically, my goal is to count the zeros in the is_cleared flag, including the first occurrence of "1" after the zeros; after that "1" the count restarts (a "1" with no preceding zeros just gets 1).

The data is sorted by date and should also be partitioned by user_id.
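
A minimal setup to reproduce this, for reference. The table name tbl is the one used in the answer's query below; the column types and constraints are my assumptions, and only a few representative rows are included:

CREATE TABLE tbl (
   user_id    text      NOT NULL
 , date       timestamp NOT NULL
 , is_cleared integer   NOT NULL
 , UNIQUE (user_id, date)
);

INSERT INTO tbl (user_id, date, is_cleared) VALUES
   ('C', '2016-08-03 18:28:38.37',  1)
 , ('A', '2016-08-01 19:59:36.548', 0)
 , ('A', '2016-08-24 20:18:45.516', 1)
 , ('B', '2016-08-09 22:36:50.674', 0)
 , ('B', '2016-08-11 09:37:21.29',  1);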

1 Answer

Window functions and a single subquery:

SELECT user_id, date, is_cleared
     , count(*) OVER (PARTITION BY user_id, grp ORDER BY date) AS custom_rank
FROM  (
   SELECT user_id, date, is_cleared
        -- count 1s in DESCENDING date order; count(expr OR NULL)
        -- only counts rows where the expression is true
        , count(is_cleared = 1 OR NULL) OVER (PARTITION BY user_id ORDER BY date DESC) AS grp
   FROM   tbl
   ) t
ORDER  BY user_id, date;

The special difficulty here: each group ends with a 1, where in most similar cases each group would start with a 1. The simple solution is to count occurrences of 1 in descending order; that way each group picks up the next 1, if it exists. Voilà.
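
To see why, run the inner subquery on its own. Derived from the sample data above for user B (re-sorted ascending by date for readability):

SELECT user_id, date, is_cleared
     , count(is_cleared = 1 OR NULL) OVER (PARTITION BY user_id ORDER BY date DESC) AS grp
FROM   tbl
WHERE  user_id = 'B'
ORDER  BY date;

 user_id |           date          | is_cleared | grp
---------+-------------------------+------------+-----
 B       | 2016-08-03 21:10:32.945 |          1 |   7
 B       | 2016-08-05 20:40:55.644 |          1 |   6
 B       | 2016-08-06 10:59:00.796 |          1 |   5
 B       | 2016-08-09 22:36:50.674 |          0 |   4
 B       | 2016-08-11 09:37:21.29  |          1 |   4
 B       | 2016-08-13 10:15:15.858 |          1 |   3
 B       | 2016-08-26 09:56:48.937 |          1 |   2
 B       | 2016-08-29 09:34:13.96  |          1 |   1

The 0 on 2016-08-09 and the 1 that follows it share grp = 4, so the outer count(*) ranks them 1 and 2 within that group; every other row of B sits alone in its group and gets custom_rank = 1.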

This assumes all involved columns are NOT NULL and that (user_id, date) is unique.

Asides:

  • The first row for B gets custom_rank = 1, not 2: a 1 with no preceding 0 forms a group of its own.
  • The column is_cleared should really be boolean (see the sketch after this list).
  • A timestamp column shouldn't be called "date".
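
On the boolean point above, a sketch of how the inner count could look if is_cleared were boolean; the aggregate FILTER clause (PostgreSQL 9.4+) replaces the count(expr OR NULL) trick:

SELECT user_id, date, is_cleared
     , count(*) FILTER (WHERE is_cleared)   -- assumes is_cleared is boolean
       OVER (PARTITION BY user_id ORDER BY date DESC) AS grp
FROM   tbl;

With a boolean column, count(is_cleared OR NULL) works as well, since false OR NULL is NULL and count() ignores NULLs.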
Comments:

  • count(is_cleared = 1 OR NULL) is the new thing I learned today, thank you again!
  • @Kugutsumen: A comprehensive overview of related techniques: dba.stackexchange.com/questions/27558/…
