18

Let's say I have the following table:

 | User_id |   COL1   | COL2 |
 +---------+----------+------+
 | 1       |          | 1    |
 | 1       |          | 2    | 
 | 1       |   2421   |      | 
 | 1       |          | 1    | 
 | 1       |   3542   |      | 
 | 2       |          | 1    |

I need another column indicating the next non-null COL1 value for each row, so the result would look like the below:

 | User_id |   COL1   | COL2 | COL3 |
 +---------+----------+------+------+
 | 1       |          | 1    | 2421 |
 | 1       |          | 2    | 2421 |
 | 1       |   2421   |      |      |
 | 1       |          | 1    | 3542 |
 | 1       |   3542   |      |      |
 | 2       |          | 1    |      |

SELECT 
first_value(COL1 ignore nulls) over (partition by user_id order by COL2 rows unbounded following) 
FROM table;

would work, but I'm using PostgreSQL, which doesn't support the IGNORE NULLS clause.

Any suggested workarounds?

1
  • 2
    You need a column to specify the ordering. SQL tables are inherently unordered. Commented May 26, 2016 at 21:14

6 Answers

22

You can still do it with a window function if you add a CASE WHEN criterion to the ORDER BY, like this:

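select
   first_value(COL1) 
   over (
     partition by user_id 
     -- non-null COL1 values sort first, then by COL2
     order by case when COL1 is not null then 0 else 1 end ASC, COL2 
     -- PostgreSQL rejects a frame that starts at unbounded following, so span the whole partition
     rows between unbounded preceding and unbounded following
   ) 
from table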

This sorts the non-null values first, so first_value picks one of them.

However, performance will probably be worse than with a native skip-nulls clause, because the database has to sort on the additional criterion.
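To see what the CASE ordering returns, here is a minimal sketch on the question's data. The `sample` CTE and its `seq` ordering column are assumptions added for illustration; the original table has no such column.

```sql
-- Hypothetical sample data; "seq" is an assumed ordering column.
with sample(seq, user_id, col1, col2) as (
  values (1, 1, null::int, 1),
         (2, 1, null,      2),
         (3, 1, 2421,      null),
         (4, 1, null,      1),
         (5, 1, 3542,      null),
         (6, 2, null,      1)
)
select seq, user_id, col1,
       first_value(col1) over (
         partition by user_id
         -- non-null col1 rows sort ahead of null ones, then by seq
         order by case when col1 is not null then 0 else 1 end, seq
         rows between unbounded preceding and unbounded following
       ) as first_non_null
from sample
order by seq;
```

Every row of user 1 gets 2421 and the user 2 row gets NULL, so this yields the first non-null value per user rather than the next one relative to each row.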


3 Comments

But that's not really the same thing as the IGNORE NULLS clause.
A clause that postgresql does not support atm
This would not work if you are looking for the first non-null value after the current row. It will take the first non-null value for this user_id regardless of the position of the current row.
9

I also had the same problem. The other solutions may work, but I have to build multiple windows for each row I need.

You can try these snippets: https://wiki.postgresql.org/wiki/First/last_(aggregate)

If you create the aggregates you can use them:

SELECT 
first(COL1) over (partition by user_id order by COL2 rows between current row and unbounded following) 
FROM table;
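For reference, a minimal sketch of such an aggregate, assuming the wiki definition is roughly the following (check the page for the current version). The `sample` CTE and its `seq` ordering column are hypothetical and only there to make the example self-contained.

```sql
-- Transition function: keeps the state it already has.
-- STRICT makes the aggregate skip NULL inputs, so the first
-- non-NULL input becomes the state and is never replaced.
create function first_agg(anyelement, anyelement)
  returns anyelement
  language sql immutable strict
  as 'select $1';

create aggregate first(anyelement) (
  sfunc = first_agg,
  stype = anyelement
);

-- Hypothetical sample data; "seq" is an assumed ordering column.
with sample(seq, user_id, col1) as (
  values (1, 1, null::int), (2, 1, null), (3, 1, 2421),
         (4, 1, null),      (5, 1, 3542), (6, 2, null)
)
select seq, user_id, col1,
       first(col1) over (
         partition by user_id
         order by seq
         rows between current row and unbounded following
       ) as next_col1
from sample
order by seq;
```

Because the frame starts at the current row, a row whose col1 is already non-null returns its own value; wrap the call in `case when col1 is null then ... end` if such rows should stay empty, as in the question's expected output.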


3

There is always the tried and true approach of using a correlated subquery:

select t.*,
       (select t2.col1
        from t t2
        where t2.user_id = t.user_id  -- stay within the same user
          and t2.id >= t.id and t2.col1 is not null
        order by t2.id                -- ascending, so the nearest following row wins
        fetch first 1 row only
       ) as nextcol1
from t;

4 Comments

The t.id in the t2.id >= t.id filter isn't being found when I run this
@user3558238 . . . What do you mean it isn't being found? t is the alias of the table in the outer query; t2 is the alias in the inner query.
it's saying the t.user_id does not exist, perhaps subqueries can't refer to outer query parameters in PostgreSQL?
@user3558238 . . . Postgres definitely supports correlated subqueries. You should edit your question and include your attempt.
2
  1. Aggregate functions can be used as window functions.
  2. There's an aggregate filter clause.
  3. Window spec can tell it to order most recent first. Get them into an array:
select (array_agg(COL1) filter (where COL1 is not null) over w1)[1]
from cte
window w1 as (order by d desc
              rows between current row and unbounded following);
  4. You pop the array[1].
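Applied to the question's table, the steps above could look like the sketch below. The `sample` CTE and its `seq` ordering column are assumptions for illustration, and the window is ordered ascending because the question asks for the next value rather than the most recent one.

```sql
-- Hypothetical sample data; "seq" is an assumed ordering column.
with sample(seq, user_id, col1, col2) as (
  values (1, 1, null::int, 1),
         (2, 1, null,      2),
         (3, 1, 2421,      null),
         (4, 1, null,      1),
         (5, 1, 3542,      null),
         (6, 2, null,      1)
)
select user_id, col1, col2,
       -- only fill col3 for rows whose own col1 is NULL
       case when col1 is null then
         -- collect the non-null col1 values from here to the end
         -- of the partition and take the first one
         (array_agg(col1) filter (where col1 is not null) over w)[1]
       end as col3
from sample
window w as (partition by user_id
             order by seq
             rows between current row and unbounded following)
order by seq;
```

When no row in the frame passes the filter, array_agg returns NULL and the subscript yields NULL, so the user 2 row ends up with an empty col3.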

1 Comment

Nice thing is, it's not just skip nulls, it's anything you can express in the where and it combines lag, lead, first, last and nth_value being able to target any position, which can also be an expression switching it dynamically. Looks and performance aside, one downside might be that if you want a negative subscript (supported for json arrays but not for the regular ones), you need to flip the window, or aggregate to json array, or bounce off of upper bound: (array_agg()over w1)[count()over w1 -3] or repeat array_agg() and take its array_upper().
-2

Hope this helps,

SELECT * FROM TABLE ORDER BY COALESCE(colA, colB);

which orders by colA and if colA has NULL value it orders by colB.
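A small sketch of what that ordering does, using made-up values in a hypothetical `sample` CTE. Note that it only changes the sort order of the rows; it does not fill in a next non-null value.

```sql
-- Hypothetical sample data for illustration only.
with sample(colA, colB) as (
  values (3, 10), (null, 1), (2, 20), (null, 5)
)
select colA, colB,
       coalesce(colA, colB) as sort_key  -- colA if present, else colB
from sample
order by coalesce(colA, colB);
```

The rows come back with sort keys 1, 2, 3, 5: rows with a NULL colA are positioned by their colB value.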


-3

You can use the COALESCE() function. For your query:

SELECT 
first_value(COALESCE(COL1)) over (partition by user_id order by COL2 rows unbounded following) 
FROM table;

But I don't understand the reason for sorting by COL2, because these rows have a null value for COL2:

 | User_id |   COL1   | COL2 |
 +---------+----------+------+
 | 1       |          | 1    |
 | 1       |          | 2    | 
 | 1       |   2421   |      | <<--- null?
 | 1       |          | 1    | 
 | 1       |   3542   |      | <<--- null?
 | 2       |          | 1    |
