0

I am relatively new to PostgreSQL and am having a difficult time trying to implement the lag() function while ignoring null values. According to the docs, http://www.postgresql.org/docs/9.4/static/functions-window.html, it is not possible to simply pass a parameter to the function to achieve this.

Does anyone know of any workaround? You can make up any example but if it helps here is something to get you started:

Col_A   Col_B  Col_A_Lag  Col_B_Lag  ID  Date
VA_1    100    VA_1       100        AA  Jan-1 
null    null   VA_1       100        AA  Jan-2
null    null   VA_1       100        AA  Jan-3
VA_2    250    VA_2       250        AA  Jan-4
VA_2    300    VA_2       300        AA  Jan-5  
null    null   VA_2       300        AA  Jan-6

If I am pulling from tbl_x, here is a simple SQL script:

select
Col_A,
Col_B,
case when Col_A is null then lag(Col_A) over w else Col_A end as Col_A_Lag,
case when Col_B is null then lag(Col_B) over w else Col_B end as Col_B_Lag
from tbl_x
window w as (partition by ID order by Date)

This script will not return what I want because it includes the null values when it 'looks back' in the lag.

Thanks in advance.

1 Answer 1

3

I suppose that you can't just:

select
from tbl_x
window w as (partition by ID order by Date)
where col_a is null;

If not then you might have to:

select
Col_A,
Col_B,
case when Col_A is null
  then (select col_a
          from tbl_x x2
         where x2.id = x1.id
           and col_a is not null
           and x2.date < x1.date
      order by date desc
         limit 1)
  else Col_A
  end Col_A_Lag,
case when Col_B is null
  then (select col_b
          from tbl_x x2
         where x2.id = x1.id
           and col_b is not null
           and x2.date < x1.date
      order by date desc
         limit 1)
  else Col_B
  end Col_B_Lag
from tbl_x x1;

With appropriate indexing the performance would probably be pretty good.

create index ... (id, date desc, col_b) where col_b is not null;
Sign up to request clarification or add additional context in comments.

3 Comments

I appreciate your answer but your second solution does not work when Col_A changes. I either get all VA_1 values or VA_2 values in Col_A_Lag depending on how I sort 'date'.
Ah didn't spot that bit of logic. Edited the query to include it.
Closer... all null values are filled in with VA_1.... the last should be VA_2. Thanks.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.