postgres: using previous row value when current row value is null

Question

In my app I need to produce output (later converted to JSON) from data in a Postgres database. The data is temperature readings gathered by sensors.

The data is stored in device_history_log as follows (for data between 11am and noon):

row_id;deviceid;sensor_value_raw;last_update
104401;20865735;21.56;"2015-06-10 11:00:14"
104432;493417852;23.9;"2015-06-10 11:00:58"
104516;20865735;21.06;"2015-06-10 11:05:14"
104578;493417852;23.7;"2015-06-10 11:06:43"
104583;20865735;21.12;"2015-06-10 11:13:21"
104601;20865735;21.18;"2015-06-10 11:17:25"
104623;493417852;22.2;"2015-06-10 11:18:21"
104642;20865735;21.25;"2015-06-10 11:21:29"
104937;20865735;21.31;"2015-06-10 11:25:33"
105081;20865735;21.37;"2015-06-10 11:33:41"
105154;20865735;21.43;"2015-06-10 11:37:45"
105303;493417852;24;"2015-06-10 11:46:09"
105358;20865735;21.62;"2015-06-10 11:49:58"
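
To make the rows above easy to reproduce, they could be loaded with a setup along these lines. This is only a sketch: the question does not show the table definition, so the column types here (in particular `real` for sensor_value_raw and `bigint` for deviceid) are assumptions.

```sql
-- Hypothetical schema: column types are assumptions, the real DDL is not shown.
CREATE TABLE device_history_log (
    row_id           integer   PRIMARY KEY,
    deviceid         bigint    NOT NULL,
    sensor_value_raw real      NOT NULL,
    last_update      timestamp NOT NULL
);

-- Sample rows copied from the listing above.
INSERT INTO device_history_log (row_id, deviceid, sensor_value_raw, last_update) VALUES
    (104401, 20865735,  21.56, '2015-06-10 11:00:14'),
    (104432, 493417852, 23.9,  '2015-06-10 11:00:58'),
    (104516, 20865735,  21.06, '2015-06-10 11:05:14'),
    (104578, 493417852, 23.7,  '2015-06-10 11:06:43'),
    (104583, 20865735,  21.12, '2015-06-10 11:13:21'),
    (104601, 20865735,  21.18, '2015-06-10 11:17:25'),
    (104623, 493417852, 22.2,  '2015-06-10 11:18:21'),
    (104642, 20865735,  21.25, '2015-06-10 11:21:29'),
    (104937, 20865735,  21.31, '2015-06-10 11:25:33'),
    (105081, 20865735,  21.37, '2015-06-10 11:33:41'),
    (105154, 20865735,  21.43, '2015-06-10 11:37:45'),
    (105303, 493417852, 24,    '2015-06-10 11:46:09'),
    (105358, 20865735,  21.62, '2015-06-10 11:49:58');
```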

I want to get 5 min averages for the data and so far I'm using this query:

SELECT grid.t5 || '.000000' AS ts,
       CASE
           WHEN avg(t.sensor_value_raw) IS NULL THEN -1
           ELSE avg(t.sensor_value_raw)
       END AS sensorvalue
FROM (
    SELECT date_trunc('min', time_series) AS t5
    FROM generate_series('2015-06-10 11:00:00'::timestamp,
                         '2015-06-11 12:00:00',
                         '5 min') AS time_series
) grid
LEFT JOIN device_history_log t
       ON t.last_update >= grid.t5
      AND t.last_update <  grid.t5 + interval '5 min'
      AND (t.deviceid = 493417852 OR t.deviceid = 20865735)
GROUP BY grid.t5
ORDER BY grid.t5

The resulting data looks like this:

"2015-06-10 11:00:00.000000";23.8999996185303
"2015-06-10 11:05:00.000000";21.0599994659424
"2015-06-10 11:10:00.000000";21.1200008392334
"2015-06-10 11:15:00.000000";21.1800003051758
"2015-06-10 11:20:00.000000";21.25
"2015-06-10 11:25:00.000000";21.3099994659424
"2015-06-10 11:30:00.000000";21.3700008392334
"2015-06-10 11:35:00.000000";21.4300003051758
"2015-06-10 11:40:00.000000";-1
"2015-06-10 11:45:00.000000";22.8100004196167
"2015-06-10 11:50:00.000000";-1
"2015-06-10 11:55:00.000000";-1
"2015-06-10 12:00:00.000000";22.9250001907349

I need the averages because more than one sensor reports back.

My question is whether the row for 11:40:00 can show the previous value instead of -1. The temperature sensors only report a change in temperature, so 'no data' means 'no change'.

Also, for any leading values that are -1 (thus no previous data), I'd like them removed or ignored if possible. I can do that when parsing the data, but doing it at query level would be even better.

Thus my ideal result set would be like this:

"2015-06-10 11:00:00.000000";23.8999996185303
"2015-06-10 11:05:00.000000";21.0599994659424
"2015-06-10 11:10:00.000000";21.1200008392334
"2015-06-10 11:15:00.000000";21.1800003051758
"2015-06-10 11:20:00.000000";21.25
"2015-06-10 11:25:00.000000";21.3099994659424
"2015-06-10 11:30:00.000000";21.3700008392334
"2015-06-10 11:35:00.000000";21.4300003051758
"2015-06-10 11:40:00.000000";21.4300003051758
"2015-06-10 11:45:00.000000";22.8100004196167
"2015-06-10 11:50:00.000000";22.8100004196167
"2015-06-10 11:55:00.000000";22.8100004196167
"2015-06-10 12:00:00.000000";22.9250001907349

Thank you!

It would be great if you supply a test case with CREATE TABLE and INSERT statements for sample data and a hand-crafted "expected results" table (stackoverflow.com/tags/postgresql/info)
– Nicolai, Commented Jun 15, 2015 at 14:56
possible duplicate of PostgreSQL use value from previous row if missing
– Kristján, Commented Jun 15, 2015 at 15:02
@Kristján - I read that post but my data is aggregated with averages, which makes it a bit complicated (for me at least)...
– Kostas, Commented Jun 15, 2015 at 15:07

klin · Accepted Answer · 2015-06-15 20:09:56Z

The query below fills the empty values in the result set of your original query. The method splits the rows into partitions by a running count of the non-empty values, so each partition starts with a non-empty value followed by the empty rows after it. It then selects the first value of each partition (add * to the select list to see how it works).

WITH survey AS (
    -- Original 5-minute averages; -1 marks a bucket with no readings.
    SELECT grid.t5 || '.000000' AS ts,
           CASE
               WHEN avg(t.sensor_value_raw) IS NULL THEN -1
               ELSE avg(t.sensor_value_raw)
           END AS sensorvalue
    FROM (
        SELECT date_trunc('min', time_series) AS t5
        FROM generate_series('2015-06-10 11:00:00'::timestamp,
                             '2015-06-11 12:00:00',
                             '5 min') AS time_series
    ) grid
    LEFT JOIN device_history_log t
           ON t.last_update >= grid.t5
          AND t.last_update <  grid.t5 + interval '5 min'
          AND (t.deviceid = 493417852 OR t.deviceid = 20865735)
    GROUP BY grid.t5
    ORDER BY grid.t5
)
-- The first row of each partition holds the real value; reuse it for the empty rows.
SELECT ts,
       first_value(sensorvalue) OVER (PARTITION BY part ORDER BY ts) AS sensorvalue
FROM (
    -- part is a running count of non-empty rows, so it only increases on a real value.
    SELECT *, sum((sensorvalue != -1)::int) OVER (ORDER BY ts) AS part
    FROM survey
) alias
ORDER BY ts
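
To see how the partitioning works without the real table, here is a small self-contained sketch on made-up rows. The `survey` values, the short time labels, the `filled` and `numbered` names, and the `WHERE part > 0` filter are all illustrative assumptions and not part of the query above. The filter is one possible way to drop leading empty rows, as asked in the question, because rows before the first real value get part = 0.

```sql
-- Toy input: -1 marks an empty 5-minute bucket (values are illustrative only).
WITH survey(ts, sensorvalue) AS (
    VALUES ('11:25', -1),      -- leading empty row, no previous value exists
           ('11:30', 21.37),
           ('11:35', 21.43),
           ('11:40', -1),
           ('11:45', 22.81),
           ('11:50', -1),
           ('11:55', -1)
)
SELECT ts,
       sensorvalue,
       part,
       first_value(sensorvalue) OVER (PARTITION BY part ORDER BY ts) AS filled
FROM (
    SELECT *,
           -- running count of non-empty rows: increments only on a real value
           sum((sensorvalue <> -1)::int) OVER (ORDER BY ts) AS part
    FROM survey
) AS numbered
WHERE part > 0                 -- assumption: drop rows before the first real value
ORDER BY ts;

--   ts   | sensorvalue | part | filled
-- -------+-------------+------+--------
--  11:30 |       21.37 |    1 |  21.37
--  11:35 |       21.43 |    2 |  21.43
--  11:40 |          -1 |    2 |  21.43
--  11:45 |       22.81 |    3 |  22.81
--  11:50 |          -1 |    3 |  22.81
--  11:55 |          -1 |    3 |  22.81
```

Each empty row shares its `part` number with the last real value before it, which is why `first_value` over that partition returns the value to carry forward.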

edited Jun 15, 2015 at 20:09

answered Jun 15, 2015 at 16:37

2 Comments

Kostas Over a year ago

Thanks @klin, but I need the avg as there is more than one deviceid (temperature sensor) reporting back. I've updated the data to make it more explicit.

klin Over a year ago

@Kostas: see changed answer.
