SQL QUERY replace NULL value in a row with a value from the previous known value

Question

I have 2 columns

date   number       
----   ------
1      3           
2      NULL        
3      5           
4      NULL        
5      NULL        
6      2          
.......

I need to replace the NULL values with new values takes on the value from the last known value in the previous date in the date column eg: date=2 number = 3, date 4 and 5 number = 5 and 5. The NULL values appear randomly.

Can you please identify the brand of SQL database you're using, e.g. MySQL, Oracle, SQL-Server, and edit your question to add that tag? — Bill Karwin
– Bill Karwin, Commented Aug 28, 2009 at 4:51
Do you mean the date in a previous row, or the previous date (day -1)? Perhaps give an example of the data that you have in the columns and an example of the output that you desire. — Degan
– Degan, Commented Aug 28, 2009 at 4:56
@Bill, he may want a generic solution. Some of us actually like the ability to switch easily between DBMS' when they become a PITA :-) Still, it's very hard (maybe impossible) to do this in standard SQL, @mike, so if you have a specific DBMS in mind, by all means let us know. — paxdiablo
– paxdiablo, Commented Aug 28, 2009 at 5:11
@Pax: I agree with the goal of a portable solution, but there are also benefits to using the most clear and efficient vendor-specific solution. — Bill Karwin
– Bill Karwin, Commented Aug 28, 2009 at 5:56
@paxdiablo All problems in computer science can be solved by another level of indirection. In this case, I think a triple self-join should work... if you really want, I could try to code it up. — Neil
– Neil, Commented May 9, 2012 at 21:15

Adriaan Stander · Accepted Answer · 2009-08-28 05:02:52Z

29

If you are using Sql Server this should work

DECLARE @Table TABLE(
        ID INT,
        Val INT
)

INSERT INTO @Table (ID,Val) SELECT 1, 3
INSERT INTO @Table (ID,Val) SELECT 2, NULL
INSERT INTO @Table (ID,Val) SELECT 3, 5
INSERT INTO @Table (ID,Val) SELECT 4, NULL
INSERT INTO @Table (ID,Val) SELECT 5, NULL
INSERT INTO @Table (ID,Val) SELECT 6, 2


SELECT  *,
        ISNULL(Val, (SELECT TOP 1 Val FROM @Table WHERE ID < t.ID AND Val IS NOT NULL ORDER BY ID DESC))
FROM    @Table t

answered Aug 28, 2009 at 5:02

Adriaan Stander

167k32 gold badges294 silver badges288 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Shraddha Over a year ago

Hi.. @Adriaan - how would it work when I have some values at week x dept level and I want to replace NULLs with last known value for a dept? So to this example, adding a 3rd dimension say ID,ID1,Val. Any pointers will be very much appreciated! thank you

Bill Karwin · Accepted Answer · 2009-08-28 06:46:33Z

24

Here's a MySQL solution:

UPDATE mytable
SET number = (@n := COALESCE(number, @n))
ORDER BY date;

This is concise, but won't necessary work in other brands of RDBMS. For other brands, there might be a brand-specific solution that is more relevant. That's why it's important to tell us the brand you're using.

It's nice to be vendor-independent, as @Pax commented, but failing that, it's also nice to use your chosen brand of database to its fullest advantage.

Explanation of the above query:

@n is a MySQL user variable. It starts out NULL, and is assigned a value on each row as the UPDATE runs through rows. Where number is non-NULL, @n is assigned the value of number. Where number is NULL, the COALESCE() defaults to the previous value of @n. In either case, this becomes the new value of the number column and the UPDATE proceeds to the next row. The @n variable retains its value from row to row, so subsequent rows get values that come from the prior row(s). The order of the UPDATE is predictable, because of MySQL's special use of ORDER BY with UPDATE (this is not standard SQL).

edited Aug 28, 2009 at 6:46

answered Aug 28, 2009 at 6:01

Bill Karwin

567k87 gold badges709 silver badges869 bronze badges

1 Comment

OMG Ponies Over a year ago

COALESCE is supported on SQL Server (2000?), and Oracle 9i+, but I don't get what the @n is doing.

voutmaster · Accepted Answer · 2010-10-06 20:27:16Z

14

The best solution is the one offered by Bill Karwin. I recently had to solve this in a relatively large resultset (1000 rows with 12 columns each needing this type of "show me last non-null value if this value is null on the current row") and using the update method with a top 1 select for the previous known value (or subquery with a top 1 ) ran super slow.

I am using SQL 2005 and the syntax for a variable replacement is slightly different than mysql:

UPDATE mytable 
SET 
    @n = COALESCE(number, @n),
    number = COALESCE(number, @n)
ORDER BY date

The first set statement updates the value of the variable @n to the current row's value of 'number' if the 'number' is not null (COALESCE returns the first non-null argument you pass into it) The second set statement updates the actual column value for 'number' to itself (if not null) or the variable @n (which always contains the last non NULL value encountered).

The beauty of this approach is that there are no additional resources expended on scanning the temporary table over and over again... The in-row update of @n takes care of tracking the last non-null value.

I don't have enough rep to vote his answer up, but someone should. It's the most elegant and best performant.

answered Oct 6, 2010 at 20:27

voutmaster

4513 silver badges6 bronze badges

2 Comments

Sussch Over a year ago

SQLite does not support variables. However, if the entries have an autoincrement id and have been inserted with date as a monotonic function, then each consecutive row could reference the previous. A very specific case, though. UPDATE mytable SET number = COALESCE(number, (SELECT t.number FROM mytable t WHERE mytable.id = t.id + 1));

Christopher J. Grace Over a year ago

SQL 2012 does not support an Order By. I had to go with Adriaan Stander's answer.

APC · Accepted Answer · 2019-02-19 21:50:55Z

12

Here is the Oracle solution (10g or higher). It uses the analytic function last_value() with the ignore nulls option, which substitutes the last non-null value for the column.

SQL> select *
  2  from mytable
  3  order by id
  4  /

        ID    SOMECOL
---------- ----------
         1          3
         2
         3          5
         4
         5
         6          2

6 rows selected.

SQL> select id
  2         , last_value(somecol ignore nulls) over (order by id) somecol
  3  from mytable
  4  /

        ID    SOMECOL
---------- ----------
         1          3
         2          3
         3          5
         4          5
         5          5
         6          2

6 rows selected.

SQL>

edited Feb 19, 2019 at 21:50

answered Aug 28, 2009 at 8:00

APC

147k20 gold badges180 silver badges289 bronze badges

4 Comments

OMG Ponies Over a year ago

Could you use LAG instead of LAST_VALUE? If so, that'd make it 8i+ compatible.

APC Over a year ago

No. LAG() only works with a fixed offset. The given test data has a variable offset.

vdolez Over a year ago

I came a long way to find this answer ! IGNORE NULLS did all the magic !

Oguen Over a year ago

ignore nulls shortened my query by 30% :)

APC · Accepted Answer · 2018-06-27 14:54:02Z

9

If you're looking for a solution for Redshift, this will work with the frame clause:

SELECT date, 
       last_value(columnName ignore nulls) 
                   over (order by date
                         rows between unbounded preceding and current row) as columnName 
 from tbl

edited Jun 27, 2018 at 14:54

APC

147k20 gold badges180 silver badges289 bronze badges

answered Sep 28, 2016 at 20:31

PieCharmed

1561 silver badge6 bronze badges

Comments

Gerardo Lima · Accepted Answer · 2012-05-18 11:14:59Z

8

The following script solves this problem and only uses plain ANSI SQL. I tested this solution on SQL2008, SQLite3 and Oracle11g.

CREATE TABLE test(mysequence INT, mynumber INT);

INSERT INTO test VALUES(1, 3);
INSERT INTO test VALUES(2, NULL);
INSERT INTO test VALUES(3, 5);
INSERT INTO test VALUES(4, NULL);
INSERT INTO test VALUES(5, NULL);
INSERT INTO test VALUES(6, 2);

SELECT t1.mysequence, t1.mynumber AS ORIGINAL
, (
    SELECT t2.mynumber
    FROM test t2
    WHERE t2.mysequence = (
        SELECT MAX(t3.mysequence)
        FROM test t3
        WHERE t3.mysequence <= t1.mysequence
        AND mynumber IS NOT NULL
       )
) AS CALCULATED
FROM test t1;

answered May 18, 2012 at 11:14

Gerardo Lima

6,7613 gold badges34 silver badges51 bronze badges

Comments

Cyrus Christ · Accepted Answer · 2012-03-29 05:45:08Z

I know it is a very old forum, but I came across this while troubleshooting my problem :) just realised that the other guys have given bit complex solution to the above problem. Please see my solution below:

DECLARE @A TABLE(ID INT, Val INT)

INSERT INTO @A(ID,Val) SELECT 1, 3
INSERT INTO @A(ID,Val) SELECT 2, NULL
INSERT INTO @A(ID,Val) SELECT 3, 5
INSERT INTO @A(ID,Val) SELECT 4, NULL
INSERT INTO @A(ID,Val) SELECT 5, NULL
INSERT INTO @A(ID,Val) SELECT 6, 2

UPDATE D
    SET D.VAL = E.VAL
    FROM (SELECT A.ID C_ID, MAX(B.ID) P_ID
          FROM  @A AS A
           JOIN @A AS B ON A.ID > B.ID
          WHERE A.Val IS NULL
            AND B.Val IS NOT NULL
          GROUP BY A.ID) AS C
    JOIN @A AS D ON C.C_ID = D.ID
    JOIN @A AS E ON C.P_ID = E.ID

SELECT * FROM @A

Hope this may help someone:)

van · Accepted Answer · 2009-08-28 09:01:40Z

First of all, do you really need to store the values? You may just use the view that does the job:

SELECT  t."date",
        x."number" AS "number"
FROM    @Table t
JOIN    @Table x
    ON  x."date" = (SELECT  TOP 1 z."date"
                    FROM    @Table z
                    WHERE   z."date" <= t."date"
                        AND z."number" IS NOT NULL
                    ORDER BY z."date" DESC)

If you really do have the ID ("date") column and it is a primary key (clustered), then this query should be pretty fast. But check the query plan: it might be better to have a cover index including the Val column as well.

Also if you do not like procedures when you can avoid them, you can also use similar query for UPDATE:

UPDATE  t
SET     t."number" = x."number"
FROM    @Table t
JOIN    @Table x
    ON  x."date" = (SELECT  TOP 1 z."date"
                    FROM    @Table z
                    WHERE   z."date" < t."date" --//@note: < and not <= here, as = not required
                        AND z."number" IS NOT NULL
                    ORDER BY z."date" DESC)
WHERE   t."number" IS NULL

NOTE: the code must works on "SQL Server".

Pete Carter · Accepted Answer · 2012-10-19 21:20:51Z

1

This is the solution for MS Access.

The example table is called tab, with fields id and val.

SELECT (SELECT last(val)
          FROM tab AS temp
          WHERE tab.id >= temp.id AND temp.val IS NOT NULL) AS val2, *
  FROM tab;

edited Oct 19, 2012 at 21:20

Pete Carter

2,7313 gold badges26 silver badges35 bronze badges

answered May 9, 2012 at 8:10

George

111 bronze badge

Comments

Allen · Accepted Answer · 2020-10-03 23:46:58Z

1

This will work on Snowflake (credit to Darren Gardner):

create temp table ss (id int, val int);
insert into ss (id,val) select 1, 3;
insert into ss (id,val) select 2, null;
insert into ss (id,val) select 3, 5;
insert into ss (id,val) select 4, null;
insert into ss (id,val) select 5, null;
insert into ss (id,val) select 6, 2;

select *
      ,last_value(val ignore nulls) over 
       (order by id rows between unbounded preceding and current row) as val2
  from ss;

answered Oct 3, 2020 at 23:46

Allen

4262 silver badges8 bronze badges

Comments

OMG Ponies · Accepted Answer · 2009-08-28 05:59:51Z

0

UPDATE TABLE
   SET number = (SELECT MAX(t.number)
                  FROM TABLE t
                 WHERE t.number IS NOT NULL
                   AND t.date < date)
 WHERE number IS NULL

edited Aug 28, 2009 at 5:59

answered Aug 28, 2009 at 5:27

OMG Ponies

334k85 gold badges536 silver badges508 bronze badges

2 Comments

SquareCog Over a year ago

max(t.value) doesn't work -- you want the value with the max id < myid, not the max value. Have to order by id desc, then take top 1 (using top() or limit or rownum.. depends on specific db)

OMG Ponies Over a year ago

@SquareCog: Re-read the OP: ...replace the NULL [number column] values with new values taken from the last known [number column] value in the previous date in the date column eg: date=2 number = 3, date 4 and 5 number = 5 and 5.

Raymond Reddington · Accepted Answer · 2020-01-22 12:48:32Z

0

In case you have one identity (Id) and one common (Type) columns:

UPDATE #Table1 
SET [Type] = (SELECT TOP 1 [Type]
              FROM #Table1 t              
              WHERE t.[Type] IS NOT NULL AND 
              b.[Id] > t.[Id]
              ORDER BY t.[Id] DESC)
FROM #Table1 b
WHERE b.[Type] IS NULL

answered Jan 22, 2020 at 12:48

Raymond Reddington

1,8471 gold badge15 silver badges21 bronze badges

Comments

FelixSFD · Accepted Answer · 2016-12-30 14:18:39Z

-6

Try this:

update Projects
set KickOffStatus=2 
where KickOffStatus is null

edited Dec 30, 2016 at 14:18

FelixSFD

6,10110 gold badges46 silver badges134 bronze badges

answered Dec 30, 2016 at 13:15

Muhammad Ali

11 bronze badge

Collectives™ on Stack Overflow

SQL QUERY replace NULL value in a row with a value from the previous known value

13 Answers 13

1 Comment

1 Comment

2 Comments

4 Comments

Comments

Comments

Comments

Comments

Comments

Comments

2 Comments

Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

13 Answers 13

1 Comment

1 Comment

2 Comments

4 Comments

Comments

Comments

Comments

Comments

Comments

Comments

2 Comments

Comments

Comments

Linked

Related