I would like to fill the NaN values in the column number of df with the next value in that column:

        Id  Date        is_start        number
151256  30  2010-09-21  False           NaN
237558  30  2010-09-22  False           0.0
36922   120 2010-10-13  False           0.0
246284  80  2010-09-21  False           NaN
47655   80  2010-09-21  False           658.0

reproducible example:

import pandas as pd
import numpy as np
import datetime

sample_df = pd.DataFrame({'Id': {151256: 30, 237558: 30, 36922: 120, 246284: 80, 47655: 80},
 'Date': {151256: datetime.date(2010, 9, 21),
  237558: datetime.date(2010, 9, 22),
  36922: datetime.date(2010, 10, 13),
  246284: datetime.date(2010, 9, 21),
  47655: datetime.date(2010, 9, 21)},
 'is_start': {151256: False,
  237558: False,
  36922: False,
  246284: False,
  47655: False},
 'number': {151256: np.nan,
  237558: 0.0,
  36922: 0.0,
  246284: np.nan,
  47655: 658.0}})
sample_df

Expected output:

        Id  Date        is_start        number
151256  30  2010-09-21  False           0.0   (replaced)
237558  30  2010-09-22  False           0.0
36922   120 2010-10-13  False           0.0
246284  80  2010-09-21  False           658.0 (replaced)
47655   80  2010-09-21  False           658.0

I tried:

sample_df['number'] = sample_df.fillna(sample_df.number.shift())

but got output:

        Id  Date    is_start    number
151256  30  2010-09-21  False   30
237558  30  2010-09-22  False   30
36922   120 2010-10-13  False   120
246284  80  2010-09-21  False   80
47655   80  2010-09-21  False   80

where number took on values in the Id column. Why is this happening and what is the correct way?

    You forgot number between sample_df and .fillna, so it should be sample_df.number.fillna. Commented Oct 29, 2020 at 1:54

2 Answers

Use bfill. Note that limit=1 restricts the fill to a single NaN per gap, so only the NaN immediately before a valid value is filled.

df.number = df.number.bfill(limit=1)
Out[138]: 
151256      0.0
237558      0.0
36922       0.0
246284    658.0
47655     658.0
Name: number, dtype: float64
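
To make the effect of limit clearer, here is a small sketch on a made-up Series (the values are hypothetical, not from the question) that has two consecutive NaNs. It compares bfill() with bfill(limit=1):

```python
import numpy as np
import pandas as pd

# Hypothetical data: two consecutive NaNs followed by a valid value.
s = pd.Series([np.nan, np.nan, 1.0, np.nan, 2.0])

# Without a limit, every NaN takes the next valid value.
unlimited = s.bfill()
print(unlimited.tolist())  # [1.0, 1.0, 1.0, 2.0, 2.0]

# With limit=1, only the NaN directly before a valid value is filled,
# so the first NaN in the run of two stays NaN.
limited = s.bfill(limit=1)
print(limited.tolist())    # [nan, 1.0, 1.0, 2.0, 2.0]
```

In the question's data no two NaNs are adjacent, so both calls would give the same result there.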
2 Comments

Thank you for the answer. Could you please tell me what went wrong in my previous code?
@nilsinelabore You cannot call fillna on the whole DataFrame here; it should be called on the Series: sample_df.number.fillna(sample_df.number.shift())
BEN_YO's solution is the answer, but here is an alternative with fillna and shift(-1):

sample_df['number'] = sample_df['number'].fillna(sample_df['number'].shift(-1))
sample_df
Out[1]: 
         Id        Date  is_start  number
151256   30  2010-09-21     False     0.0
237558   30  2010-09-22     False     0.0
36922   120  2010-10-13     False     0.0
246284   80  2010-09-21     False   658.0
47655    80  2010-09-21     False   658.0

3 Comments

Do you know why it seems like number is taking on values in the Id column in my previous code?
@nilsinelabore sorry, I do not follow.
@nilsinelabore You wrote: sample_df['number'] = sample_df.fillna(sample_df.number.shift()) instead of sample_df['number'] = sample_df.number.fillna(sample_df.number.shift())
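
A rough sketch of what seems to be going on, using a trimmed copy of the question's data (only Id and number are kept, which is a simplification). DataFrame.fillna returns a whole DataFrame, and when given a Series it matches the Series index labels against column names; here those labels are row labels, so nothing should match. Assigning that whole DataFrame to the single column number is what appears to have picked up the first column (Id) in the asker's pandas version; newer versions may raise an error instead, so that assignment is left out of the code.

```python
import numpy as np
import pandas as pd

# Same 'Id' and 'number' values as in the question; other columns trimmed for brevity.
sample_df = pd.DataFrame(
    {"Id": [30, 30, 120, 80, 80],
     "number": [np.nan, 0.0, 0.0, np.nan, 658.0]},
    index=[151256, 237558, 36922, 246284, 47655],
)

# DataFrame.fillna gives back a DataFrame with all columns, not one column.
whole_frame = sample_df.fillna(sample_df["number"].shift())
print(type(whole_frame).__name__, whole_frame.shape)

# Series.fillna aligns on the row index; shift(-1) moves each value up one row,
# so every NaN is filled from the row directly below it.
fixed = sample_df["number"].fillna(sample_df["number"].shift(-1))
print(fixed.tolist())  # [0.0, 0.0, 0.0, 658.0, 658.0]
```

Note that shift() with no argument would fill from the previous row, while shift(-1) fills from the next row, which is what the question asks for.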