0
import pandas as pd
import numpy as np

data = np.array([['', 'Col1', 'Col2', 'Col3'],
                 ['Row1', 1, 2, 3],
                 ['Row2', np.nan, 5, 6],
                 ['Row3', 7, 8, 9]
                 ])

df = pd.DataFrame(data=data[1:, 1:],
                  index=data[1:,0],
                  columns=data[0,1:])


Output:
     Col1 Col2 Col3
Row1    1    2    3
Row2  nan    5    6
Row3    7    8    9

I would like to loop through the DataFrame and replace the NaN value in Row2['Col1'] (the current row in the loop) with the value in Row1['Col3'] (a different column of the previous row in the loop).

3 Answers

2

One way you can do this is to use stack, ffill, and unstack:

df.stack(dropna=False).ffill().unstack()

Output:

     Col1 Col2 Col3
Row1    1    2    3
Row2    3    5    6
Row3    7    8    9
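
The stack/ffill/unstack idea amounts to flattening the frame row by row, forward-filling, and restoring the shape. Below is a minimal sketch of that same idea written with a plain NumPy flatten instead of `stack`, in case the `dropna` argument is not available in your pandas version. It assumes the frame holds real floats with a real `np.nan` (not the string 'nan'); the sample values are copied from the question.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, built with real floats (assumption:
# the NaN is a genuine np.nan, not the string 'nan').
df = pd.DataFrame(
    [[1.0, 2.0, 3.0], [np.nan, 5.0, 6.0], [7.0, 8.0, 9.0]],
    index=["Row1", "Row2", "Row3"],
    columns=["Col1", "Col2", "Col3"],
)

# Flatten row by row, so the element before Row2/Col1 is Row1/Col3.
flat = pd.Series(df.to_numpy().ravel())

# Forward-fill along the flattened order, then restore the original shape.
filled = pd.DataFrame(
    flat.ffill().to_numpy().reshape(df.shape),
    index=df.index,
    columns=df.columns,
)

print(filled)
```

Note that a NaN in the very first cell has nothing before it and would stay NaN.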

0

You have one more thing to solve before replacing the NaN:

1st: You are building the data with np.array. An array does not accept mixed types, which means every element is coerced to a string: your nan is no longer np.nan, it is the string 'nan'.

df.applymap(type)
Out[1244]: 
               Col1           Col2           Col3
Row1  <class 'str'>  <class 'str'>  <class 'str'>
Row2  <class 'str'>  <class 'str'>  <class 'str'>
Row3  <class 'str'>  <class 'str'>  <class 'str'>

df=df.replace('nan',np.nan)
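
The coercion can be checked directly on the array from the question. The sketch below is one possible way to avoid the problem: it converts the numeric block to float before building the DataFrame, so 'nan' is parsed back into a real NaN. This is an alternative to the replace call above, not a required step.

```python
import numpy as np
import pandas as pd

# Array from the question: mixing strings and numbers forces a string dtype.
data = np.array([['', 'Col1', 'Col2', 'Col3'],
                 ['Row1', 1, 2, 3],
                 ['Row2', np.nan, 5, 6],
                 ['Row3', 7, 8, 9]])

print(data.dtype.kind)   # 'U' means a unicode string array
print(repr(data[2, 1]))  # the former np.nan is now the text 'nan'

# Convert the numeric block to float; the text 'nan' parses to a real NaN.
values = data[1:, 1:].astype(float)

df = pd.DataFrame(values, index=data[1:, 0], columns=data[0, 1:])
print(df.dtypes)
```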

2nd: I am using np.roll + combine_first to fill the NaN

df.combine_first(pd.DataFrame(np.roll(np.concatenate(df.values),1).reshape(3,3),index=df.index,columns=df.columns))
Out[1240]: 
     Col1 Col2 Col3
Row1    1    2    3
Row2    3    5    6
Row3    7    8    9

0

I apologize for not posting the actual data from my dataset so here it is:

             Open   High    Low   Last  Change  Settle   Volume  
Date                                                              
2017-05-22  51.97  52.28  51.73  **51.96**    0.49   52.05  70581.0   
2017-05-23    **NaN**  52.44  51.61  52.31    0.24   52.35   9003.0   
2017-05-24  52.34  52.63  51.91  52.05    0.23   52.12  11678.0   
2017-05-25  52.25  52.61  49.49  49.59    2.28   49.84  19721.0   
2017-05-26  49.82  50.73  49.34  50.73    0.82   50.66  11214.0 

I needed the script to find any NaN in the 'Open' column and replace it with the 'Last' value from the previous row (both highlighted here by double asterisks).

Thank you all for the posts; however, this is what ended up working:

missing = df['Open'].isnull()  # boolean mask of NaNs in 'Open'
new_open = df['Open'].copy()   # work on a copy

# loop over the rows by position; wherever 'Open' is missing,
# take 'Last' from the previous row and store it in new_open.
# Start at 1: the first row has no previous row, and i-1 would
# otherwise wrap around to the last row.
for i in range(1, missing.shape[0]):
    if missing.iloc[i]:
        new_open.iloc[i] = df['Last'].iloc[i - 1]

# replace the 'Open' values with new 'Open' values
df['Open'] = new_open
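
The same replacement can also be written without an explicit loop. A minimal sketch, assuming a frame shaped like the sample above (only the 'Open' and 'Last' columns are reproduced here): shift 'Last' down by one row and use it to fill the gaps in 'Open'.

```python
import numpy as np
import pandas as pd

# Subset of the sample data shown above ('Open' and 'Last' only).
df = pd.DataFrame(
    {
        "Open": [51.97, np.nan, 52.34, 52.25, 49.82],
        "Last": [51.96, 52.31, 52.05, 49.59, 50.73],
    },
    index=pd.to_datetime(
        ["2017-05-22", "2017-05-23", "2017-05-24", "2017-05-25", "2017-05-26"]
    ),
)
df.index.name = "Date"

# shift(1) moves each 'Last' value down one row, so row i sees the 'Last'
# of row i-1; fillna only touches rows where 'Open' is missing.
df["Open"] = df["Open"].fillna(df["Last"].shift(1))

print(df)
```

A NaN in the first row stays NaN, since there is no previous 'Last' to take.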
