Using isnull() in a pandas DataFrame to check whether a particular value is null

Question

I am editing my previous question as it was flawed. I have a DataFrame named df. Some of its columns contain negative values, zeros, and NaN. I want to replace these values and store a corresponding flag value in another DataFrame at the same index.

import numpy as np
import pandas as pd

df = pd.read_excel('Check.xlsx')
df_ph_temp = df.iloc[:, 2:5]
df_flags = pd.DataFrame(index=df.index, columns=df.columns)
flag_ph_temp = df_flags.iloc[:, 2:5]
for rowIndex, row in df_ph_temp.iterrows():
    for colIndex, value in row.items():
        if value == 0:
            df_ph_temp.loc[rowIndex, colIndex] = df_ph_temp.loc[rowIndex - 1, colIndex]
            flag_ph_temp.loc[rowIndex, colIndex] = 1
        elif value < 0:
            df_ph_temp.loc[rowIndex, colIndex] = 0
            flag_ph_temp.loc[rowIndex, colIndex] = 1
        elif value > 200:
            df_ph_temp.loc[rowIndex, colIndex] = 130
            flag_ph_temp.loc[rowIndex, colIndex] = 2
        elif value == np.nan:  # Not working... Why?
            df_ph_temp.loc[rowIndex, colIndex] = df_ph_temp.loc[rowIndex - 1, colIndex]
            flag_ph_temp.loc[rowIndex, colIndex] = 1
        else:
            continue

I am not getting any errors, but I am also not getting the desired output. The part of the program that replaces NaN values and stores the respective flag values in the flags DataFrame is not working. I think this is because the data contains more than 2 rows with NaN values. Is there a way to fix this? I tried

df_ph_temp[colIndex].fillna(method='ffill', inplace=True)

before the if condition, but I still could not achieve the desired results.

I am unable to figure it out. Kindly help.

Léo Beaucourt · Accepted Answer · 2022-04-12 12:03:52Z

With pandas, you should avoid loops. Use boolean mask filtering and slicing to fill your flag column. To detect null values, call .isnull() directly on a pandas DataFrame or Series (when you select a column), not on a single value as you did. Then use .fillna() if you want to replace the null values with something else.
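As a minimal sketch of this idea, assuming a small made-up Series s (the values are invented and are not the asker's data), .isnull() returns a boolean mask for the whole column and .fillna() returns a copy with the nulls replaced:

```python
import numpy as np
import pandas as pd

# Hypothetical sample column; the values are invented for illustration.
s = pd.Series([21.5, np.nan, 0.0, None, 250.0])

# .isnull() works element-wise on the whole Series and returns a boolean mask.
# Both np.nan and None are reported as null.
null_mask = s.isnull()

# The mask can be used directly for counting or filtering.
n_missing = int(null_mask.sum())

# .fillna() returns a new Series; the original s is left untouched.
filled = s.fillna(0)
```

The same mask can be passed to .loc to select or assign only the null rows, which is what replaces the explicit loop.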

Based on your code, the solution may look as follows. (I am not sure it will work as is; it would help if you shared some input data and the expected output.)

First, create an empty column as you did:

data['Flags'] = None

Then fill this column based on conditions on the "Temperature phase" column. Using fillna(0) to replace all null values with 0 lets you test only whether values are <= 0; this replacement is not applied to the final DataFrame:

data.loc[data['Temperature phase'].fillna(0) <= 0, "Flags"] = 1
data.loc[data['Temperature phase'] > 200, "Flags"] = 2

Now replace the Temperature phase values.

For the values equal to 0 or null, you seem to have chosen to replace them with the previous value in the DataFrame. You may be able to achieve this part with the following line, which as written handles only the null values.

data.loc[data['Temperature phase'].isnull(), 'Temperature phase'] = data['Temperature phase'].shift().loc[data.loc[data['Temperature phase'].isnull()].index]

This command first uses .shift() to shift all values in the Temperature phase column down by one row, then selects the rows where Temperature phase is null and replaces each of them with the shifted value at the same index.
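One caveat with the shift-based replacement is that .shift() only looks one row back, so in a run of consecutive nulls only the first one gets a value. The following is a rough sketch on made-up data (the values are assumptions, not the asker's file) comparing it with .ffill(), which carries the last valid value forward:

```python
import numpy as np
import pandas as pd

# Hypothetical data with two consecutive missing values.
data = pd.DataFrame({"Temperature phase": [20.0, np.nan, np.nan, 25.0, 0.0]})
col = data["Temperature phase"]

# shift() only looks one row back, so the second NaN in a run stays NaN.
via_shift = col.copy()
via_shift[col.isnull()] = col.shift()[col.isnull()]

# ffill() carries the last valid value forward across the whole run.
via_ffill = col.ffill()
```

In recent pandas releases, .ffill() is also the preferred spelling over fillna(method='ffill'), which is deprecated.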

Finally, replace the other Temperature phase values:

data.loc[data['Temperature phase'] < 0, "Temperature phase"] = 0
data.loc[data['Temperature phase'] > 200, "Temperature phase"] = 130

You don't need the flag index and so on, since the flag is filled in directly in the final DataFrame.
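Putting the steps together, here is a small end-to-end sketch on invented data. The column name follows the answer. The sample values are assumptions, and so is the choice to treat zeros like nulls by masking them before a forward fill. Note that the flags are computed from the raw values before anything is replaced:

```python
import numpy as np
import pandas as pd

# Hypothetical input; column name follows the answer, values are invented.
data = pd.DataFrame(
    {"Temperature phase": [20.0, 0.0, -5.0, 250.0, np.nan, 30.0]}
)
col = "Temperature phase"

# 1. Build the flags from the raw values, before anything is replaced.
data["Flags"] = None
data.loc[data[col].fillna(0) <= 0, "Flags"] = 1
data.loc[data[col] > 200, "Flags"] = 2

# 2. Null or zero: blank the value out, then carry the previous row's
#    value forward (assumed intent, based on the loop in the question).
to_fill = data[col].isnull() | (data[col] == 0)
data[col] = data[col].mask(to_fill).ffill()

# 3. Clamp the remaining out-of-range values.
data.loc[data[col] < 0, col] = 0
data.loc[data[col] > 200, col] = 130
```

Because the clamping runs after the forward fill, a null that follows an out-of-range row ends up with the clamped value, which matches what the row-by-row loop in the question would produce.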


7 Comments

Shraddha Jadhav Over a year ago

Thank you for your response. I will check this.

Léo Beaucourt Over a year ago

You're welcome. Don't hesitate to accept the answer if it fits your needs, so the question can be closed.

Shraddha Jadhav Over a year ago

Your answer was useful; however, I have edited my question. Your answer helped me understand mask filtering. Thank you once again.

Léo Beaucourt Over a year ago

You really need to avoid loops and use pandas mask filtering instead. Looping over a DataFrame leads to very bad performance. As for your condition value == np.nan, it cannot work: NaN never compares equal to anything, including itself. You may also be dealing with None rather than np.nan, which are two distinct objects. Use the pandas .isnull() to catch both None and NaN values.
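A short sketch of why the equality test in the question's loop never fires, and of pd.isna() as a scalar-level check (the variable names here are only illustrative):

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN is never equal to anything, itself included, so this is always False.
equal_check = value == np.nan

# pd.isna() accepts a single scalar and also recognises None.
nan_check = pd.isna(value)
none_check = pd.isna(None)
```

If a loop really cannot be avoided, replacing value == np.nan with pd.isna(value) makes that branch reachable.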

Shraddha Jadhav Over a year ago

Yes, I removed the loops and was able to achieve the desired results. Just a follow-up question: how can we combine multiple conditions to store the flag values in the respective column? I tried the following but got a ValueError:

df_flags.loc[(df[col].fillna(0) > 0 and df[col].fillna(0) <= 10), col] = 0
df_flags.loc[(df[col].fillna(0) > 25 and df[col].fillna(0) <= 50), col] = 0.5
df_flags.loc[(df[col].fillna(0) > 50 and df[col].fillna(0) <= 150), col] = 1
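The thread does not show a reply to this follow-up. A likely cause of the ValueError is the Python keyword and, which asks for the truth value of a whole Series. A possible fix, sketched on invented data (the column name "value" and the sample numbers are assumptions), is the element-wise & operator with each comparison in parentheses:

```python
import numpy as np
import pandas as pd

# Hypothetical data; the thresholds mirror the comment above.
df = pd.DataFrame({"value": [5.0, 30.0, 100.0, np.nan, 20.0]})
df_flags = pd.DataFrame(index=df.index, columns=df.columns)
col = "value"
filled = df[col].fillna(0)

# `and` needs a single True/False, which a Series cannot provide.
# Use the element-wise `&` and wrap each comparison in parentheses.
df_flags.loc[(filled > 0) & (filled <= 10), col] = 0
df_flags.loc[(filled > 25) & (filled <= 50), col] = 0.5

# Series.between() expresses the same kind of range test in one call
# (string values for `inclusive` need pandas 1.3 or newer).
df_flags.loc[filled.between(50, 150, inclusive="right"), col] = 1
```

Rows that match none of the ranges, such as the null row and the value 20 here, simply keep their empty flag.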
