How to replace values in a pandas dataframe column based on a condition

Question

I have a dataframe that looks like this

import pandas as pd
    
all_df = pd.DataFrame(
    [
        ('A','Area1','AA','A B D E','A B','D E'),
        ('B','Area1','AA','A B D E','A B','D E'),
        ('C','Area2','BB','C','C','C'),
        ('E','Area1','CC','A B D E','A B','D E'),
        ('F','Area3','BB','F G','G','F')
    ],
    columns = ['Name','Area','Type','Group','AA members','CC members']
)

    
 Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB      F G          G          F

The last row (row 4) is incorrect. Anything that is type BB should only have itself (F) in Group, AA members, CC members. In other words, it should look like this:

4    F  Area3   BB        F          F          F

To do this I tried:

Check when Type is BB and length of Group is 2:

all_df.loc[all_df['Type']== 'BB']['Group'].str.split().str.len() == 2

Iterate over every row and to find the cases like this
Make a new df with all these rows and make the Group, AA members, CC members = Name
Drop the rows where that happens in all_df.
Merge 3 back in to all_df.

Is there a better pandas way to do this?

cottontail · Accepted Answer · 2025-05-05 01:14:55Z

We can use boolean indexing to replace values here. The key here is that we don't really need to check if the Group value is correct or not; we can simply assign the Name to the latter three columns if Type is BB.

# identify rows where Type is BB
m = all_df['Type'] == 'BB'
# for Type BB rows, replace Group, AA members and CC members values by Name
all_df.loc[m, ['Group', 'AA members', 'CC members']] = all_df.loc[m, 'Name']

The above filters the rows where Type is BB and replaces the values in the relevant columns. Another way is to use mask to replace values in the relevant columns.

cols = ['Group', 'AA members', 'CC members']
all_df[cols] = all_df[cols].mask(all_df['Type'] == 'BB', all_df['Name'], axis=0)

Both return the desired output:

  Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB        F          F          F

user16836078 · Accepted Answer · 2022-05-24 06:34:05Z

2

You can try iloc and for loop.

for row in all_df.index:
    if all_df.iloc[row,2] == "BB":
        all_df.iloc[row,3:] = all_df["Name"][row]
        
all_df

  Name   Area Type    Group AA members CC members
0    A  Area1   AA  A B D E        A B        D E
1    B  Area1   AA  A B D E        A B        D E
2    C  Area2   BB        C          C          C
3    E  Area1   CC  A B D E        A B        D E
4    F  Area3   BB        F          F          F

answered May 24, 2022 at 6:34

user16836078

2 Comments

chowpay Over a year ago

I only picked the other one as an answer because it didnt have a loop. But this is super cool too. Can you explain how all_df.iloc[row,3:] works? Is it sayin from column 3 and on wards ?

user16836078 Over a year ago

No problem, as long as it solves your problem, it's always a good answer :D and yes, in fact, the values from row 3 till the last column will be replaced (If you are planning to do so). If not, the solution from @not a robot will be a better solution.

Collectives™ on Stack Overflow

How to replace values in a pandas dataframe column based on a condition

2 Answers 2

Comments

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related