How to copy a field based on a condition of another column in Python?

Question

I need to copy a column's field into a variable, based on a specific condition, and then delete it.

This dataframe contains data about some kids, each with a favourite toy and that toy's colour:

import numpy as np
import pandas as pd

data = {'Kid': ['Richard', 'Daphne', 'Andy', 'May', 'Claire', 'Mozart', 'Jane'],
        'Toy': ['Ball', 'Doll', 'Car', 'Barbie', 'Frog', 'Bear', 'Doll'],
        'Colour': ['white', np.nan, 'red', 'pink', 'green', np.nan, np.nan]
        }

df = pd.DataFrame(data, columns=['Kid', 'Toy', 'Colour'])

print(df)

The dataframe looks like this:

       Kid       Toy Colour
0  Richard      Ball  white
1   Daphne      Doll    NaN
2     Andy       Car    red
3      May    Barbie   pink
4   Claire      Frog  green
5   Mozart      Bear    NaN
6     Jane      Doll    NaN

The condition is: if a kid has a toy but the toy has no colour, save the kid and the toy in two separate arrays, keeping the order so that the entries still match:

toy_array = ["Doll", "Bear", "Doll"]
kid_array = ["Daphne", "Mozart", "Jane"]

And then delete the toy from the dataframe. So the final dataframe should look like this:

       Kid     Toy Colour
0  Richard    Ball  white
1   Daphne     NaN    NaN
2     Andy     Car    red
3      May  Barbie   pink
4   Claire    Frog  green
5   Mozart     NaN    NaN
6     Jane     NaN    NaN

I got inspired by many sources, along with this one, and I tried this:

kid_array.append(df.loc[(df['Toy'] != np.nan) & (df['Colour'] == np.nan)])
print(kid_array)

I am just starting out, and I would highly appreciate any help!

Wow, such a nice question: input data sample, output data sample, and the code you tried. Super! Happy coding!
– jezrael, Commented Mar 30, 2021 at 11:47

jezrael · Accepted Answer · 2021-03-30 11:30:54Z

1

Test for missing and non-missing values with Series.isna and Series.notna, then set the matching values in the Toy column to missing with DataFrame.loc:

mask = df['Toy'].notna() & df['Colour'].isna()

df.loc[mask, 'Toy'] = np.nan

Or with Series.mask:

df['Toy'] = df['Toy'].mask(mask)

Or with numpy.where:

df['Toy'] = np.where(mask, np.nan, df['Toy'])

print (df)
       Kid     Toy Colour
0  Richard    Ball  white
1   Daphne     NaN    NaN
2     Andy     Car    red
3      May  Barbie   pink
4   Claire    Frog  green
5   Mozart     NaN    NaN
6     Jane     NaN    NaN
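
The three variants can be compared side by side. This is a minimal sketch on a shortened, made-up version of the sample data (the names `via_loc`, `via_mask` and `via_where` are only illustrative); it suggests that all three blank the same rows, with the difference that numpy.where returns a plain array rather than a Series:

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the dataframe from the question.
df = pd.DataFrame({
    'Toy': ['Ball', 'Doll', 'Car', 'Bear'],
    'Colour': ['white', np.nan, 'red', np.nan],
})
mask = df['Toy'].notna() & df['Colour'].isna()

# Variant 1: label-based assignment, done on a copy so df stays intact.
via_loc = df.copy()
via_loc.loc[mask, 'Toy'] = np.nan

# Variant 2: Series.mask replaces values where the condition is True.
via_mask = df['Toy'].mask(mask)

# Variant 3: numpy.where returns a NumPy array, so wrap it in a Series
# to get the original index back.
via_where = pd.Series(np.where(mask, np.nan, df['Toy']), index=df.index)

for result in (via_loc['Toy'], via_mask, via_where):
    print(result.isna().tolist(), result.dropna().tolist())
```

Which one to use is mostly a matter of taste; the DataFrame.loc form modifies the dataframe in place, while the other two build a new column that has to be assigned back.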

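If you need the lists, build them before overwriting the Toy column, because afterwards the selected rows contain only NaN: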

toy_array = df.loc[mask, 'Toy'].tolist()
kid_array = df.loc[mask, 'Kid'].tolist()

print (toy_array)
['Doll', 'Bear', 'Doll']

print (kid_array)
['Daphne', 'Mozart', 'Jane']
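
The steps above can be combined into one script. The following is a minimal sketch, assuming the sample data from the question, in which the values are copied out first and the toys are removed last:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Kid': ['Richard', 'Daphne', 'Andy', 'May', 'Claire', 'Mozart', 'Jane'],
    'Toy': ['Ball', 'Doll', 'Car', 'Barbie', 'Frog', 'Bear', 'Doll'],
    'Colour': ['white', np.nan, 'red', 'pink', 'green', np.nan, np.nan],
})

# Rows where a toy exists but its colour is missing.
mask = df['Toy'].notna() & df['Colour'].isna()

# Copy the values out first; boolean indexing keeps the row order,
# so the two lists stay aligned with each other.
toy_array = df.loc[mask, 'Toy'].tolist()
kid_array = df.loc[mask, 'Kid'].tolist()

# Only now remove the toys from the dataframe.
df.loc[mask, 'Toy'] = np.nan

print(toy_array)
print(kid_array)
print(df)
```

Because `mask` is computed once and reused, it does not matter that the Toy column changes afterwards; the mask still refers to the originally selected rows.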

edited Mar 30, 2021 at 11:30

answered Mar 30, 2021 at 11:25

jezrael


1 Comment

Elisa L. Over a year ago

Thank you very much!! I learned so much from your answer!

Dattaprasad Ekavade · Answer · 2021-03-30 11:25:39Z

0

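Your logic is correct, but NaN never compares equal to anything, not even itself, so == np.nan and != np.nan cannot detect missing values. Use a dedicated check instead. numpy.isnan() only works on numeric data and raises a TypeError on string columns such as Toy and Colour, so for this dataframe use the pandas equivalents:

pd.isna() and pd.notna()

Try the following code (the second condition must select the rows where Colour is missing):

kid_array = []
kid_array.extend(df.loc[pd.notna(df['Toy']) & pd.isna(df['Colour']), 'Kid'])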
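
For background on why the equality comparison in the question never matches, here is a small sketch. The `colour` series is a made-up stand-in for the Colour column, and its dtype is set to object explicitly so the behaviour does not depend on the pandas version:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Colour column.
colour = pd.Series(['white', np.nan, 'red'], dtype=object)

# NaN is not equal to anything, including itself.
print(np.nan == np.nan)

# So an equality test against np.nan matches no rows...
eq_matches = int((colour == np.nan).sum())
# ...and an inequality test matches every row, missing or not.
ne_matches = int((colour != np.nan).sum())

# isna() is the check that actually finds the missing entry.
isna_matches = int(colour.isna().sum())
print(eq_matches, ne_matches, isna_matches)

# numpy.isnan is defined for numeric input only; on a column that
# holds strings it raises TypeError.
try:
    np.isnan(colour)
    isnan_failed = False
except TypeError:
    isnan_failed = True
print(isnan_failed)
```

This is why Series.isna / Series.notna (or pd.isna / pd.notna) are the safer choice for columns that mix strings and missing values.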

answered Mar 30, 2021 at 11:25

Dattaprasad Ekavade
