I have a DataFrame with the following structure:

   master_mac    slave_mac        uuid           rawData               
0  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                         
1  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                         
2  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                          
3  ac233fc01403  ac233f26492b     e2c56db5       ac0228  
4  ac233fc01403  e464eecba5eb     NaN            590080             
5  ac233fc01403  ac233f26492b     e2c56db5       ac0228  
6  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                          
7  ac233fc01403  ac233f26492b     e2c56db5       636800       

  • If the "uuid" column is not empty for a group (i.e., a combination of "master_mac" and "slave_mac"), then the corresponding rows should contain NaN in the "rawData" column.

The expected result is:

   master_mac    slave_mac        uuid           rawData               
0  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                         
1  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                         
2  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                          
3  ac233fc01403  ac233f26492b     e2c56db5       NaN  
4  ac233fc01403  e464eecba5eb     NaN            590080             
5  ac233fc01403  ac233f26492b     e2c56db5       NaN  
6  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                          
7  ac233fc01403  ac233f26492b     e2c56db5       NaN

Can anyone help me with this?

3 Answers

Use:

m = df['uuid'].notna()

If you need to process per group, use GroupBy.transform with GroupBy.any to test whether each group has at least one non-NaN value:

m = df['uuid'].notna().groupby([df['master_mac'],df['slave_mac']]).transform('any')

df['rawData'] = df['rawData'].mask(m)
print(df)
     master_mac     slave_mac      uuid rawData
0  ac233fc01403  ac233f26492b  e2c56db5     NaN
1  ac233fc01403  ac233f26492b  e2c56db5     NaN
2  ac233fc01403  ac233f26492b  e2c56db5     NaN
3  ac233fc01403  ac233f26492b  e2c56db5     NaN
4  ac233fc01403  e464eecba5eb       NaN  590080
5  ac233fc01403  ac233f26492b  e2c56db5     NaN
6  ac233fc01403  ac233f26492b  e2c56db5     NaN
7  ac233fc01403  ac233f26492b  e2c56db5     NaN

Or, with NumPy imported as np:

df.loc[m, 'rawData'] = np.nan
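
To see how the row-wise mask and the group-wise mask differ, here is a minimal, self-contained sketch. It rebuilds the question's frame and adds one hypothetical row (index 8, value "aabbcc"), which is not in the original data: its uuid is missing even though other rows of the same group have one.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame. Row 8 is a hypothetical addition (not in the
# original data): same group as rows 0-3, but with a missing uuid.
df = pd.DataFrame({
    "master_mac": ["ac233fc01403"] * 9,
    "slave_mac": ["ac233f26492b"] * 4 + ["e464eecba5eb"] + ["ac233f26492b"] * 4,
    "uuid": ["e2c56db5"] * 4 + [np.nan] + ["e2c56db5"] * 3 + [np.nan],
    "rawData": [np.nan, np.nan, np.nan, "ac0228", "590080",
                "ac0228", np.nan, "636800", "aabbcc"],
})

# Row-wise: True only where this particular row has a uuid.
row_mask = df["uuid"].notna()

# Group-wise: True for every row whose (master_mac, slave_mac) group
# contains at least one non-NaN uuid.
group_mask = row_mask.groupby([df["master_mac"], df["slave_mac"]]).transform("any")

# Series.mask replaces values with NaN where the mask is True.
row_wise = df["rawData"].mask(row_mask)
group_wise = df["rawData"].mask(group_mask)

print(pd.DataFrame({"row_wise": row_wise, "group_wise": group_wise}))
```

On the data exactly as posted in the question, both masks give the same result, because every row of the first group has a uuid. They only diverge for rows like the hypothetical row 8.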

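If you need to set rawData row by row based on the value of uuid in the same row, you can do it in a single .loc call (chained indexing such as df['rawData'].loc[...] = ... may assign to a temporary copy and leave df unchanged):

df.loc[df['uuid'].notna(), 'rawData'] = np.nan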

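Using DuckDB, which can query a pandas DataFrame by its variable name (here df1):

import duckdb

duckdb.sql("select master_mac, slave_mac, uuid, case when uuid is null then rawData end as rawData from df1")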

┌──────────────┬──────────────┬──────────┬─────────┐
│  master_mac  │  slave_mac   │   uuid   │ rawData │
│   varchar    │   varchar    │ varchar  │ varchar │
├──────────────┼──────────────┼──────────┼─────────┤
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ e464eecba5eb │ NULL     │ 590080  │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
└──────────────┴──────────────┴──────────┴─────────┘
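
For comparison, the same CASE logic can be sketched in plain pandas with Series.where, which keeps a value where the condition is True and puts NaN elsewhere. The small frame below is a reduced subset of the question's data (rows 3, 4 and 6, uuid and rawData columns only), used here purely for illustration.

```python
import numpy as np
import pandas as pd

# Reduced subset of the question's data, for illustration only.
df1 = pd.DataFrame({
    "uuid": ["e2c56db5", np.nan, "e2c56db5"],
    "rawData": ["ac0228", "590080", np.nan],
})

# Equivalent of: case when uuid is null then rawData end
# Keep rawData only where uuid is missing; everything else becomes NaN.
df1["rawData"] = df1["rawData"].where(df1["uuid"].isna())

print(df1)
```

Like the SQL above, this is a row-wise rule; it does not look at other rows of the same master_mac/slave_mac group.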
