2

I have this df with 9 columns

       x      y1_x      y2_x        y3_x         y4_x      
0  -17.7 -0.785430       NaN         NaN          NaN  
1  -15.0       NaN       NaN         NaN -3820.085000    
2  -12.5       NaN       NaN    2.138833          NaN   
3  -12.4       NaN       NaN    1.721205          NaN    
4  -12.1       NaN  2.227343    2.227343          NaN  

           d1        d2          d3          d4 
0    0.053884       NaN         NaN         NaN  
1         NaN       NaN         NaN    0.085000  
2         NaN       NaN    0.143237         NaN  
3         NaN       NaN    0.251180         NaN  
4         NaN  0.127343    0.440931         NaN  

Between y1_x and y4_x I can only have 1 non NaN value per row.

The condition to choose which value is removed is explained in this example:

In row 4 there are 2 values between y1_x and y4_x

The value that becomes NaN is the one from y3_x because in that same row, d3 > d2

2 Answers 2

2

you can use:

#get count of nans between ('y1_x', 'y2_x', 'y3_x', 'y4_x', 'd1', 'd2', 'd3', 'd4')
final['mask']=final.iloc[:,1:5].isna().sum(axis=1)

#if the mask is 2, it means it will be filled with nan.
count=len(final[final['mask']==2])

#We enter the loop as many as the number of rows with a mask value of 2.
for i in range(0,count):
    mask=final[final['mask']==2].iloc[[i]] #get nth row

    #get columns (('y1_x', 'y2_x', 'y3_x', 'y4_x', 'd1', 'd2', 'd3', 'd4')) and drop nans
    abc=mask[mask['mask']==2].iloc[:,1:8].dropna(axis=1,how='all')
    to_replace_col_name=abc.iloc[:,2:4].max(axis=0).reset_index()['index'][1] #which column is bigger ?

    #Once we know the name of the big column, the 2 columns before this column is the one we will change.

    replace_col=abc.columns.get_loc(to_replace_col_name) 
    replace_col=abc.columns[replace_col - 2]

    #now we know which column to change
    mask[replace_col]=np.nan
    final.loc[mask.index,:]=mask #replace row according to index name

print(final)
'''
      x     y1_x      y2_x      y3_x  ...        d2        d3     d4  mask
0 -17.7 -0.78543       NaN       NaN  ...       NaN       NaN    NaN     3
1 -15.0      NaN       NaN       NaN  ...       NaN       NaN  0.085     3
2 -12.5      NaN       NaN  2.138833  ...       NaN  0.143237    NaN     3
3 -12.4      NaN       NaN  1.721205  ...       NaN  0.251180    NaN     3
4 -12.1      NaN  2.227343       NaN  ...  0.127343  0.440931    NaN     2

'''
Sign up to request clarification or add additional context in comments.

1 Comment

Thank you! I had a trace back error yesterday when I tried this but for some reason I reopened Pycharm today and It worked perfectly without changing a thing
2

Another possible solution:

auxd = df.filter(like='d')
auxy = df.filter(like='y')
auxd.columns = auxy.columns
colmin = auxd.idxmin(axis=1)

df[auxy.columns] = auxy.apply(lambda x: x.where(colmin.eq(x.name)))

Output:

      x     y1_x      y2_x      y3_x      y4_x        d1        d2        d3     d4
0 -17.7 -0.78543       NaN       NaN       NaN  0.053884       NaN       NaN    NaN
1 -15.0      NaN       NaN       NaN -3820.085       NaN       NaN       NaN  0.085
2 -12.5      NaN       NaN  2.138833       NaN       NaN       NaN  0.143237    NaN
3 -12.4      NaN       NaN  1.721205       NaN       NaN       NaN  0.251180    NaN
4 -12.1      NaN  2.227343       NaN       NaN       NaN  0.127343  0.440931    NaN

1 Comment

Thank you! I tried this method as well, worked perfectly.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.