I have a pandas dataframe df as below:

       A    B
0   70.0   20
1    NaN   20
2   28.0  100
3   75.0  120
4   56.0   30
5   84.0   90
6    NaN  100
7   19.0   10
8   93.0   80
9   94.0   70
10  72.0   20

I am trying to replace each value of A with the average of A over all rows that share the same B value. For instance, for B = 20, I would like every A value to be the average of 70 and 72, ignoring NaN. What is the best way to do this? I am thinking along the lines of groupby, as in:

df['AA']=df.groupby('B')['A'].transform(lambda s: s=s.mean())

That did not help.

  • Added a solution, does that answer your question? Commented Jul 16, 2022 at 0:47

2 Answers

mean ignores NaN by default, so the simplest approach is:

df['AA'] = df.groupby('B')['A'].transform('mean')

Output:

       A    B    AA
0   70.0   20  71.0
1    NaN   20  71.0
2   28.0  100  28.0
3   75.0  120  75.0
4   56.0   30  56.0
5   84.0   90  84.0
6    NaN  100  28.0
7   19.0   10  19.0
8   93.0   80  93.0
9   94.0   70  94.0
10  72.0   20  71.0
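
For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies the same groupby/transform step. The variable name group_mean is only illustrative, and the commented-out line shows one way to overwrite A directly instead of adding AA.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [70.0, np.nan, 28.0, 75.0, 56.0, 84.0, np.nan, 19.0, 93.0, 94.0, 72.0],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# Per-group mean of A, broadcast back to every row of its group.
# mean skips NaN by default (skipna=True), so B = 20 gives (70 + 72) / 2.
group_mean = df.groupby("B")["A"].transform("mean")

df["AA"] = group_mean
# To replace A itself rather than add a new column:
# df["A"] = group_mean

print(df)
```

Selecting ['A'] before transform keeps the result a Series, which should stay safe if the frame later gains more columns.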

IIUC

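I created column AA so that the original values of 'A' are kept for reference; you can always write the result back to column 'A'. Rows where A is NaN are filtered out before grouping, so AA stays NaN for them.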

df['AA']=df[~df['A'].isnull()].groupby('B')['A'].transform('mean')
df
       A      B       AA
0   70.0     20     71.0
1    NaN     20      NaN
2   28.0    100     28.0
3   75.0    120     75.0
4   56.0     30     56.0
5   84.0     90     84.0
6    NaN    100      NaN
7   19.0     10     19.0
8   93.0     80     93.0
9   94.0     70     94.0
10  72.0     20     71.0
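
As a rough illustration of why rows 1 and 6 end up as NaN here: the filtered transform returns a shorter Series, and column assignment aligns on the index. The sketch below also shows one possible way to fill those gaps afterwards; the names filtered, means_by_b, and AA_filled are hypothetical and not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [70.0, np.nan, 28.0, 75.0, 56.0, 84.0, np.nan, 19.0, 93.0, 94.0, 72.0],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# Group means computed only over rows where A is present.
# The result has no entries for index labels 1 and 6.
filtered = df[df["A"].notna()].groupby("B")["A"].transform("mean")

# Assignment aligns on the index, so the missing labels become NaN in AA.
df["AA"] = filtered

# Hypothetical follow-up: look up each row's group mean by its B value
# and use it to fill the remaining gaps.
means_by_b = df.groupby("B")["A"].mean()
df["AA_filled"] = df["AA"].fillna(df["B"].map(means_by_b))

print(df)
```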
