Imagine that I have a single DataFrame like this:

df = pd.DataFrame([[1,2,3,None],[1,2,3,None],[1,2,3,None],[None,2,3,1]], columns=["A","B","C","AA"])
A  B  C  AA
1  2  3
1  2  3
   2  3  1

Column AA is actually the same as A, but it suffered a typo somewhere in the previous steps of the data processing pipeline.

How can I rename ['AA'] to ['A'] and move its non-missing values into the existing column? Example:

A B C
1 2 3
1 2 3
1 2 3

I imagine that if I do:

df['A'] = df['AA']

the null values will be copied too.
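As a quick check of that assumption, here is a minimal sketch on a copy of the frame above. The names `overwritten` and `n_missing` are only illustrative and not part of the original question.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, None], [1, 2, 3, None], [1, 2, 3, None], [None, 2, 3, 1]],
    columns=["A", "B", "C", "AA"],
)

# Plain assignment replaces every value in A, including the valid ones,
# so the missing values of AA end up in A.
overwritten = df.copy()
overwritten["A"] = overwritten["AA"]

# Count how many rows of A are now missing.
n_missing = int(overwritten["A"].isna().sum())
print(overwritten)
print(n_missing)
```

On this data, only the last row of A should keep a value after the assignment.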

So, any hints here?

3 Answers

1

You could try combine_first:

In [8]: df.assign(A=df.A.combine_first(df.AA)).drop(columns='AA')
Out[8]: 
     A  B  C
0  1.0  2  3
1  1.0  2  3
2  1.0  2  3
3  1.0  2  3
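The same idea can also be written with square-bracket indexing and the drop on its own line. This is only a sketch that rebuilds the frame from the question so it runs by itself.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, None], [1, 2, 3, None], [1, 2, 3, None], [None, 2, 3, 1]],
    columns=["A", "B", "C", "AA"],
)

# Keep A where it is present, fall back to AA where A is missing.
df["A"] = df["A"].combine_first(df["AA"])

# Drop the misspelled column once its values have been merged.
df = df.drop(columns="AA")
print(df)
```

Because `combine_first` only fills the gaps in the calling Series, values already present in A are left untouched.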
1 Comment

I just had to rework it a little to index the columns with square brackets and move the drop to a separate line, but it worked. Thanks. This is also the more complete answer, since it works even for non-numeric values.
0

Sum them both together:

df['A'] = df[['A','AA']].sum(axis=1)

Result is:

     A  B  C   AA
0  1.0  2  3  NaN
1  1.0  2  3  NaN
2  1.0  2  3  NaN
3  1.0  2  3  1.0
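Two caveats may be worth checking before relying on the sum. The sketch below uses a hypothetical frame `demo` (not the data from the question) with one row where both columns hold a value and one row where both are missing.

```python
import pandas as pd

# Hypothetical data: row 0 has a value only in A, row 1 has a value in
# both columns, row 2 has a value in neither.
demo = pd.DataFrame({"A": [1.0, 1.0, None], "AA": [None, 1.0, None]})

# Default behaviour: NaN is skipped, so an all-NaN row sums to 0.0,
# and a row with both values present is added up.
default_sum = demo[["A", "AA"]].sum(axis=1)

# min_count=1 requires at least one non-missing value per row,
# so the all-NaN row stays NaN.
strict_sum = demo[["A", "AA"]].sum(axis=1, min_count=1)

print(default_sum)
print(strict_sum)
```

If the two columns can overlap, or a row can be missing in both, summing changes the data, so it only fits the case where exactly one of the two columns is filled per row.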

0

To add to @mullinscr's answer: first sum the columns, then drop the 'AA' column:

df['A'] = df[['A','AA']].sum(axis=1)
df.drop('AA', axis=1, inplace=True)  

2 Comments

Would that work for non-numeric values? While my case is for numbers, that should be considered for a Stack Overflow answer.
That would not work for non-numeric values.
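For non-numeric data, a fill-based merge such as `combine_first` is one option. The following is a small sketch on a hypothetical frame of strings (`names` and `merged` are illustrative names only).

```python
import pandas as pd

# Hypothetical string data with the same typo pattern: each row has a
# value in either A or AA, never in both.
names = pd.DataFrame({"A": ["x", None, "z"], "AA": [None, "y", None]})

# Fill the gaps in A with the values from AA; nothing is added up,
# so the values do not need to be numeric.
merged = names["A"].combine_first(names["AA"])

names["A"] = merged
names = names.drop(columns="AA")
print(names)
```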
