
Hello (pandas, Python). To keep it short: I have a data frame with a user ID column (User_id), the user's organization in the second column (Base_org), and the organization it was merged into in the third column (Merge_org). Not every organization has been merged, so some rows have Na in the third column. The same Base_org can also appear more than once, with one row having a merge and another not, and this is intended. The data frame looks like this:

User_id Base_org Merge_org
A Apple Na
B Instagram Facebook
C Xbox Microsoft
D Google Na
E Instagram Na

I would like users who have Na to keep their Base_org, while for users who have a merged organization, the merged organization should replace their Base_org, like this:

User_id Base_org Merge_org
A Apple Na
B Facebook Facebook
C Microsoft Microsoft
D Google Na
E Instagram Na

How can I proceed?

0

4 Answers

5

A np.where option:

import numpy as np

# Keep Base_org where Merge_org is 'Na', otherwise take Merge_org
df['Base_org'] = np.where(
    df['Merge_org'].eq('Na'), df['Base_org'], df['Merge_org']
)

df:

  User_id   Base_org  Merge_org
0       A      Apple         Na
1       B   Facebook   Facebook
2       C  Microsoft  Microsoft
3       D     Google         Na
4       E  Instagram         Na
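If the missing entries are real missing values (NaN) rather than the literal string 'Na', the comparison with 'Na' never matches. The following is a minimal sketch under that assumption; the sample frame is rebuilt from the question with np.nan substituted for 'Na', and the variable names via_where and via_fillna are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, ASSUMING the gaps are real NaN values
# instead of the string 'Na'.
df = pd.DataFrame({
    "User_id": ["A", "B", "C", "D", "E"],
    "Base_org": ["Apple", "Instagram", "Xbox", "Google", "Instagram"],
    "Merge_org": [np.nan, "Facebook", "Microsoft", np.nan, np.nan],
})

# np.where variant: test for missing values with isna() instead of eq('Na').
via_where = np.where(df["Merge_org"].isna(), df["Base_org"], df["Merge_org"])

# Pure pandas variant: take Merge_org and fall back to Base_org where missing.
via_fillna = df["Merge_org"].fillna(df["Base_org"])

df["Base_org"] = via_fillna
print(df)
```

Both variants give the same values here; fillna is shorter when the gaps are already NaN.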

Comments

2

Try:

df['Base_org'] = df['Merge_org'].mask(df['Merge_org'] == 'Na').fillna(df['Base_org'])
df

Output:

  User_id   Base_org  Merge_org
0       A      Apple         Na
1       B   Facebook   Facebook
2       C  Microsoft  Microsoft
3       D     Google         Na
4       E  Instagram         Na

Comments

1

I don't know your actual intent, but it's often better to overwrite the Na values in the merged column with the base values than to overwrite the base values with the non-null merged-in values.

You can solve your direct question with a simple df.loc assignment.

df.loc[df.Merge_org != "Na", 'Base_org'] = df.Merge_org

Output:

  User_id   Base_org  Merge_org
0       A      Apple         Na
1       B   Facebook   Facebook
2       C  Microsoft  Microsoft
3       D     Google         Na
4       E  Instagram         Na

This method and similar ones erase the fact that the base values were actually Instagram and Xbox for rows B and C. If you're going to keep all three columns, you could instead fill from left to right like this, preserving both the original and the new data.

df.loc[df.Merge_org == "Na", 'Merge_org'] = df.Base_org

Output:

  User_id   Base_org  Merge_org
0       A      Apple      Apple
1       B  Instagram   Facebook
2       C       Xbox  Microsoft
3       D     Google     Google
4       E  Instagram  Instagram

This output works better for debugging and further development.
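As a self-contained illustration of why this helps with debugging, here is a rough sketch that rebuilds the sample frame from the question (assuming 'Na' is a literal string), fills Merge_org from Base_org, and then lists the rows whose organization actually changed through a merge. The variable name changed is only illustrative.

```python
import pandas as pd

# Sample data from the question; 'Na' is ASSUMED to be a literal string.
df = pd.DataFrame({
    "User_id": ["A", "B", "C", "D", "E"],
    "Base_org": ["Apple", "Instagram", "Xbox", "Google", "Instagram"],
    "Merge_org": ["Na", "Facebook", "Microsoft", "Na", "Na"],
})

# Fill the gaps in Merge_org from Base_org; Base_org itself is untouched.
df.loc[df["Merge_org"] == "Na", "Merge_org"] = df["Base_org"]

# Because both columns survive, merged rows are easy to find afterwards.
changed = df[df["Base_org"] != df["Merge_org"]]
print(changed)
```

With the first approach this comparison would no longer be possible, because Base_org would already hold the merged value.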

3 Comments

Indeed, it is simpler to do it the way you describe. At the time my approach seemed clear to me, but using your method made my task much easier. Thank you.
You're welcome. Why did you accept an answer that is harder to code and takes longer to execute?
Because it is a function that I can call whenever I want via a class. But your answer is just as relevant.
0
def replace_base_org(base_org, merge_org):
    # Prefer the merged organization when one exists
    return merge_org if merge_org != 'Na' else base_org


df['Base_org'] = df.apply(lambda row: replace_base_org(row['Base_org'], row['Merge_org']), axis=1)
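If the goal is mainly to have a reusable function, the same logic can also be wrapped without a row-wise apply, which calls the Python function once per row and tends to be slower on large frames. The sketch below is one possible way to do that; the function name replace_base_org_vectorized and its placeholder parameter are hypothetical, and 'Na' is assumed to be a literal string.

```python
import pandas as pd


def replace_base_org_vectorized(frame, placeholder="Na"):
    """Return a copy of frame where Base_org takes Merge_org when one exists.

    Hypothetical helper: `placeholder` is the value that marks "no merge".
    """
    out = frame.copy()
    has_merge = out["Merge_org"] != placeholder
    # Boolean-mask assignment updates all matching rows in one step.
    out.loc[has_merge, "Base_org"] = out.loc[has_merge, "Merge_org"]
    return out


# Sample data from the question.
df = pd.DataFrame({
    "User_id": ["A", "B", "C", "D", "E"],
    "Base_org": ["Apple", "Instagram", "Xbox", "Google", "Instagram"],
    "Merge_org": ["Na", "Facebook", "Microsoft", "Na", "Na"],
})

result = replace_base_org_vectorized(df)
print(result)
```

Returning a copy leaves the original frame unchanged, so the function can be called repeatedly, for example from a method on a class.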

2 Comments

Thank you for answering. I did not know about the apply function; I can see that it is very useful!
Sorry, that was a notation error on my part. Your answer is correct :)
