I have a dataframe with two country descriptions. Sometimes they match, sometimes they don't.

   Country Desc1  Country Desc2
1  US             US
2  US             UK
3  UK             US
4  UK             UK

I need to add another column (Country Desc3), populated for every row by a rule that returns Country Desc1 if it matches Country Desc2.

4 Answers

df['Country Desc3'] = \
    df['Country Desc1'].mask(df['Country Desc1'] != df['Country Desc2'])

df

  Country Desc1 Country Desc2 Country Desc3
0            US            US            US
1            US            UK           NaN
2            UK            US           NaN
3            UK            UK            UK
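
For anyone who wants to run this end to end, here is a minimal, self-contained sketch. The sample frame is rebuilt from the question's data, and the `alt` variable is only an illustrative name. It also shows `where`, which is the mirror image of `mask` and takes the "equal" condition instead.

```python
import pandas as pd

# Sample data rebuilt from the question (default 0-based index).
df = pd.DataFrame({
    "Country Desc1": ["US", "US", "UK", "UK"],
    "Country Desc2": ["US", "UK", "US", "UK"],
})

# mask() replaces values with NaN wherever the condition is True,
# so passing the "not equal" condition keeps only the matching rows.
df["Country Desc3"] = df["Country Desc1"].mask(
    df["Country Desc1"] != df["Country Desc2"]
)

# where() keeps values wherever the condition is True, so the
# "equal" condition gives the same result.
alt = df["Country Desc1"].where(df["Country Desc1"] == df["Country Desc2"])

print(df)
```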

Let's use iloc and join:

df['Country Desc3'] = df.apply(lambda x: x.iloc[0] if x.iloc[0] == x.iloc[1] else ', '.join(x),axis=1)

Output:

  Country Desc1 Country Desc2 Country Desc3
1            US            US            US
2            US            UK        US, UK
3            UK            US        UK, US
4            UK            UK            UK
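
One thing to watch: `', '.join(x)` joins every value in the row, so the one-liner assumes the frame holds only the two description columns. A hedged variant that selects the two columns explicitly is sketched below. The `cols` list is an illustrative name, and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the question, with the 1-based index shown there.
df = pd.DataFrame(
    {
        "Country Desc1": ["US", "US", "UK", "UK"],
        "Country Desc2": ["US", "UK", "US", "UK"],
    },
    index=[1, 2, 3, 4],
)

# Restrict the row-wise apply to the two columns being compared, so
# extra columns in the frame do not end up in the joined string.
cols = ["Country Desc1", "Country Desc2"]
df["Country Desc3"] = df[cols].apply(
    lambda row: row.iloc[0] if row.iloc[0] == row.iloc[1] else ", ".join(row),
    axis=1,
)

print(df)
```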

Try this if you need a string in the new column:

df['Country Desc3']=df.apply(lambda x: ','.join(x.unique().tolist()), axis=1)

If you need a list in the new column:

df['Country Desc3']=df.apply(lambda x: x.unique().tolist(), axis=1)

Just in case you need NaN for rows that don't match (this needs import numpy as np):

df['Country Desc3']=np.nan
df.loc[df['Country Desc1']==df['Country Desc2'],'Country Desc3']=df['Country Desc1']
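
A self-contained sketch of that last variant follows. Creating the column as float NaN and then writing strings into it may trigger a dtype warning on some recent pandas versions, so this version assumes it is acceptable to create the column with `object` dtype up front. The `match` variable is an illustrative name.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Country Desc1": ["US", "US", "UK", "UK"],
    "Country Desc2": ["US", "UK", "US", "UK"],
})

# Start with an all-NaN column that is allowed to hold strings.
df["Country Desc3"] = pd.Series(np.nan, index=df.index, dtype="object")

# Boolean mask of the rows where both descriptions agree.
match = df["Country Desc1"] == df["Country Desc2"]

# Fill only the matching rows; the right-hand side is aligned on the index.
df.loc[match, "Country Desc3"] = df["Country Desc1"]

print(df)
```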

You can do that using numpy.where as follows:

df['Country Desc3'] = np.where(df['Country Desc1']==df['Country Desc2'],df['Country Desc1'],np.nan)

This will give you:

  Country Desc1 Country Desc2 Country Desc3
1            US            US            US
2            US            UK           NaN
3            UK            US           NaN
4            UK            UK            UK

If you don't want NaN values, just replace np.nan with whatever you like, for example df['Country Desc1'] + ', ' + df['Country Desc2'], to get the two columns concatenated when they don't match.
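
As a rough sketch of that concatenation fallback, using sample data rebuilt from the question (the `match` and `combined` names are illustrative):

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Country Desc1": ["US", "US", "UK", "UK"],
    "Country Desc2": ["US", "UK", "US", "UK"],
})

match = df["Country Desc1"] == df["Country Desc2"]

# Fallback used for the rows where the two descriptions differ.
combined = df["Country Desc1"] + ", " + df["Country Desc2"]

# np.where picks from the second argument where the condition is True
# and from the third argument everywhere else.
df["Country Desc3"] = np.where(match, df["Country Desc1"], combined)

print(df)
```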
