INPUT>df1

    ColumnA ColumnB
     A1       NaN
     A1A2     NaN
     A3       NaN       

I want to update ColumnB conditionally: run a series of checks on ColumnA and append a remark to ColumnB for each check that matches. Whatever ColumnB already holds must be kept when a new string is added.

In the sample dataframe, what I want is:

  • If ColumnA contains "A1", append the string "A1" to ColumnB (without clearing its previous value).
  • If ColumnA contains "A2", append the string "A2" to ColumnB (without clearing its previous value).

OUTPUT>df1

    ColumnA ColumnB
     A1       A1
     A1A2     A1_A2
     A3       NaN       

I tried the following code, but it does not work. Could anyone give me some advice? Thanks.

df1['ColumnB'] = np.where(df1['ColumnA'].str.contains('A1'), df1['ColumnB']+"_A1",df1['ColumnB'])
df1['ColumnB'] = np.where(df1['ColumnA'].str.contains('A2'), df1['ColumnB']+"_A2",df1['ColumnB'])

2 Answers

One way using pandas.Series.str.findall with join:

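import pandas as pd
df = pd.DataFrame({"ColumnA": ["A1", "A1A2", "A3"]})
key = ["A1", "A2"]
# findall collects every key present in ColumnA; join merges them with "_"
df["ColumnB"] = df["ColumnA"].str.findall("|".join(key)).str.join("_")
print(df)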

Output:

  ColumnA ColumnB
0      A1      A1
1    A1A2   A1_A2
2      A3        
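
If the unmatched row should stay NaN, as in the question's expected output, rather than become an empty string, one possible follow-up is sketched below. It is not part of the answer above: the `tag_matches` helper is a hypothetical name, and escaping the keys with `re.escape` is an added assumption in case a key contains regex metacharacters.

```python
import re

import pandas as pd


def tag_matches(series, keys, sep="_"):
    """Join every key found in each string with `sep`; NaN when none match.

    Hypothetical helper for illustration, not part of pandas.
    """
    # re.escape keeps keys that contain regex metacharacters literal.
    pattern = "|".join(re.escape(k) for k in keys)
    joined = series.str.findall(pattern).str.join(sep)
    # Non-matching rows yield an empty list, which joins to "".
    # where() keeps values that satisfy the condition and puts NaN elsewhere.
    return joined.where(joined != "")


df = pd.DataFrame({"ColumnA": ["A1", "A1A2", "A3"]})
df["ColumnB"] = tag_matches(df["ColumnA"], ["A1", "A2"])
print(df)
```

Note that `findall` reports matches in the order they appear in the string, and a key that occurs twice is listed twice.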

You cannot append strings to np.nan. That means you would always need to check whether each value in ColumnB is still np.nan or already a string before setting its new value. If all you need is text, you can initialize ColumnB with empty strings and append the selected pieces from ColumnA:

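import pandas as pd

df = pd.DataFrame({'ColA': ['A1', 'A1A2', 'A2', 'A3']})
df['ColB'] = ''  # start from empty strings so += always concatenates text

# Append 'A1' to ColB wherever ColA contains 'A1'
df.loc[df.ColA.str.contains('A1'), 'ColB'] += 'A1'
print(df)

# Append 'A2' to ColB wherever ColA contains 'A2'
df.loc[df.ColA.str.contains('A2'), 'ColB'] += 'A2'
print(df)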

The output is:

   ColA ColB
0    A1   A1
1  A1A2   A1
2    A2     
3    A3     
   ColA  ColB
0    A1    A1
1  A1A2  A1A2
2    A2    A2
3    A3      

Note: this version is deliberately verbose, to serve as an example.
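
For the exact output asked for in the question (an underscore between remarks, NaN where nothing matched), the same idea of starting from empty strings can be put in a loop. This is only a sketch under assumptions: the key list and the "_" separator are taken from the question, and `regex=False` is added so the keys are treated as plain substrings.

```python
import pandas as pd

df = pd.DataFrame({"ColumnA": ["A1", "A1A2", "A3"]})
df["ColumnB"] = ""  # empty strings instead of NaN, so concatenation is safe

for key in ["A1", "A2"]:
    hit = df["ColumnA"].str.contains(key, regex=False)
    has_text = df["ColumnB"] != ""
    # Use "_" + key when ColumnB already holds a remark, otherwise just key.
    appended = (df["ColumnB"] + "_" + key).where(has_text, key)
    # Only rows whose ColumnA contains the key receive the new value.
    df["ColumnB"] = df["ColumnB"].mask(hit, appended)

# Rows that never matched go back to NaN, as in the question's expected output.
df["ColumnB"] = df["ColumnB"].where(df["ColumnB"] != "")
print(df)
```

Because earlier values are only ever extended, each pass keeps what previous passes wrote to ColumnB.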

1 Comment

Solved! As you mentioned, the problem occurred because I was appending strings to np.nan.
