I am trying to concatenate columns that contain string data. The problem is that I want to ignore NaN values, but I haven't found a solution.

The DataFrame looks like this:

s=pd.DataFrame({'A':['(Text,','(Text1,'],'B':['(Text2,','(Text3,'],'C':['(Text4,','(Text5,']})


        A        B        C
0   (Text,  (Text2,  (Text4,
1  (Text1,  (Text3,  (Text5,

First I remove the brackets and commas with:

sA = s['A'].str.lstrip('(').str.rstrip(',')
sB = s['B'].str.lstrip('(').str.rstrip(',')
sC = s['C'].str.lstrip('(').str.rstrip(',')

Then I concatenate the columns:

sNew = sA + ' ' +  sB + ' ' + sC

print(sNew)
0   Text Text2 Text4
1  Text1 Text3 Text5

1. Is there a better way to concatenate the columns? I have the feeling that this approach isn't very efficient. I tried applying str.lstrip to all columns at once, but it doesn't work.

2. If a cell contains NaN, the whole row becomes NaN. How can I ignore the NaN in this specific case? For example:

         A        B        C
0   (Text,  (Text2,  (Text4,
1  (Text1,  (Text3,      NaN

After removing the brackets and concatenating, my result is:

0   Text Text2 Text4
1   NaN

But I want the following result:

0   Text Text2 Text4
1  Text1 Text3 

It would be great if you have some tips for solving this problem!

2 Answers

I think you can use Kiwi's solution, with the removal of ( and , added via .strip('(,'):

import pandas as pd
import numpy as np

s=pd.DataFrame({'A':['(Text,','(Text1,'],
                'B':[np.nan,'(Text3,'],
                'C':['(Text4,',np.nan]})
print(s)

         A        B        C
0   (Text,      NaN  (Text4,
1  (Text1,  (Text3,      NaN

def concat(*args):
    strs = [str(arg).strip('(,') for arg in args if not pd.isnull(arg)]
    return ','.join(strs) if strs else np.nan
np_concat = np.vectorize(concat)

s['new'] = np_concat(s.A, s.B, s.C)
print (s)
         A        B        C          new
0   (Text,      NaN  (Text4,   Text,Text4
1  (Text1,  (Text3,      NaN  Text1,Text3
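
As an alternative to np.vectorize, the same idea can be sketched with pandas methods only. This is a minimal sketch, not a benchmarked replacement: it assumes the columns to combine are A, B and C, that every non-null value is a string, and that a space is the wanted separator (as in the question) rather than the comma used above. The name cleaned is only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({'A': ['(Text,', '(Text1,'],
                  'B': [np.nan, '(Text3,'],
                  'C': ['(Text4,', np.nan]})

# Strip '(' and ',' from both ends of every value, column by column.
# NaN cells are left as NaN by the .str accessor.
cleaned = s[['A', 'B', 'C']].apply(lambda col: col.str.strip('(,'))

# For each row, drop the NaN cells and join what is left with a space.
s['new'] = cleaned.apply(lambda row: ' '.join(row.dropna()), axis=1)
print(s)
```

Note that a row in which every cell is NaN ends up as an empty string here, not NaN as in the concat function above.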

1 Comment

That's what I need. Thanks!

You can fill the null values of your dataframe with empty strings before computing the new column. Use fillna like this:

s.fillna('', inplace=True)
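
Put together with the stripping and concatenation from the question, this could look roughly like the sketch below. It assumes the question's second example (NaN only in the last column) and uses a non-inplace fillna so the original frame is left untouched; the names filled and cleaned are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({'A': ['(Text,', '(Text1,'],
                  'B': ['(Text2,', '(Text3,'],
                  'C': ['(Text4,', np.nan]})

# Replace NaN with empty strings so that string concatenation no longer
# propagates NaN to the whole row.
filled = s.fillna('')

# Remove the leading '(' and trailing ',' from every column.
cleaned = filled.apply(lambda col: col.str.strip('(,'))

# Concatenate with spaces, then trim the space left behind by an empty cell
# at the start or end of a row.
sNew = (cleaned['A'] + ' ' + cleaned['B'] + ' ' + cleaned['C']).str.strip()
print(sNew)
```

One caveat: if the empty cell sits in a middle column, two consecutive spaces remain between the neighbouring values, so an extra clean-up step would be needed in that case.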
