
I have a pandas DataFrame (20 columns x 1e6 rows) with several name fields ['PREFIX', 'FIRST_NAME', 'MIDDLE_NAME', 'LAST_NAME', 'SUFFIX'] that I am trying to concatenate into a single field, 'FULLNAME'. The name fields often have whitespace at the beginning or end of the string, and many records have fields that are empty (e.g. SUFFIX = '').

Other answers suggest adding the fields as usual:

df['FULLNAME'] = (df['PREFIX'].str.strip() + ' ' + df['FIRST_NAME'].str.strip() + ' ' +
                  df['MIDDLE_NAME'].str.strip() + ' ' + df['LAST_NAME'].str.strip() + ' ' +
                  df['SUFFIX'].str.strip())

The only problem here is that if a field is empty, I end up with a double-space in its place.
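To make the issue concrete, here is a minimal sketch that reproduces it. The two-row sample data is hypothetical, and it assumes the stripped fields are added with a single space between them:

```python
import pandas as pd

# Hypothetical sample: the second row has an empty PREFIX, MIDDLE_NAME and SUFFIX.
df = pd.DataFrame({
    'PREFIX': ['Dr. ', ''],
    'FIRST_NAME': [' Jane', 'John '],
    'MIDDLE_NAME': ['Q.', ''],
    'LAST_NAME': ['Public ', ' Smith'],
    'SUFFIX': [' Jr.', ''],
})
name_cols = ['PREFIX', 'FIRST_NAME', 'MIDDLE_NAME', 'LAST_NAME', 'SUFFIX']

# Strip each column, then add the columns with a literal space in between.
stripped = df[name_cols].apply(lambda col: col.str.strip())
naive = (stripped['PREFIX'] + ' ' + stripped['FIRST_NAME'] + ' ' +
         stripped['MIDDLE_NAME'] + ' ' + stripped['LAST_NAME'] + ' ' +
         stripped['SUFFIX'])

print(naive.tolist())
# ['Dr. Jane Q. Public Jr.', ' John  Smith ']
```

The second row shows all three symptoms at once: a leading space, a double space where the middle name is missing, and a trailing space.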

My (long-winded) solution is the following:

df['FULLNAME'] = df[['PREFIX', 'FIRST_NAME', 'MIDDLE_NAME', 'LAST_NAME', 'SUFFIX']].apply(
    lambda x: ' '.join(' '.join(item.strip() for item in x).split()), axis=1)

This solution works, but it is relatively inefficient given that I have over a million rows. Is there a more efficient operation I can use here? I suppose I could add the fields as in the first example and then collapse any double spaces:

df['FULLNAME'] =  df['FULLNAME'].str.replace('  ', ' ')

However, that may not be a complete solution, since I do not know how many of the name fields may be blank for a given row.

1 Answer


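It's easier to join your columns row-wise with agg and then remove the extra whitespace afterwards: str.replace with a regular expression collapses each run of whitespace into a single space, and str.strip drops whatever is left at either end.

name_cols = ['PREFIX', 'FIRST_NAME', 'MIDDLE_NAME', 'LAST_NAME', 'SUFFIX']
# regex=True is needed on pandas >= 2.0, where str.replace matches literally by default
joined = df[name_cols].agg(' '.join, axis=1)
df['FULLNAME'] = joined.str.replace(r'\s+', ' ', regex=True).str.strip()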
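As a possible alternative to the row-wise join, the columns can also be concatenated with Series.str.cat, which avoids calling a Python function once per row. This is a sketch of a different technique, not the approach used in the answer, and whether it is actually faster will depend on your data and pandas version. The sample data is hypothetical, and the sketch assumes every field is a string (a missing value would turn the whole result into NaN unless na_rep is passed to str.cat).

```python
import pandas as pd

# Hypothetical sample data; real data is assumed to contain only strings.
df = pd.DataFrame({
    'PREFIX': ['Dr. ', ''],
    'FIRST_NAME': [' Jane', 'John '],
    'MIDDLE_NAME': ['Q.', ''],
    'LAST_NAME': ['Public ', ' Smith'],
    'SUFFIX': [' Jr.', ''],
})
name_cols = ['PREFIX', 'FIRST_NAME', 'MIDDLE_NAME', 'LAST_NAME', 'SUFFIX']

# Concatenate the first column with the remaining ones, separated by a space.
first, *rest = name_cols
joined = df[first].str.cat([df[col] for col in rest], sep=' ')

# Collapse every run of whitespace to one space, then trim both ends.
df['FULLNAME'] = joined.str.replace(r'\s+', ' ', regex=True).str.strip()

print(df['FULLNAME'].tolist())
# ['Dr. Jane Q. Public Jr.', 'John Smith']
```

Because the whitespace cleanup runs once on the joined string, there is no need to strip each column beforehand.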

2 Comments

Thanks! Is there any advantage to using agg over apply in this situation?
@LeChase - agg is a little more optimised than apply in this situation. They both end up doing the same thing, but agg is supposed to return a Series in any case.
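If you want to check this on your own data, a small sketch along these lines compares the two calls. The sample frame is hypothetical, and the timings are only indicative: they will vary with the data size and the pandas version, so no particular result is claimed here.

```python
import timeit

import pandas as pd

# Hypothetical sample frame, repeated to get a few thousand rows.
df = pd.DataFrame({
    'PREFIX': ['Dr.', ''],
    'FIRST_NAME': ['Jane', 'John'],
    'MIDDLE_NAME': ['Q.', ''],
    'LAST_NAME': ['Public', 'Smith'],
    'SUFFIX': ['Jr.', ''],
})
df = pd.concat([df] * 2000, ignore_index=True)
name_cols = ['PREFIX', 'FIRST_NAME', 'MIDDLE_NAME', 'LAST_NAME', 'SUFFIX']

# Both calls join the fields of each row with a single space.
via_agg = df[name_cols].agg(' '.join, axis=1)
via_apply = df[name_cols].apply(' '.join, axis=1)
print(via_agg.equals(via_apply))

# Rough timing comparison; measure on your own machine before drawing conclusions.
t_agg = timeit.timeit(lambda: df[name_cols].agg(' '.join, axis=1), number=3)
t_apply = timeit.timeit(lambda: df[name_cols].apply(' '.join, axis=1), number=3)
print(f'agg: {t_agg:.3f}s  apply: {t_apply:.3f}s')
```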
