
I have a pandas DataFrame with many string elements that look like this:

'Frost                              '

The value has many trailing whitespace characters after the word. When I compare this string to:

'Frost'

the comparison is False because of the trailing spaces.

I can solve this by iterating over every element of the DataFrame, but that is slow because of the large number of records I have.

This other approach should work, but it is not working:

rawlossDF['damage_description'] = rawlossDF['damage_description'].map(lambda x: x.strip(''))

So when I inspect an element:

rawlossDF.iloc[0]['damage_description']

It returns:

'Frost                              '

What's going on here?

2 Answers


Alternatively, you could use the vectorized str.strip method:

rawlossDF['damage_description'] = rawlossDF['damage_description'].str.strip()
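As a minimal sketch of how this behaves, the example below builds a small stand-in DataFrame (the name `df` and the sample values are made up for illustration, not taken from the question's data) and strips the column with `.str.strip()`. One practical difference from `map` with a plain lambda is that the `.str` accessor passes missing values through instead of raising.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for rawlossDF
df = pd.DataFrame({"damage_description": ["Frost      ", "  Hail ", np.nan]})

# .str.strip() removes leading and trailing whitespace from each string
# and leaves missing values as missing
stripped = df["damage_description"].str.strip()

# The comparison now succeeds for the padded value
matches = stripped == "Frost"
print(stripped.tolist())
print(matches.tolist())
```

If the column can contain missing values, `map(lambda x: x.strip())` would fail on them with an AttributeError unless `na_action='ignore'` is passed to `map`.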

1 Comment

I tried this on a dataset with 5M rows, and it took twice as long as map + lambda.

Replace your function with this:

rawlossDF['damage_description'] = rawlossDF['damage_description'].map(lambda x: x.strip())

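You almost had it right; you just needed to get rid of the '' inside strip(). Passing an empty string tells strip() to remove no characters at all, whereas calling strip() with no argument removes leading and trailing whitespace.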
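The difference can be checked on a plain Python string, independent of pandas. This is only an illustrative sketch; the variable names are arbitrary.

```python
padded = "Frost      "

# An empty character set means there is nothing to strip
unchanged = padded.strip("")

# No argument means "strip whitespace" from both ends
cleaned = padded.strip()

# rstrip() removes only trailing whitespace, which is all this value has
right_cleaned = padded.rstrip()

print(repr(unchanged))      # 'Frost      '
print(repr(cleaned))        # 'Frost'
print(repr(right_cleaned))  # 'Frost'
```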
