Remove a character from a pandas DataFrame column

Question

I have a dataframe as below:

import pandas as pd
import numpy as np
df = pd.DataFrame({'col1':['AA_L8_ZZ', 'AA_L08_YY', 'AA_L800_XX', 'AA_L0008_CC']})
df

    col1
0   AA_L8_ZZ
1   AA_L08_YY
2   AA_L800_XX
3   AA_L0008_CC

I want to remove all the 0s that come directly after the character 'L'. My expected output:

    col1
0   AA_L8_ZZ
1   AA_L8_YY
2   AA_L800_XX
3   AA_L8_CC

bigbounty · Accepted Answer · 2020-08-06 19:26:29Z

3

In [114]: import pandas as pd
     ...: import numpy as np
     ...: df = pd.DataFrame({'col1':['AA_L8_ZZ', 'AA_L08_YY', 'AA_L800_XX', 'AA_L0008_CC']})
     ...: df
Out[114]:
          col1
0     AA_L8_ZZ
1    AA_L08_YY
2   AA_L800_XX
3  AA_L0008_CC

In [115]: df.col1.str.replace("L([0]*)", "L", regex=True)
Out[115]:
0      AA_L8_ZZ
1      AA_L8_YY
2    AA_L800_XX
3      AA_L8_CC
Name: col1, dtype: object
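Whether the pattern is treated as a regular expression depends on the `regex` argument of `Series.str.replace`; its default changed from `True` to `False` in pandas 2.0. The sketch below is only an illustration of that difference, using the sample data from the question and a slightly simplified pattern (`L0+`, an assumption rather than the answer's exact pattern):

```python
import pandas as pd

df = pd.DataFrame({"col1": ["AA_L8_ZZ", "AA_L08_YY", "AA_L800_XX", "AA_L0008_CC"]})

# regex=False: the pattern is searched for as a literal string.
# None of the values contain the literal text "L0+", so nothing changes.
literal = df["col1"].str.replace("L0+", "L", regex=False)

# regex=True: "L0+" means an L followed by one or more zeros,
# and the whole match is replaced by a bare "L".
as_regex = df["col1"].str.replace(r"L0+", "L", regex=True)

print(literal.tolist())
print(as_regex.tolist())
```

Passing `regex=True` explicitly keeps the behaviour the same across pandas versions.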


sammywemmy · Accepted Answer · 2020-08-06 22:06:03Z

1

Pandas string replace suffices for this. The code below looks for one or more 0s immediately preceded by L (a lookbehind, so the L itself is not part of the match) and replaces them with an empty string:

df.col1.str.replace(r"(?<=L)0+", "", regex=True)

0      AA_L8_ZZ
1      AA_L8_YY
2    AA_L800_XX
3      AA_L8_CC
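To see what the lookbehind does and does not match, here is a small sketch using the standard `re` module. The value `AA_L0_ZZ` and the second pattern with a lookahead are hypothetical additions for illustration; they are not part of the question:

```python
import re

# Zeros directly after an "L"; the lookbehind keeps the "L" out of the match.
pattern = re.compile(r"(?<=L)0+")

# Hypothetical variant: only strip the zeros when a digit still follows,
# so a value such as "L0" is not reduced to a bare "L".
guarded = re.compile(r"(?<=L)0+(?=\d)")

samples = ["AA_L08_YY", "AA_L800_XX", "AA_L0_ZZ"]

stripped = [pattern.sub("", s) for s in samples]
kept_digit = [guarded.sub("", s) for s in samples]

print(stripped)    # ['AA_L8_YY', 'AA_L800_XX', 'AA_L_ZZ']
print(kept_digit)  # ['AA_L8_YY', 'AA_L800_XX', 'AA_L0_ZZ']
```

For the four values in the question both patterns give the same result; the guarded form only matters if the number after L can be zero.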

If you need more speed, you could drop down into plain Python with a list comprehension:

import re
df["cleaned"] = [re.sub(r"(?<=L)0+", "", entry) for entry in df.col1]
df
          col1     cleaned
0     AA_L8_ZZ    AA_L8_ZZ
1    AA_L08_YY    AA_L8_YY
2   AA_L800_XX  AA_L800_XX
3  AA_L0008_CC    AA_L8_CC
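One thing to watch with the list comprehension: `re.sub` raises a `TypeError` on non-string entries such as `None` or `NaN`, whereas the `.str` accessor passes missing values through. A minimal sketch of a guarded version follows; the helper name `strip_zeros_after_l` is made up for this example, and the pattern is compiled once up front:

```python
import re

# Compile once so the pattern is not looked up again for every row.
_ZEROS_AFTER_L = re.compile(r"(?<=L)0+")

def strip_zeros_after_l(values):
    """Remove zeros directly after 'L'; non-string entries are returned unchanged."""
    return [
        _ZEROS_AFTER_L.sub("", v) if isinstance(v, str) else v
        for v in values
    ]

cleaned = strip_zeros_after_l(
    ["AA_L8_ZZ", "AA_L08_YY", "AA_L800_XX", "AA_L0008_CC", None]
)
print(cleaned)
```

With a DataFrame this could be used as `df["cleaned"] = strip_zeros_after_l(df["col1"])`. Whether it is actually faster than `.str.replace` depends on the data and the pandas version, so it is worth timing on your own column.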
