I have a column of a Pandas DataFrame called "Steuersatz".

This column is made of the following unique strings:

array(['19,00%', '0,00%', '5,00%', '4,64%', '4,04%', '4,10%', '1,63%', '3,55%',
       '1,14%', '0,96%', '11,31%', '12,35%', '10,45%', '11,00%', '12,99%',
       '10,83%', '6,82%', '11,50%', '16,00%', '3,30%', '4,00%', '4,16%',
       '4,15%', '10,38%', '11,43%', '11,58%'], dtype=object)

I want to drop the decimals whenever they are ",00": for example, 19,00% should be displayed as 19%, while values such as 4,64% should stay unchanged.

Here is what I am doing to solve this problem:

df["Steuersatz"] = df["Steuersatz"].map("{:,.2f}%".format)
df["Steuersatz"] = df["Steuersatz"].str.replace(".", ",", regex=False)
df['Steuersatz'] = df['Steuersatz'].str.replace("19,00%","19%")
df['Steuersatz'] = df['Steuersatz'].str.replace("0,00%","0%")
df['Steuersatz'] = df['Steuersatz'].str.replace("11,00%","11%")
df['Steuersatz'] = df['Steuersatz'].str.replace("5,00%","5%")
df['Steuersatz'] = df['Steuersatz'].str.replace("4,00%","4%")
df['Steuersatz'] = df['Steuersatz'].str.replace("16,00%","16%")

This seems inefficient to me. I would like to do this automatically rather than adding a manual replacement for every value.

Many thanks for your input

1 Answer

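Why not just replace ,00 with an empty string? pd.Series.str.replace matches substrings anywhere in each value, so it performs partial matching. It can also handle regex via regex=True (that was the default before pandas 2.0; newer versions default to a literal match):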

df['Steuersatz'] = df['Steuersatz'].str.replace(",00","")

This not only removes several repeated lines from your code but also handles new cases, such as 23,00%.
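A plain ",00" replacement removes that sequence wherever it occurs in the string. For the values shown in the question that is fine, but if you want to be stricter, a regex anchored to the trailing "%" is one option. The following is only a minimal sketch: the sample DataFrame is made up for illustration, and regex=True is passed explicitly because the default differs between pandas versions.

```python
import pandas as pd

# Hypothetical sample data mirroring a few values from the question.
df = pd.DataFrame({"Steuersatz": ["19,00%", "0,00%", "4,64%", "10,00%", "11,31%"]})

# Remove ",00" only when it sits directly before the final "%".
# The lookahead (?=%$) checks for the "%" at the end without consuming it.
df["Steuersatz"] = df["Steuersatz"].str.replace(r",00(?=%$)", "", regex=True)

print(df["Steuersatz"].tolist())
# ['19%', '0%', '4,64%', '10%', '11,31%']
```

Values with non-zero decimals, such as 4,64%, are left untouched.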

1 Comment

Thanks, seems great!
