I have a DataFrame whose cells contain values like:

the change (♠)
and the new (⦻)

My desired output is:

the change
and the new

I have tried:

df.columns = df.columns.str.strip(' ()')
df=df.replace('\()','',regex=False)

but neither worked. Can anyone help? Thanks.

    Try df = df.replace('\(\)','',regex=True) Commented Jun 8, 2021 at 16:18

2 Answers

If you have this DataFrame:

              col1           col2
0           value1  the change ()
1    the change ()         value3
2           value2         value4
3  () other change            NaN

You can remove the () from the whole DataFrame:

df = df.apply(lambda x: x.str.replace(r"\s*\(\)\s*", "", regex=True))
print(df)

Prints:

           col1        col2
0        value1  the change
1    the change      value3
2        value2      value4
3  other change         NaN

EDIT: If you have df:

              col1            col2
0           value1  the change (♠)
1   the change (⦻)          value3
2           value2          value4
3  () other change             NaN

Then:

df = df.apply(lambda x: x.str.replace(r"\s*\(.*?\)\s*", "", regex=True))
print(df)

Prints:

           col1        col2
0        value1  the change
1    the change      value3
2        value2      value4
3  other change         NaN
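
To see what the pattern from the EDIT actually matches before running it on a big table, here is a minimal sketch using only Python's re module. As far as I know, pandas follows the same regex syntax when regex=True. The helper name strip_parens and the last sample string are made up for illustration; they are not from the question.

```python
import re

# Same pattern as in the EDIT above: optional whitespace, "(", as few
# characters as possible (non-greedy), ")", optional whitespace.
PATTERN = re.compile(r"\s*\(.*?\)\s*")


def strip_parens(text):
    """Hypothetical helper: drop every "(...)" group plus the whitespace around it."""
    return PATTERN.sub("", text)


samples = [
    "the change (♠)",
    "the change (⦻)",
    "() other change",
    "value1",
    "a (x) b (y) c",  # made-up sample with parentheses in the middle
]

for s in samples:
    print(repr(s), "->", repr(strip_parens(s)))

# For comparison: the greedy variant spans from the first "(" to the last ")".
print(repr(re.sub(r"\s*\(.*\)\s*", "", "a (x) b (y) c")))
```

Note that the whitespace on both sides of a match is removed, so a parenthesised group in the middle of a string joins its neighbours ("a (x) b" becomes "ab"). If that matters for your data, replacing with a single space and stripping afterwards is probably the safer choice.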

8 Comments

I have a little update in the question, can you again help? Thanks
Then I get this error: "AttributeError: Can only use .str accessor with string values!"
@sdave You probably have some numeric columns. You can apply it to a single string column instead: df['column name'] = df['column name'].str.replace(r"\s*\(.*?\)\s*", "", regex=True)
I already tried what you suggested in the comment above. It worked this way: df = df.replace(r"\s*\(.*?\)\s*", '', regex=True, inplace=False). I am double-checking the results because the table is huge, and will update afterwards. But I am not sure how inplace=False worked.
@sdave This will match all text inside the ( ) - you can play with regular expression here for example: regex101.com/r/87WnG2/1

You are almost there. Just set regex=True in your code and modify the regex to remove the surrounding spaces as well.

Input dataset

         col1             col2
0   change ()             val1
1        val2  samplestring ()
2  change 2()          val 5()

df.replace(r'\s*\(\s*\)\s*', '', regex=True, inplace=True)

Output dataset:

       col1          col2
0    change          val1
1      val2  samplestring
2  change 2         val 5

6 Comments

I have updated my question. Can you have a look, please?
In this case, we get an empty DataFrame :(
Yes, because it was only looking for blank spaces between (). You can try this: df.replace(r'\s*\(.*?\)\s*', '', regex = True, inplace = True)
stackoverflow.com/questions/67907458/… can you have a look at this please
df = df.replace(r"\s*\(.*?\)\s*", '', regex=True, inplace=False) worked this way. But I am not sure how inplace=False worked?
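
On the inplace=False question: with inplace=False (the default), DataFrame.replace returns a new DataFrame and leaves the original untouched, so the result only sticks because it is assigned back with df = .... With inplace=True the frame is modified directly and nothing needs to be assigned. A minimal sketch, assuming a small frame like the one in the question; the integer column n is made up here to stand in for the numeric columns mentioned in the comments.

```python
import pandas as pd

# Assumed example data; "n" is a hypothetical numeric column.
df = pd.DataFrame({
    "col1": ["value1", "the change (⦻)", "value2", "() other change"],
    "col2": ["the change (♠)", "value3", "value4", None],
    "n": [1, 2, 3, 4],
})

pattern = r"\s*\(.*?\)\s*"

# inplace=False (the default): a new DataFrame comes back, df is unchanged.
cleaned = df.replace(pattern, "", regex=True, inplace=False)

# inplace=True: the frame itself is modified, so no assignment is needed.
modified = df.copy()
modified.replace(pattern, "", regex=True, inplace=True)

print(df)        # still contains the parenthesised text
print(cleaned)   # parentheses removed, numeric column untouched
print(modified)  # same content as cleaned
```

Unlike the .str accessor, DataFrame.replace with regex=True only touches string values, which would explain why it does not raise the AttributeError on frames with numeric columns.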