
There is a column (spot_categories_name) in my dataframe whose values start with 'name': and end with closing brackets (}]). My goal is to strip both, so that only values like the following remain:

Craftsman

BBQ

Theatre

Coffee Shop

...


  • Try df['spot_categories_name'] = df['spot_categories_name'].map(lambda x: x.lstrip('\'name\': ')). Also, instead of pasting a picture, it would be more helpful to paste the dataframe directly as text. Commented Jan 21, 2021 at 18:06
  • It seems like this dataframe was generated inefficiently. You should try to generate the dataframe correctly in the first place. Commented Jan 21, 2021 at 18:08
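
One caveat about the lstrip suggestion in the first comment: Python's str.lstrip treats its argument as a set of characters rather than as a literal prefix, so it can also eat the start of the value itself. The sketch below illustrates this with made-up sample values (the real column contents are not shown in the question) and a hypothetical helper, trim, that removes only the exact prefix and suffix.

```python
import pandas as pd

# Hypothetical sample values; "amenity" is chosen to expose the pitfall.
s = pd.Series(["'name': 'Craftsman'}]", "'name': 'amenity'}]"])
prefix, suffix = "'name': '", "'}]"

# lstrip removes ANY leading characters found in the argument, not the exact prefix.
stripped = s.map(lambda x: x.lstrip(prefix))
print(stripped.tolist())  # the second value loses its leading "amen"

def trim(value: str) -> str:
    """Remove the exact prefix and suffix only when they are present."""
    if value.startswith(prefix):
        value = value[len(prefix):]
    if value.endswith(suffix):
        value = value[:-len(suffix)]
    return value

cleaned = s.map(trim)
print(cleaned.tolist())
```

Slicing after a startswith/endswith check keeps the sketch independent of the Python and pandas versions in use.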

2 Answers

3

Use .str.extract():

df['spot_categories_name'] = df['spot_categories_name'].str.extract(r"'name': '([^']*)'")
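
As a minimal, self-contained sketch of the extraction approach above: the sample rows are assumptions modelled on the values listed in the question, since the original dataframe was only posted as an image. Passing expand=False makes .str.extract() return a Series for a single capture group, and rows that do not match the pattern become NaN.

```python
import pandas as pd

# Assumed sample data; the real column contents were only shown as an image.
df = pd.DataFrame({
    "spot_categories_name": [
        "'name': 'Craftsman'}]",
        "'name': 'BBQ'}]",
        "'name': 'Coffee Shop'}]",
    ]
})

# Capture everything between the quotes that follow 'name': .
pattern = r"'name': '([^']*)'"

# expand=False returns a Series (one capture group) instead of a DataFrame.
df["spot_categories_name"] = df["spot_categories_name"].str.extract(pattern, expand=False)

print(df["spot_categories_name"].tolist())
# ['Craftsman', 'BBQ', 'Coffee Shop']
```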

2 Comments

Useful to see a regex solution. I actually think it is a bit more elegant than the way I proposed, if it does indeed match the test cases.
Ah, I missed "than" when I was reading quickly, haha.
1

The pandas .str.split method splits each string into a list of pieces wherever it finds the given separator.

You can then use .str[n] to get the nth entry of each list. In your case you can split on ": '" and take the last entry, then split on "'}" and take the first entry, which seems to match your test cases. Here is an example:

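import pandas as pd

# Sample data in the same shape as the question's column
test = pd.DataFrame(data=["'name': 'Craftman'}]", "'name': 'BBQ'}]"], columns=['spot_categories_name'])
# Keep what follows the last ": '", then what precedes the first "'}", and assign the result back
test['spot_categories_name'] = test['spot_categories_name'].str.split(": '").str[-1].str.split("'}").str[0]
print(test.to_dict())
# {'spot_categories_name': {0: 'Craftman', 1: 'BBQ'}}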
