This is a follow-up to an earlier SO thread.

So that this question stands on its own, I am reproducing the dataframe with a small change:

ID         Static_Text                                           Params
1      Today, <adj1> is quite Sunny. Tomorrow, <adj2>           1-10-2020  
       may be little <adj3>
1      Today, <adj1> is quite Sunny. Tomorrow, <adj2>           2-10-2020
       may be little <adj3>
1      Today, <adj1> is quite Sunny. Tomorrow, <adj2>           Cloudy
       may be little <adj3>
2      Let's have a coffee break near <adj1>, if I              Balcony
       don't get any SO reply by <adj2>
2      Let's have a coffee break near <adj1>, if I               30
       don't get any SO reply by <adj2> mins

Now I want to replace each <adjN> tag with a numbered {} placeholder, counting from zero: the first tag, <adj1>, becomes {0}, <adj2> becomes {1}, and so on. The resulting dataframe would look as follows:

ID         Static_Text                                           Params
1      Today, {0} is quite Sunny. Tomorrow, {1}              1-10-2020  
       may be little {2}
1      Today, {0} is quite Sunny. Tomorrow, {1}              2-10-2020
       may be little {2}
1      Today, {0} is quite Sunny. Tomorrow, {1}              Cloudy
       may be little {2}
2      Let's have a coffee break near {0}, if I              Balcony
       don't get any SO reply by {1}
2      Let's have a coffee break near {0}, if I              30
       don't get any SO reply by {1} mins 

I am trying the following:

def replace_angular(df):
   if '<' and '>' in df['Static_Text']:
       rep_txt = re.sub(r'\<[^>]*\>',{},df['Static_Text'])
   return rep_txt

df = df.apply(lambda x : replace_angular(x),axis=1)

But I am not sure the above snippet is right, especially how to get the numbers 0, 1, etc. inside the {}.

2 Answers

IIUC, you can pass a lambda function to str.replace as the replacement (recent pandas versions need regex=True for a callable replacement):

df["Static_Text"].str.replace(r"<[A-Za-z]+(\d+)>", lambda m: '{'+f'{int(m.group(1))-1}'+'}', regex=True)

0                     Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}
1                     Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}
2                     Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}
3         Let's have a coffee break near {0}, if I don't get any SO reply by {1}
4    Let's have a coffee break near {0}, if I don't get any SO reply by {1} mins
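
The same callback idea can be tried outside pandas with the standard re module. The sketch below is only an illustration: the names TAG, to_format_slot and replace_tags are made up here, and it assumes the placeholders are meant to be filled later with str.format, which the question does not state explicitly.

```python
import re

# Letters followed by a captured number, e.g. <adj1> or <adj16>.
TAG = re.compile(r"<[A-Za-z]+(\d+)>")

def to_format_slot(match):
    """Turn the 1-based tag number into a 0-based {n} placeholder."""
    return "{" + str(int(match.group(1)) - 1) + "}"

def replace_tags(text):
    """Replace every numbered tag in text with its {n} placeholder."""
    return TAG.sub(to_format_slot, text)

template = replace_tags(
    "Today, <adj1> is quite Sunny. Tomorrow, <adj2> may be little <adj3>"
)
print(template)
# Assumed use of the result: fill the slots with the Params values.
print(template.format("1-10-2020", "2-10-2020", "Cloudy"))
```

On a dataframe, the same function could be applied per row with df["Static_Text"].map(replace_tags).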

3 Comments

Many thanks. Two questions: 1. Can I use my own regex in place of the r"<adj(\d)>" part of str.replace? The word adj might change. 2. What exactly is {int(m.group(0)[-2])-1} doing? What if instead of 0, 1, 2 I have 0 to 15?
For the regex I'd probably use r"<[A-Za-z]+(\d+)>" instead. I updated the answer above to use m.group(1), so it should adapt to any number.
Will your regex capture a space or special character inside the tag, like <some# tag>? Can I use the regex r"<[A-Za-z0-9.#$@]>" instead?
0

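If the number in the adj tag can go above 9, change "m.group(0)[-2]" to "m.group(0)[4:-1]". The pattern also needs \d+ rather than \d, otherwise a two-digit tag is never matched.

print(df["Static_Text"].str.replace(r"<adj(\d+)>", lambda m: "{"+f"{int(m.group(0)[4:-1])-1}"+"}", regex=True))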
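
To see what the slice is doing, here is a small standalone sketch with the standard re module (the helper names slot_by_slice and slot_by_group are invented for this illustration). It compares slicing the whole match with reading the capture group, on a tag whose number has two digits.

```python
import re

def slot_by_slice(m):
    # m.group(0) is the whole tag, e.g. "<adj16>".
    # [4:-1] drops the leading "<adj" and the trailing ">".
    return "{" + str(int(m.group(0)[4:-1]) - 1) + "}"

def slot_by_group(m):
    # m.group(1) is only the digits captured by (\d+).
    return "{" + str(int(m.group(1)) - 1) + "}"

pattern = re.compile(r"<adj(\d+)>")
text = "near <adj1>, reply by <adj16>"

print(pattern.sub(slot_by_slice, text))
print(pattern.sub(slot_by_group, text))
```

Both calls produce the same string here, but the slice only works while the tag name is exactly three letters long; the capture group does not depend on the name's length.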

1 Comment

Many thanks. I don't want adj to be part of the regex, as it might change. Secondly, what if I have 15 parameters, i.e. {0}, {1} ... {15}? How is int(m.group(0)[4:-1]) going to help?
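
One possible way around both concerns, sketched here as an assumption rather than something either answer proposes, is to ignore what is inside the tag and number the placeholders by order of appearance. The names ANY_TAG and number_by_occurrence are hypothetical.

```python
import re
from itertools import count

# Anything between angle brackets counts as a tag, including spaces
# and special characters such as "<some# tag>".
ANY_TAG = re.compile(r"<[^<>]+>")

def number_by_occurrence(text):
    """Replace the n-th tag in text (counting from 0) with {n}."""
    counter = count()  # fresh 0, 1, 2, ... sequence for every string
    return ANY_TAG.sub(lambda m: "{" + str(next(counter)) + "}", text)

print(number_by_occurrence("Let's have a coffee break near <some# tag>, if I don't get any SO reply by <adj2>"))
```

This assumes the tags already appear in the order the parameters should be filled, and a tag that is repeated in one string gets a new number each time, which may or may not be what is wanted.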
