How can I replace the substring between page1/ and _type-A with 222.6 in the string l below?

l = 'https://homepage.com/home/page1/222.6 a_type-A/go'
replace_with = '222.6'

Expected result:

https://homepage.com/home/page1/222.6_type-A/go

I tried:

import re
re.sub('page1/.*?_type-A','',l, flags=re.DOTALL)

But it also removes page1/ and _type-A.

  • Try: re.sub('(?<=page1/).*?(?=_type-A)', replace_with, l) Commented Sep 28, 2022 at 10:32

3 Answers

You may use re.sub like this:

import re

l = 'https://homepage.com/home/page1/222.6 a_type-A/go'
replace_with = '222.6'

print(re.sub(r'(?<=page1/).*?(?=_type-A)', replace_with, l))

Output:

https://homepage.com/home/page1/222.6_type-A/go

RegEx Breakup:

  • (?<=page1/): Lookbehind asserting that page1/ appears immediately before the current position
  • .*?: Match 0 or more of any character except a newline, as few as possible (lazy)
  • (?=_type-A): Lookahead asserting that _type-A appears immediately after the current position
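
To illustrate why the lazy quantifier matters in the breakup above, here is a minimal sketch. The URL with two _type-A occurrences is a hypothetical input, not the one from the question; with it, the lazy and greedy forms stop at different places.

```python
import re

# Hypothetical input with two "_type-A" markers (not from the question).
url = 'https://homepage.com/home/page1/222.6 a_type-A/x_type-A/go'

# Lazy: .*? stops at the first "_type-A" after "page1/".
lazy = re.sub(r'(?<=page1/).*?(?=_type-A)', '222.6', url)

# Greedy: .* runs on to the last "_type-A" in the string.
greedy = re.sub(r'(?<=page1/).*(?=_type-A)', '222.6', url)

print(lazy)    # https://homepage.com/home/page1/222.6_type-A/x_type-A/go
print(greedy)  # https://homepage.com/home/page1/222.6_type-A/go
```

For the string in the question both forms give the same result, since _type-A occurs only once.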

You can use

import re
l = 'https://homepage.com/home/page1/222.6 a_type-A/go'
replace_with = '222.6'
# Capture the text on both sides and put it back around the new value.
print(re.sub('(page1/).*?(_type-A)', fr'\g<1>{replace_with}\2', l, flags=re.DOTALL))

Output: https://homepage.com/home/page1/222.6_type-A/go

Note that you used an empty string as the replacement argument, which is why the whole match was removed. In the snippet above, the parts before and after .*? are captured; in the replacement pattern, \g<1> refers to the first group's value and \2 to the second group's value. The unambiguous backreference form (\g<X>) is needed because a digit follows the backreference directly: a plain \1 immediately followed by 222.6 would not be read as a reference to group 1.

Since the replacement string contains no backslashes, there is no need to preprocess (escape) anything in it.
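
If the replacement value could contain backslashes, one way to sidestep escaping is to pass a function as the replacement, because re.sub inserts a function's return value literally. A minimal sketch, where the value of replace_with is a made-up example chosen only because it contains a backslash:

```python
import re

l = 'https://homepage.com/home/page1/222.6 a_type-A/go'
replace_with = r'a\1b'  # hypothetical value containing a backslash

# The function's return value is used as-is, so "\1" is not
# interpreted as a backreference here.
result = re.sub(
    '(page1/).*?(_type-A)',
    lambda m: m.group(1) + replace_with + m.group(2),
    l,
    flags=re.DOTALL,
)
print(result)  # https://homepage.com/home/page1/a\1b_type-A/go
```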

This works:

import re

l = 'https://homepage.com/home/page1/222.6 a_type-A/go'
pattern = r"(?<=page1/).*?(?=_type)"
replace_with = '222.6'

s = re.sub(pattern, replace_with, l)
print(s)

The pattern uses positive lookbehind and lookahead assertions, (?<=...) and (?=...). A match occurs only if the text is preceded and followed by what the assertions specify, but the assertions themselves are not consumed. This means that re.sub looks for text with page1/ before it and _type after it, but replaces only the part in between.
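
A quick way to see that the assertions are not consumed is to inspect the match with re.search instead of substituting. This is only an illustrative sketch; the names m and matched are introduced here and are not part of the answer above.

```python
import re

l = 'https://homepage.com/home/page1/222.6 a_type-A/go'
m = re.search(r"(?<=page1/).*?(?=_type)", l)

# Only the text between the two markers is part of the match.
matched = m.group()
print(repr(matched))  # '222.6 a'

# The markers sit just outside the matched span.
print(l[:m.start()].endswith('page1/'))  # True
print(l[m.end():].startswith('_type'))   # True
```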
