2

I have a string such as testing_7_3_4_testing.

I want to replace testing_7_3_4_testing with testing_7.3.4_testing. I have tried using str.replace(/\d_\d/, ".") and I'm getting some really weird results. Regex experts, please help!

  • Maybe the results are weird because you are using JS code in Python? Commented Jan 26, 2016 at 0:58
  • I also tried re.findall(r'[\d_\d]', str), but it only matches the first set "7_3" and not "7_3_4". Commented Jan 26, 2016 at 1:00

2 Answers

4

Try this:

import re

my_strs = [
    'testing_7_3_4_testing',
    'testing_7_3_testing',
    'testing_7_3_4_5',
    'testing_71_312_4123_testing',
]

pattern = r"""
    (\d+)      #Match a digit, one or more times, captured in group 1, followed by...
    _          #an underscore, followed by...
    (?=\d+)    #a digit, one or more times, but do not include as part of the match
"""

for my_str in my_strs:
    new_str = re.sub(pattern, r'\1.', my_str, flags=re.X)
    print(new_str)

--output:--
testing_7.3.4_testing
testing_7.3_testing
testing_7.3.4.5
testing_71.312.4123_testing

The pattern (?=\d+) is a lookahead: it requires one or more digits to follow the underscore, but does not include those digits in the match, so they remain available to start the next match.
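To see why the lookahead matters, here is a minimal sketch comparing it with a pattern that consumes the digit on both sides of the underscore. The variable names are only illustrative, and the lookahead is shortened to (?=\d), which behaves the same here.

```python
import re

s = "testing_7_3_4_testing"

# Consuming both digits: after matching "7_3", scanning resumes at "_4",
# so the "3" is no longer available to start a second match.
consuming = re.sub(r"(\d+)_(\d+)", r"\1.\2", s)

# Lookahead: only "7_" and then "3_" are consumed; the digit after each
# underscore is checked but left in place for the next match.
lookahead = re.sub(r"(\d+)_(?=\d)", r"\1.", s)

print(consuming)  # testing_7.3_4_testing
print(lookahead)  # testing_7.3.4_testing
```

The consuming version leaves the second underscore untouched because re.sub only replaces non-overlapping matches.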

4 Comments

This one is a much more clever solution.
Could you explain the pattern please?
@RafaelGutierrez, I added some comments to the pattern. The tricky part is (?=...), which is called a lookahead. It matches something further ahead in the string--but does not include that part of the match in the actual match.
This answer is only good when you do not care about any custom boundaries (well, same as the other answer). If you need to replace _ only between numbers in a string right after testing_, you'd better use a testing(?:_\d+)+ regex and then replace each _ within a lambda expression.
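The boundary-aware idea from the last comment could be sketched roughly as follows. This is one possible reading, not the commenter's exact code: the function name dot_after_prefix is hypothetical, and the pattern is a variation that captures the testing_ prefix separately so that its own underscore is kept.

```python
import re

def dot_after_prefix(s):
    """Replace "_" with "." only inside a run of numbers that directly
    follows the literal prefix "testing_" (hypothetical helper)."""
    return re.sub(
        r"(testing_)(\d+(?:_\d+)*)",
        # Keep the prefix as-is; dot-join only the numeric run.
        lambda m: m.group(1) + m.group(2).replace("_", "."),
        s,
    )

print(dot_after_prefix("testing_7_3_4_testing"))  # testing_7.3.4_testing
print(dot_after_prefix("foo_1_2_testing_7_3"))    # foo_1_2_testing_7.3
```

Unlike the lookahead pattern, this leaves number runs elsewhere in the string (such as foo_1_2) untouched.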
2

Save each digit into its own capturing group, then reference the groups in your replacement string:

>>> import re
>>> s = "testing_7_3_4_testing"
>>> re.sub(r"(\d)_(\d)_(\d)", r"\1.\2.\3", s)
'testing_7.3.4_testing'

Or we can use a replacement function, which, in contrast to the first approach, also handles a variable number of digits in the input string:

>>> def replacement(m):
...     x, y, z = m.groups()
...     return x + y.replace("_", ".") + z
... 
>>> re.sub(r"(.*?_)([0-9_]+)(_.*?)", replacement, s)
'testing_7.3.4_testing'

A non-regex approach would involve splitting by _, slicing and joining:

>>> l = s.split("_")
>>> l[0] + "_" + ".".join(l[1:-1]) + "_" + l[-1]
'testing_7.3.4_testing'

8 Comments

What if there are more than 3 numbers?
@WiktorStribiżew this particular pattern does not scale much, good point, but the digits look like a part of a version, let's see if this is good enough for the OP. Thanks.
What about a variable number of underscore-separated numbers, e.g. testing_7_3_4_testing or testing_3_4_testing?
@WiktorStribiżew okay, updated with a more scalable option.
@RafaelGutierrez the second option should cover variable numbers in the input string. Thanks.