I have bytes-like data such as 'foo\x20\x20\x08\x08bar'.

I need the backspaces ('\x08') to be evaluated when, and only when, they are preceded by an identical number of spaces ('\x20').

x = re.sub('\x20+\x08+', '', t) is the naive way of doing this, but it fails to produce the correct output when t = 'foo\x20\x20\x08'.
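
To make the failure concrete, here is a minimal sketch of the naive substitution on that input. It assumes each backspace should erase exactly one preceding space, so one of the two spaces ought to survive; the variable name `naive` is only illustrative.

```python
import re

t = 'foo\x20\x20\x08'

# The naive pattern removes the whole run of spaces and backspaces,
# regardless of how many of each there are.
naive = re.sub('\x20+\x08+', '', t)

# Both spaces are gone, although only one backspace was present.
print(repr(naive))  # 'foo'
```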

Is there a way to define a regular expression that takes the length of a previous group into account when matching the second group, or do I need to do this manually with re.finditer and Match.span(), re-checking the preceding blocks myself?

1 Answer

An alternative is to pass a lambda to re.sub:

>>> import re
>>> pat = '(\x20+)(\x08+)'
>>> repl = lambda m: m.group(1)[:-len(m.group(2))]

Now:

>>> re.sub(pat, repl, 'foo\x20\x20\x08bar')
'foo bar'
>>> re.sub(pat, repl, 'foo\x20\x20\x08\x08bar')
'foobar'
>>> re.sub(pat, repl, 'foo\x20\x20\x08\x08\x08bar')
'foobar'
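
If the lambda is hard to read, the same replacement can be spelled out as a named function. This is only a sketch: `PAT`, `cancel_backspaces`, `cancel_keep_surplus` and `apply_backspaces` are hypothetical names, not part of the `re` module. The second function is a variant that assumes surplus backspaces should be kept rather than dropped, which may or may not be what the question intends.

```python
import re

# Group 1: a run of spaces. Group 2: the run of backspaces that follows it.
PAT = re.compile('(\x20+)(\x08+)')

def cancel_backspaces(m):
    """Same logic as the lambda: each backspace erases one space."""
    spaces, backspaces = m.group(1), m.group(2)
    # Group 2 always has at least one character, so the slice end is
    # never -0. Backspaces beyond the number of spaces are discarded.
    return spaces[:-len(backspaces)]

def cancel_keep_surplus(m):
    """Variant (an assumption): leave unmatched backspaces in the output."""
    spaces, backspaces = m.group(1), m.group(2)
    n = min(len(spaces), len(backspaces))
    return spaces[:len(spaces) - n] + backspaces[n:]

def apply_backspaces(text, repl=cancel_backspaces):
    # re.sub calls repl once per match and inserts its return value.
    return PAT.sub(repl, text)

print(repr(apply_backspaces('foo\x20\x20\x08bar')))
print(repr(apply_backspaces('foo\x20\x20\x08\x08\x08bar', cancel_keep_surplus)))
```

The surviving spaces are whatever is left of group 1 after one space has been removed per backspace, which is why a space remains only when there are more spaces than backspaces.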

2 Comments

@vks see re.sub and the 2nd example there. The repl argument can be a function which receives a match object and returns a string.
That I know; I have used it before. The use of lambda is somewhat confusing. Also, when exactly does the space have to be put in?
