0

I'm trying to parse some text using regex. I would like a combination of strings to register as a single match when the whole combination appears, but each individual string to be matched on its own when it appears alone. For example, I want to match either foo bar or either individual string, so that I get:

text = 'foo bar bar foo'
In: re.findall(some_pattern, text)
Out: ['foo bar', 'bar', 'foo']

Using some_pattern = re.compile(r'foo|bar') returns ['foo', 'bar', 'bar', 'foo'], and I can't think of a pattern that would make this work. How can I capture this?

3 Answers

2

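You can list the alternatives with |, putting the longest one first. The regex engine tries alternatives left to right, so foo bar must come before foo: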

import re
print(re.findall('foo bar|foo|bar', 'foo bar bar foo'))

Output:

['foo bar', 'bar', 'foo']
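
The same idea could be generalised to an arbitrary list of phrases. The following is a minimal sketch, not a definitive implementation: build_pattern is a hypothetical helper name (it is not part of re). It sorts the phrases longest first because re tries alternatives left to right, and the last lines show what happens when the short alternatives come first.

```python
import re

def build_pattern(phrases):
    """Hypothetical helper: join phrases into one alternation, longest first."""
    # Longest phrases first, so 'foo bar' is tried before 'foo'.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape keeps any regex metacharacters in the phrases literal.
    return re.compile('|'.join(re.escape(p) for p in ordered))

text = 'foo bar bar foo'
pattern = build_pattern(['foo', 'bar', 'foo bar'])
matches = pattern.findall(text)
print(matches)  # ['foo bar', 'bar', 'foo']

# With the short alternatives first, the combined phrase is never reached.
wrong_order = re.findall('foo|bar|foo bar', text)
print(wrong_order)  # ['foo', 'bar', 'bar', 'foo']
```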

1

Another way to do it: foo(?: bar)?|bar
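
A quick check of this pattern, as a sketch: the optional non-capturing group (?: bar)? extends a foo match when ' bar' follows, and the |bar branch picks up a standalone bar. The second call is only there to illustrate why the group should be non-capturing when used with findall.

```python
import re

pattern = re.compile(r'foo(?: bar)?|bar')
result = pattern.findall('foo bar bar foo')
print(result)  # ['foo bar', 'bar', 'foo']

# With a capturing group, findall reports the group contents
# instead of the whole match.
captured = re.findall(r'foo( bar)?|bar', 'foo bar bar foo')
print(captured)  # [' bar', '', '']
```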

1

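This could also be written with optional whitespace between the two words, so that foobar counts as the combination too. Note that the pattern should not use capturing groups here; with them, findall returns tuples of the groups instead of the matched text:

import re
print(re.findall(r'foo\s?bar|foo|bar', 'foo bar bar foo'))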

Output:

['foo bar', 'bar', 'foo']
