22

I've got a list of exact patterns that I want to search for in a given string. My current solution is pretty bad:

pat1 = re.compile('foo.trailingString')
mat1 = pat1.match(mystring)

pat2 = re.compile('bar.trailingString')
mat2 = pat2.match(mystring)

if mat1 or mat2:
    # Do whatever

pat = re.compile('[foo|bar].trailingString')
match = pat.match(mystring) # Doesn't work

The only condition is that I've got a list of strings which are to be matched exactly. What's the best possible solution in Python?

EDIT: The search patterns share a common trailing part.

5 Answers

32

You could do a trivial regex that combines those two:

pat = re.compile('foo|bar')
if pat.match(mystring):
    # Do whatever

You could then expand the regex to do whatever you need, using the | separator (which means "or" in regex syntax).

Edit: Based upon your recent edit, this should do it for you:

pat = re.compile(r'(foo|bar)\.trailingString')
if pat.match(mystring):
    # Do Whatever

The [] is a character class, so your [foo|bar] matches exactly one character out of f, o, |, b, a, r (there's no *, + or ? after the class to repeat it). () encloses a sub-pattern, which is what you need to alternate between whole words.
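
To make the difference concrete, here is a small sketch. The sample strings are made up for illustration and are not from the question.

```python
import re

# Character class: matches exactly ONE character out of f, o, |, b, a, r.
char_class = re.compile(r'[foo|bar]\.trailingString')
# Group with alternation: matches the whole word "foo" or the whole word "bar".
group = re.compile(r'(foo|bar)\.trailingString')

# Hypothetical inputs, chosen to show where the two patterns disagree.
samples = ['foo.trailingString', 'bar.trailingString', 'f.trailingString']

class_hits = [s for s in samples if char_class.match(s)]
group_hits = [s for s in samples if group.match(s)]

print(class_hits)
print(group_hits)
```

The character class only accepts the single-letter prefix, while the group accepts the two full words.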


5 Comments

Actually the problem is a bit more complicated. My search patterns are like 1. foo.trailingString 2. bar.trailingString. I tried to do [foo|bar].trailingString, but that fails.
@Neo: that changes the question, doesn't it? Try (foo|bar).trailingString (although I'm not 100% sure of Python's regex syntax)...
@ircmaxell: Python has PCRE-like syntax with just a few little differences I think.
Neo - just use foo. and bar. in your regex (with the dot escaped). Check my answer as well.
@ircmaxell: You need to escape the .
10

You're right in using | but you're using a character class [] instead of a subpattern (). Try this regex:

r = re.compile(r'(?:foo|bar)\.trailingString')

if r.match(mystring):
    # Do stuff
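
The (?:...) form is a non-capturing group: it groups the alternatives without storing what matched. A minimal sketch of the difference from a plain (...) group, using a made-up input string:

```python
import re

capturing = re.compile(r'(foo|bar)\.trailingString')
non_capturing = re.compile(r'(?:foo|bar)\.trailingString')

text = 'bar.trailingString'  # hypothetical input

m1 = capturing.match(text)
m2 = non_capturing.match(text)

# Both patterns match the same text overall...
whole1, whole2 = m1.group(0), m2.group(0)
# ...but only the capturing version records which alternative was used.
captured = m1.groups()
uncaptured = m2.groups()
```

If you need to know which prefix matched, the capturing form is the one to use; if you only need a yes/no answer, either works.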

Old answer

If you want to do exact substring matches you shouldn't use regex.

Try using in instead:

words = ['foo', 'bar']

# mystring contains at least one of the words
if any(i in mystring for i in words):
    # Do stuff

1 Comment

Please have a look at the edit. All the search patterns have some common trailing parts. So I was hoping to use Re somehow.
2

Do you want to search for patterns or strings? The best solution for each is very different:

# strings
patterns = ['foo', 'bar', 'baz']
matches = set(patterns)

if mystring in matches:     # O(1) - very fast
    # do whatever


# patterns
import re
patterns = ['foo', 'bar']
matches = [re.compile(pat) for pat in patterns]

if any(m.match(mystring) for m in matches):    # O(n)
    # do whatever

Edit: OK, you want to match variable-length exact strings at the beginning of the string being searched; try:

from collections import defaultdict
matches = defaultdict(set)

patterns = ['foo', 'barr', 'bazzz']
for p in patterns:
    matches[len(p)].add(p)

for strlen, pats in matches.items():
    if mystring[:strlen] in pats:
        # do whatever
        break
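
For the prefix check specifically, a simpler alternative to grouping by length is str.startswith, which accepts a tuple of candidate prefixes. This is a different technique from the loop above, sketched here with a made-up mystring:

```python
patterns = ['foo', 'barr', 'bazzz']
mystring = 'barr.trailingString'  # hypothetical input

# str.startswith accepts a tuple and returns True if any prefix matches.
has_prefix = mystring.startswith(tuple(patterns))

# To find out which prefix matched, scan the candidates and take the first hit.
matched = next((p for p in patterns if mystring.startswith(p)), None)

# A string with none of the prefixes is rejected.
no_prefix = 'qux.trailingString'.startswith(tuple(patterns))
```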


2

Use '|' in your regex; it stands for 'OR'. There is a better way too, if you want to re.escape your strings:

pat = re.compile('|'.join(map(re.escape, ['foo.trailingString','bar.trailingString','something.else'])))
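
Since the question says the patterns share a common trailing part, the same idea can be applied to the prefixes only. This is a sketch under that assumption; the names prefixes and trailing are mine, not from the question.

```python
import re

prefixes = ['foo', 'bar']       # the exact strings to match (assumed list)
trailing = '.trailingString'    # common trailing part from the question

# Escape each literal so '.' means a literal dot, then group the alternatives.
alternatives = '|'.join(re.escape(p) for p in prefixes)
pat = re.compile('(?:' + alternatives + ')' + re.escape(trailing))

accepted = pat.match('foo.trailingString') is not None
# With escaping, a non-dot character in place of '.' is rejected.
rejected = pat.match('fooXtrailingString') is None
# Without escaping, '.' matches any character, so the same input slips through.
loose = re.match('foo.trailingString', 'fooXtrailingString') is not None
```

The grouping matters here: without the (?:...) around the alternatives, the trailing part would attach only to the last alternative.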


0

Perhaps:

any([re.match(r, mystring) for r in ['bar', 'foo']])

I'm assuming your match patterns will be more complex than foo or bar; if they aren't, just use

if mystring in ['bar', 'foo']:
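
One caveat on "exactly": re.match only anchors at the start of the string, so extra text after the pattern is still accepted. If the whole string must match, re.fullmatch (available since Python 3.4) is one option. A small sketch with made-up inputs:

```python
import re

pat = re.compile(r'(?:foo|bar)\.trailingString')

exact = 'foo.trailingString'
longer = 'foo.trailingStringAndMore'  # hypothetical input with extra text

prefix_ok = pat.match(longer) is not None     # match() anchors only at the start
full_ok = pat.fullmatch(longer) is not None   # fullmatch() requires the whole string
exact_ok = pat.fullmatch(exact) is not None
```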

