What is the difference between re.search and re.match?

Question

What is the difference between the search() and match() functions in the Python re module?

I've read the Python 2 documentation (Python 3 documentation), but I never seem to remember it.

The way I remember it is that "search" evokes the image in my mind of an explorer with binoculars searching off into the distance, just like search will search to the end of the string off in the distance.
– Andy Lester, Commented Nov 13, 2022 at 17:56

Vin · Accepted Answer · 2017-05-16 16:30:34Z

686

re.match is anchored at the beginning of the string. That has nothing to do with newlines, so it is not the same as using ^ in the pattern.

As the re.match documentation says:

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.

Note: If you want to locate a match anywhere in string, use search() instead.

re.search searches the entire string, as the documentation says:

Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
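
Both quoted passages distinguish between returning None and returning a zero-length match. A minimal sketch of that distinction, using the made-up patterns 'x*' and 'x+' (they are illustrative assumptions, not taken from the documentation):

```python
import re

# 'x*' may match zero characters, so both calls succeed with an empty match.
empty_match = re.match('x*', 'abc')
empty_search = re.search('x*', 'abc')
print(empty_match.group() == '')   # True: the matched text is the empty string
print(empty_match.span())          # (0, 0)

# 'x+' needs at least one 'x', so there is no match at all.
no_match = re.match('x+', 'abc')
no_search = re.search('x+', 'abc')
print(no_match, no_search)         # None None

# A match object is always truthy, even when it matched zero characters,
# so "if re.match(...):" treats a zero-length match as success.
print(bool(empty_match))           # True
```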

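So if you need to match at the beginning of the string, or to match the entire string (with a trailing $ or \Z in the pattern, or with re.fullmatch in Python 3.4+), use match. It is faster for that job because no later starting positions are tried. Otherwise use search.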

The documentation has a specific section for match vs. search that also covers multiline strings:

Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).

Note that match may differ from search even when using a regular expression beginning with '^': '^' matches only at the start of the string, or in MULTILINE mode also immediately following a newline. The “match” operation succeeds only if the pattern matches at the start of the string regardless of mode, or at the starting position given by the optional pos argument regardless of whether a newline precedes it.

Now, enough talk. Time to see some example code:

# example code (Python 3):
import re

string_with_newlines = """something
someotherthing"""

print(re.match('some', string_with_newlines))  # matches
print(re.match('someother',
               string_with_newlines))  # won't match
print(re.match('^someother', string_with_newlines,
               re.MULTILINE))  # also won't match
print(re.search('someother',
                string_with_newlines))  # finds something
print(re.search('^someother', string_with_newlines,
                re.MULTILINE))  # also finds something

m = re.compile('thing$', re.MULTILINE)

print(m.match(string_with_newlines))  # no match
print(m.match(string_with_newlines, pos=4))  # matches
# The flags are already part of the compiled pattern. The second positional
# argument of a compiled pattern's search() is pos, not flags.
print(m.search(string_with_newlines))  # also matches

edited May 16, 2017 at 16:30

Vin

answered Oct 8, 2008 at 0:53

nosklo

17 Comments

Alby Over a year ago

Why would anyone use limited match rather than more general search then? is it for speed?

Ivan Bilan Over a year ago

@Alby match is much faster than search, so instead of doing regex.search("word") you can do regex.match((.*?)word(.*?)) and gain tons of performance if you are working with millions of samples.

Sammaron Over a year ago

Well, that's goofy. Why call it match? Is it a clever maneuver to seed the API's with unintuitive names to force me to read the documentation? I still won't do it! Rebel!

baptx Over a year ago

@ivan_bilan match looks a bit faster than search when using the same regular expression but your example seems wrong according to a performance test: stackoverflow.com/questions/180986/…

Zitao Wang Over a year ago

When using a regular expression beginning with '^', and with MULTILINE unspecified, is match the same as search (produce the same result)?

Ray Toal · Accepted Answer · 2013-01-19 02:49:49Z

146

search ⇒ find something anywhere in the string and return a match object.

match ⇒ find something at the beginning of the string and return a match object.
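
A small sketch of those two rules, using a made-up sample string (the string and variable names are assumptions for illustration):

```python
import re

text = "say hello"

found_anywhere = re.search("hello", text)   # 'hello' occurs later in the string
found_at_start = re.match("hello", text)    # 'hello' is not at the beginning
found_say = re.match("say", text)           # 'say' is at the beginning

print(found_anywhere.span())  # (4, 9)
print(found_at_start)         # None
print(found_say.span())       # (0, 3)
```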

edited Jan 19, 2013 at 2:49

Ray Toal

answered Dec 31, 2011 at 12:05

Dhanasekaran Anbalagan

Jeyekomon · Accepted Answer · 2022-02-23 06:02:17Z

110

match is much faster than search, so instead of doing regex.search("word") you can do regex.match((.*?)word(.*?)) and gain tons of performance if you are working with millions of samples.

This comment from @ivan_bilan under the accepted answer above got me thinking if such hack is actually speeding anything up, so let's find out how many tons of performance you will really gain.

I prepared the following test suite:

import random
import re
import string
import time

LENGTH = 10
LIST_SIZE = 1000000

def generate_word():
    word = [random.choice(string.ascii_lowercase) for _ in range(LENGTH)]
    word = ''.join(word)
    return word

wordlist = [generate_word() for _ in range(LIST_SIZE)]

start = time.time()
[re.search('python', word) for word in wordlist]
print('search:', time.time() - start)

start = time.time()
[re.match('(.*?)python(.*?)', word) for word in wordlist]
print('match:', time.time() - start)

I made 10 measurements (1M, 2M, ..., 10M words) which gave me the following plot:

As you can see, searching for the pattern 'python' is faster than matching the pattern '(.*?)python(.*?)'.

Python is smart. Avoid trying to be smarter.
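
For the timing comparison to be fair, both calls have to select the same words. A rough sketch that checks this on a small sample; the fixed seed, the sample size, and the trick of planting 'python' into every tenth word are assumptions added here so that there are some hits, and are not part of the benchmark above:

```python
import random
import re
import string

random.seed(0)  # assumed fixed seed, only so the run is repeatable

words = []
for i in range(1000):
    word = ''.join(random.choice(string.ascii_lowercase) for _ in range(10))
    if i % 10 == 0:
        # Overwrite 6 of the 10 characters with 'python' at a random offset.
        cut = random.randrange(5)
        word = word[:cut] + 'python' + word[cut + 6:]
    words.append(word)

search_pat = re.compile('python')
match_pat = re.compile('(.*?)python(.*?)')

search_hits = [w for w in words if search_pat.search(w)]
match_hits = [w for w in words if match_pat.match(w)]

# Both approaches pick out exactly the same words; only the cost differs.
print(search_hits == match_hits)  # True
```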

edited Feb 23, 2022 at 6:02

answered Apr 7, 2018 at 19:03

Jeyekomon

10 Comments

Robert Dodier Over a year ago

+1 for actually investigating the assumptions behind a statement meant to be taken at face value -- thanks.

baptx Over a year ago

Indeed the comment of @ivan_bilan looks wrong but the match function is still faster than the search function if you compare the same regular expression. You can check in your script by comparing re.search('^python', word) to re.match('python', word) (or re.match('^python', word) which is the same but easier to understand if you don't read the documentation and seems not to affect the performance)

Jeyekomon Over a year ago

@baptx I disagree with the statement that the match function is generally faster. The match is faster when you want to search at the beginning of the string, the search is faster when you want to search throughout the string. Which corresponds with the common sense. That's why @ivan_bilan was wrong - he used match to search throughout the string. That's why you are right - you used match to search at the beginning of the string. If you disagree with me, try to find regex for match that is faster than re.search('python', word) and does the same job.

Jeyekomon Over a year ago

@baptx Also, as a footnote, the re.match('python') is marginally faster than re.match('^python'). It has to be.

baptx Over a year ago

@Jeyekomon yes that's what I meant, match function is a bit faster if you want to search at the beginning of a string (compared to using search function to find a word at the beginning of a string with re.search('^python', word) for example). But I find this weird, if you tell the search function to search at the beginning of a string, it should be as fast as the match function.

tzot · Accepted Answer · 2008-10-22 13:48:08Z

59

re.search searches for the pattern throughout the string, whereas re.match does not search for the pattern at all: it can only attempt to match it at the start of the string.

edited Oct 22, 2008 at 13:48

tzot

answered Oct 8, 2008 at 1:07

xilun

1 Comment

Smit Johnth Over a year ago

Why match at start, but not till end of string (fullmatch in Python 3.4)?

CODE-REaD · Accepted Answer · 2016-05-21 13:28:47Z

42

The difference is, re.match() misleads anyone accustomed to Perl, grep, or sed regular expression matching, and re.search() does not. :-)

More soberly, as John D. Cook remarks, re.match() "behaves as if every pattern has ^ prepended." In other words, re.match('pattern', s) is equivalent to re.search('^pattern', s) when the MULTILINE flag is not used. So it anchors a pattern's left side. But it doesn't anchor a pattern's right side: that still requires a terminating $.

Frankly given the above, I think re.match() should be deprecated. I would be interested to know reasons it should be retained.
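
A short sketch of the one-sided anchoring described above, with a made-up sample string (the string and variable names are assumptions for illustration):

```python
import re

s = "abcdef"

left_only = re.match("abc", s)         # succeeds: only the left side is anchored
with_dollar = re.match("abc$", s)      # None: '$' requires the end of the string
full_short = re.fullmatch("abc", s)    # None: fullmatch (Python 3.4+) anchors both ends
full_whole = re.fullmatch("abcdef", s) # succeeds: the pattern covers the whole string

print(left_only.group())   # abc
print(with_dollar)         # None
print(full_short)          # None
print(full_whole.group())  # abcdef
```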

answered May 21, 2016 at 13:28

CODE-REaD

1 Comment

JoelFan Over a year ago

"behaves as if every pattern has ^ prepended." is only true if you don't use the multiline option. The correct statement is "... has \A prepended"

Community · Accepted Answer · 2019-08-19 14:11:36Z

40

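You can refer to the example below to understand how re.match and re.search work:

import re

a = "123abc"
t_match = re.match("[a-z]+", a)    # None: the string starts with digits
t_search = re.search("[a-z]+", a)  # match object for 'abc'

re.match will return None, but re.search will return a match object for abc.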
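
To get at the matched text itself, call .group() on the match object, after checking that it is not None. A minimal sketch with the same input string (the variable names are assumptions for illustration):

```python
import re

a = "123abc"
at_start = re.match("[a-z]+", a)
anywhere = re.search("[a-z]+", a)

print(at_start)              # None: the string starts with digits

# Guard against None before using the match object.
if anywhere is not None:
    print(anywhere.group())  # abc
    print(anywhere.span())   # (3, 6)
```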

edited Aug 19, 2019 at 14:11

CommunityBot

answered Jul 30, 2015 at 5:27

ldR

1 Comment

SanD Over a year ago

Would just like to add that search will return _sre.SRE_Match object (or None if not found). To get 'abc', you need to call t.group()

Cabbage soup · Accepted Answer · 2020-05-14 09:19:27Z

23

Much shorter:

search scans through the whole string.
match scans only the beginning of the string.

The following example shows it:

>>> import re
>>> a = "123abc"
>>> print(re.match("[a-z]+", a))
None
>>> re.search("[a-z]+", a).group()
'abc'

edited May 14, 2020 at 9:19

Cabbage soup

answered Oct 31, 2018 at 0:22

U13-Forward

1 Comment

noidentity63 Over a year ago

Even with most of the examples posted here, I am having a hard time seeing the description 'beginning of the string' as an accurate statement. I don't know, it just seems arbitrary. How do I know where the beginning of the string 'ends'?? Is it via a newline? because based from the example here, 'beginning' simply means the very first character '1'.

cschol · Accepted Answer · 2008-10-08 00:54:57Z

20

re.match attempts to match a pattern at the beginning of the string. re.search attempts to match the pattern throughout the string until it finds a match.
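
Here "the beginning of the string" means index 0, not the start of a line. A compiled pattern's match() also accepts an optional pos argument that moves that starting index. A minimal sketch, using a made-up two-line string (the string and variable names are assumptions for illustration):

```python
import re

text = "first line\nsecond line"
pattern = re.compile("second")

# 'second' starts a new line, but not the string, so match() fails,
# with or without the MULTILINE flag.
at_zero = pattern.match(text)
at_zero_multiline = re.match("second", text, re.MULTILINE)
print(at_zero, at_zero_multiline)   # None None

# With pos, the match attempt starts at that index instead of 0.
start = text.index("second")        # 11
at_pos = pattern.match(text, start)
print(at_pos.span())                # (11, 17)

# search() tries every position until one matches.
found = pattern.search(text)
print(found.span())                 # (11, 17)
```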

answered Oct 8, 2008 at 0:54

cschol

Pall Arpad · Accepted Answer · 2022-06-20 07:32:02Z

1

Quick answer

import re

re.search('test', ' test')      # returns a truthy match object (the search may start at any index)

re.match('test', ' test')       # returns None (the match must start at index 0)
re.match('test', 'test')        # returns a truthy match object (match at index 0)

answered Jun 20, 2022 at 7:32

Pall Arpad

cottontail · Accepted Answer · 2023-06-05 06:38:08Z

re.match is anchored at the beginning of a string, while re.search scans through the entire string. So in the following example, x and y match the same thing.

x = re.match('pat', s)       # <--- already anchored at the beginning of string
y = re.search(r'\Apat', s)   # <--- match at the beginning (raw string, so \A is not treated as a string escape)

If a string doesn't contain line breaks, \A and ^ are essentially the same; the difference shows up in multiline strings. In the following example, re.match will never match the second line, while re.search can with the correct regex (and flag).

s = "1\n2"
re.match('2', s, re.M)       # no match
re.search('^2', s, re.M)     # match
re.search(r'\A2', s, re.M)   # no match  <--- mimics `re.match`

There's another function in re, re.fullmatch() that scans the entire string, so it is anchored both at the beginning and the end of a string. So in the following example, x, y and z match the same thing.

x = re.match(r'pat\Z', s)     # <--- already anchored at the beginning; must match end
y = re.search(r'\Apat\Z', s)  # <--- match at the beginning and end of string
z = re.fullmatch('pat', s)    # <--- already anchored at the beginning and end

Based on Jeyekomon's answer (and using their setup), I used the perfplot library to plot the results of timeit tests that look into the following questions:

how do they compare if re.search "mimics" re.match? (first plot)
how do they compare if re.match "mimics" re.search? (second plot)
how do they compare if the same pattern is passed to them? (last plot)

Note that the last pattern doesn't produce the same output (because re.match is anchored at the beginning of a string.)

The first plot shows match is faster if search is used like match. The second plot supports @Jeyekomon's answer and shows search is faster if match is used like search. The last plot shows there's very little difference between the two if they scan for the same pattern.

Code used to produce the performance plot.

import re
from random import choices
from string import ascii_lowercase
import matplotlib.pyplot as plt
import perfplot

patterns = [
    [re.compile(r'\Aword'), re.compile(r'word')],
    [re.compile(r'word'), re.compile(r'(.*?)word')],
    [re.compile(r'word')]*2
]

fig, axs = plt.subplots(1, 3, figsize=(20,6), facecolor='white')
for i, (pat1, pat2) in enumerate(patterns):
    plt.sca(axs[i])
    perfplot.plot(
        setup=lambda n: [''.join(choices(ascii_lowercase, k=10)) for _ in range(n)],
        kernels=[lambda lst: [*map(pat1.search, lst)], lambda lst: [*map(pat2.match, lst)]],
        labels= [f"re.search(r'{pat1.pattern}', w)", f"re.match(r'{pat2.pattern}', w)"],
        n_range=[2**k for k in range(24)],
        xlabel='Length of list',
        equality_check=None
    )
fig.suptitle('re.match vs re.search')
fig.tight_layout()
