MongoDB, performance of query by regular expression on indexed fields

Question

I want to find an account by name (in a MongoDB collection of 50K accounts).

The usual way is to query with a string:

db.accounts.find({ name: 'Jon Skeet' })  // an index on name speeds this up

What about a regular expression? Is it an expensive operation?

db.accounts.find({ name: /Jon Skeet/ })  // how does the index work with a regex?

Edit:

According to WiredPrairie, MongoDB uses the prefix of a regex to look up the index (e.g. /^prefix.*/):

db.accounts.find({ name: /^Jon Skeet/ })  // the index helps here

MongoDB $regex

@dirkk, I'd like to hear more experiences and explanations, and I want to share the question as well.
– damphat, Commented Jul 6, 2013 at 10:20
For regex to use an index, it must use an anchor as shown in the docs: docs.mongodb.org/manual/reference/operator/regex
– WiredPrairie, Commented Jul 6, 2013 at 11:19
There are many other very similar questions already answered on StackOverflow.
– WiredPrairie, Commented Jul 6, 2013 at 11:21
@WiredPrairie I want to focus on performance, not on how to write the query.
– damphat, Commented Jul 6, 2013 at 12:29

Manuel Jordan · Accepted Answer · 2021-06-11 16:37:29Z

63

Actually according to the documentation,

If an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.

http://docs.mongodb.org/manual/reference/operator/query/regex/#index-use

In other words:

For the /Jon Skeet/ regex, MongoDB scans every key in the index and then fetches the matching documents, which can be faster than a collection scan.

For the /^Jon Skeet/ regex, MongoDB scans only the range of index keys that start with that prefix, which is faster still.
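
To make the difference between the two cases concrete, here is a toy model in plain JavaScript. It is only a sketch under the assumption that an index behaves like a sorted list of keys; it is not how MongoDB is implemented, and the function names (scanAllKeys, lowerBound, scanPrefixRange) and the sample names are made up for illustration.

```javascript
// Toy model of a single-field index: the indexed values kept in sorted order.
// NOT MongoDB's implementation, only an illustration of the two access patterns.

// Unanchored regex such as /Jon Skeet/: every key in the index has to be tested.
function scanAllKeys(sortedKeys, regex) {
  let keysExamined = 0;
  const matches = [];
  for (const key of sortedKeys) {
    keysExamined++;
    if (regex.test(key)) matches.push(key);
  }
  return { matches, keysExamined };
}

// Position of the first key that is >= target (binary search on the sorted keys).
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Prefix regex such as /^Jon Skeet/: only keys in [prefix, upper) can match,
// where upper is the prefix with its last character incremented.
// (Assumes a non-empty prefix; edge cases are ignored in this sketch.)
function scanPrefixRange(sortedKeys, prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  const start = lowerBound(sortedKeys, prefix);
  const end = lowerBound(sortedKeys, upper);
  return { matches: sortedKeys.slice(start, end), keysExamined: end - start };
}

// Hypothetical sample data.
const keys = [
  "Ada Lovelace", "Jon Skeet", "Jon Skeet Jr",
  "Jon Snow", "Mr Jon Skeet", "Zed Shaw",
].sort();

const full = scanAllKeys(keys, /Jon Skeet/);
const ranged = scanPrefixRange(keys, "Jon Skeet");
console.log(full.keysExamined, full.matches);
console.log(ranged.keysExamined, ranged.matches);
```

In this model the unanchored pattern touches all six keys, while the prefix pattern touches only the keys inside the range. The two patterns also return different results: "Mr Jon Skeet" matches /Jon Skeet/ but not /^Jon Skeet/.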

edited Jun 11, 2021 at 16:37 by Manuel Jordan

answered Oct 19, 2015 at 16:26 by m_elsayed


2 Comments

chovy Over a year ago

Regex works fine if there is an immediate match (e.g. matching the letter a), but matching a full word (e.g. angular) takes much longer. This is across 6M documents; is there any way to speed these queries up? They take anywhere from 19 to 30 seconds for 8+ characters but come back immediately with 1-2 characters.

heroin Over a year ago

@chovy, I believe MongoDB is not the best tool for searching for string occurrences in the middle of text. I suggest looking at ElasticSearch or another full-text search engine.

Sebastian · Accepted Answer · 2021-09-01 20:50:36Z

16

In case anyone still has an issue with search performance, there is a way to optimize regex search even if it searches for a word in a sentence (not necessarily at the beginning ^ or the end $ of the string).

The field should have a text index

db.someCollection.createIndex({ someField: "text" })

and queries on it should apply the regex only after performing a plain text search first:

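db.someCollection.find({ $and: 
  [
    { $text: { $search: "someWord" }}, 
    // One clause per pattern: repeating the $regex key in a single object keeps only the last one.
    // Assumes someField holds a string, so $elemMatch is not needed; "g" is not a $regex option.
    { someField: { $regex: /test/i }},
    { someField: { $regex: /other/i }}
  ]
})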

This ensures that the regex will run only for the results of the initial, plain search, which should be quite fast thanks to the index on this field. It might have a huge impact on search performance, depending on how large the collection is.
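
The two-stage idea can be sketched in plain JavaScript. This is a toy model, not MongoDB's text index: it assumes the index is simply a map from lowercased whole words to documents and ignores stemming and stop words. The names buildWordIndex and searchThenRegex and the sample documents are hypothetical.

```javascript
// Toy stand-in for a text index: maps each lowercased whole word to the
// documents that contain it. MongoDB's real text index also applies stemming
// and stop words; none of that is modelled here.
function buildWordIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    const words = doc.someField.toLowerCase().split(/\W+/).filter(Boolean);
    for (const word of words) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(doc);
    }
  }
  return index;
}

// Stage 1: a whole-word lookup narrows the candidates.
// Stage 2: the regex runs only on those candidates.
function searchThenRegex(index, word, regex) {
  const candidates = [...(index.get(word.toLowerCase()) || [])];
  const matches = candidates.filter((doc) => regex.test(doc.someField));
  return { matches, regexEvaluations: candidates.length };
}

// Hypothetical sample data.
const docs = [
  { _id: 1, someField: "someWord appears in this test sentence" },
  { _id: 2, someField: "someWord appears here as well" },
  { _id: 3, someField: "an unrelated test sentence" },
  { _id: 4, someField: "nothing to see" },
];
const index = buildWordIndex(docs);

// The regex is evaluated on the two candidates only, not on all four documents.
const hit = searchThenRegex(index, "someWord", /test/i);
console.log(hit.regexEvaluations, hit.matches.map((d) => d._id));

// A partial word is not a key in the word index, so stage 1 yields no candidates.
const partial = searchThenRegex(index, "some", /test/i);
console.log(partial.matches.length);
```

The last call illustrates the limitation of this approach: the first stage matches whole words, so a fragment such as "some" finds nothing even though the regex alone would have matched.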

edited Sep 1, 2021 at 20:50

answered Sep 1, 2021 at 14:15 by Sebastian

4 Comments

Revol89 Over a year ago

Thanks for the input. Still, I have to handle two search criteria. The whole word and then a part of the word.

FINDarkside Over a year ago

This doesn't really work if you're not searching for full words. "some" will return nothing if you search by text index.

Alex Totolici Over a year ago

any updates on this?

Waleed Ahmad Over a year ago

for anyone unable to understand logic behind it: medium.com/statuscode/…
