
I have a document with the following schema:

{
  description : String,
  tags : [String]
}

I have indexed both fields as text, but a search for a specific string in the array returns the document only if that string is the first element of the array. It therefore seems that the text index only covers the first element. Is this how MongoDB inherently works, or is there an option that must be passed to the index?

Example document

{
   description : 'random description',
   tags : ["hello", "there"]
}

The object that created the index

{description : 'text', tags : 'text'}

The query

db.myCollection.find({$text : {$search : 'hello'}});

returns a document but

db.myCollection.find({$text : {$search : 'there'}});

does not return anything.

I am using MongoDB version 2.6.11.

I have other indexes but these are the only text indexes. Here is the corresponding output of db.myCollection.getIndexes()

{
                "v" : 1,
                "key" : {
                        "_fts" : "text",
                        "_ftsx" : 1
                },
                "name" : "description_text_tags_text",
                "ns" : "myDB.myCollection",
                "weights" : {
                        "description" : 1,
                        "tags" : 1
                },
                "default_language" : "english",
                "language_override" : "language",
                "textIndexVersion" : 2
        },
  • Do you have an example of a document and query? What version of MongoDB? Commented Dec 28, 2015 at 2:33
  • Added additional details. Commented Dec 28, 2015 at 2:38

1 Answer

This has nothing to do with the string being the first or second element of the array. The word "there" is in the stop-word list for the "english" language, so it is never added to the index. Text indexing stems the text and removes stop words before the terms are added to the text index, and both steps are language dependent.
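To make the effect concrete, here is a minimal sketch of stop-word removal in plain JavaScript. It is not MongoDB's actual implementation: `ENGLISH_STOP_WORDS` is a tiny made-up sample of the real list, `indexedTerms` is a hypothetical helper, and stemming is left out entirely.

```javascript
// Tiny illustrative sample; MongoDB's real English stop-word list is much longer.
const ENGLISH_STOP_WORDS = new Set(["a", "an", "and", "is", "of", "the", "there"]);

// Hypothetical helper: returns the terms that would end up in the index
// for one document. Stemming is deliberately omitted in this sketch.
function indexedTerms(doc, language) {
  const text = [doc.description].concat(doc.tags).join(" ");
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  if (language === "none") {
    return tokens; // simple tokenization, no stop-word filtering
  }
  return tokens.filter((term) => !ENGLISH_STOP_WORDS.has(term));
}

const exampleDoc = { description: "random description", tags: ["hello", "there"] };
console.log(indexedTerms(exampleDoc, "english")); // "there" is filtered out
console.log(indexedTerms(exampleDoc, "none"));    // "there" is kept
```

With "english", the term "there" never reaches the index, so no query can match it, regardless of its position in the array.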

You could create the text index as:

db.myCollection.ensureIndex({description : 'text', tags : 'text'}, { default_language: "none" }) 

If "none" is used as the default language, the text indexing process does simple tokenization and does not use any stop-word list. By default, "english" is used as the "default_language" for the text index.
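Since a collection can have only one text index, the existing one has to be dropped before the new one is created. The sketch below wraps the steps in a function so they can be read in order. The function name `rebuildTextIndexWithoutStopWords` is made up for illustration; the index name is taken from the getIndexes() output in the question, and `ensureIndex` is used because the question is about version 2.6.

```javascript
// Sketch for the mongo shell; call it as rebuildTextIndexWithoutStopWords(db).
function rebuildTextIndexWithoutStopWords(db) {
  // Drop the existing text index by its name (as reported by getIndexes()).
  db.myCollection.dropIndex("description_text_tags_text");

  // Recreate it without language-specific stemming or stop words.
  db.myCollection.ensureIndex(
    { description: "text", tags: "text" },
    { default_language: "none" }
  );

  // The query from the question that previously returned nothing.
  return db.myCollection.find({ $text: { $search: "there" } });
}
```

After the rebuild, the search for 'there' should match the example document, because the word is now indexed like any other term.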


2 Comments

Note that you must drop the existing index before recreating it. You can also specify the language for a query with the $language property of $text.
Great catch. I was using those terms as placeholders, and I doubt the content within the final app will ever do so, but I sure learned something. Thanks!
