I have a field with the following mapping:

{
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}

One of the documents has the value "abcdef" in this field. What kind of ES query should I use to match this document when searching for "def"?

I have tried match and prefix queries. ES version: 5.1.1

  • Could you post the search requests you've tried, please. Commented Jul 18, 2019 at 12:43

1 Answer

You can create a custom analyzer that uses an n-gram tokenizer and apply it to the field on which you want substring search. Wildcard queries are quite costly, and I guess that is why you don't want to use them, as mentioned in your duplicate SO question.

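Here are my index settings and mapping for your requirement. The mapping is written without a mapping type (the Elasticsearch 7.x form); on 5.x, including your 5.1.1, "properties" has to be nested under a type name.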

{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "my_tokenizer"
                }
            },
            "tokenizer": {
                "my_tokenizer": {
                    "type": "ngram",
                    "min_gram": 3,
                    "max_gram": 3,
                    "token_chars": [
                        "letter",
                        "digit"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "foo": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    }
                },
                "analyzer": "my_analyzer"
            }
        }
    }
}

I created a field called foo and applied my custom n-gram analyzer to it, so for the value abcdef it produces the tokens below.

{
    "tokens": [
        {
            "token": "abc",
            "start_offset": 0,
            "end_offset": 3,
            "type": "word",
            "position": 0
        },
        {
            "token": "bcd",
            "start_offset": 1,
            "end_offset": 4,
            "type": "word",
            "position": 1
        },
        {
            "token": "cde",
            "start_offset": 2,
            "end_offset": 5,
            "type": "word",
            "position": 2
        },
        {
            "token": "def",
            "start_offset": 3,
            "end_offset": 6,
            "type": "word",
            "position": 3
        }
    ]
}
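
To make the token list above easier to reason about, here is a minimal Python sketch that approximates what a fixed-size n-gram tokenizer limited to letters and digits does. It is not Elasticsearch's implementation; the function name ngram_tokens and the regular expression used to find letter/digit runs are assumptions made for illustration.

```python
import re


def ngram_tokens(text, min_gram=3, max_gram=3):
    """Approximate an n-gram tokenizer restricted to letters and digits."""
    tokens = []
    # Split on anything that is not a letter or digit (assumed to mirror
    # "token_chars": ["letter", "digit"]); grams never span a separator.
    for match in re.finditer(r"[^\W_]+", text):
        run, base = match.group(), match.start()
        for start in range(len(run)):
            for size in range(min_gram, max_gram + 1):
                end = start + size
                if end > len(run):
                    break  # not enough characters left for this gram size
                tokens.append({
                    "token": run[start:end],
                    "start_offset": base + start,
                    "end_offset": base + end,
                })
    return tokens


grams = [t["token"] for t in ngram_tokens("abcdef")]
print(grams)  # ['abc', 'bcd', 'cde', 'def']
```

With min_gram and max_gram both set to 3, every indexed token is exactly three characters long, which is why "def" ends up in the index as a token of its own.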

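The search query below then returns the document containing abcdef. A term query is not analyzed, so it looks up "def" as-is, and that is exactly one of the indexed trigrams.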

{
    "query": {
        "term" : {
            "foo" : "def"
        }
    }
}
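
The term query above works because the search text is exactly one trigram. For longer search text, one possible approach is to let a match query analyze the input into trigrams and require all of them with "operator": "and". The Python sketch below only builds the request bodies; the helper name substring_query is hypothetical, and it assumes the search text contains only letters and digits in the same case as the indexed value.

```python
def substring_query(field, text, gram=3):
    """Build a query body for a field indexed with fixed-size n-grams."""
    if len(text) < gram:
        # No indexed token is shorter than `gram`, so nothing can match.
        raise ValueError(f"search text must have at least {gram} characters")
    if len(text) == gram:
        # Exactly one gram: an unanalyzed term lookup is enough.
        return {"query": {"term": {field: text}}}
    # Longer text: the field's analyzer splits it into grams at search
    # time, and "operator": "and" requires every gram to be present.
    return {"query": {"match": {field: {"query": text, "operator": "and"}}}}


print(substring_query("foo", "def"))
print(substring_query("foo", "cdef"))
```

Requiring every trigram is still an approximation: a document can contain all the trigrams of the search text without containing them contiguously, so some false positives are possible.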

EDIT: Here is my Postman collection link if you want to check all the API calls. Just replace the port and index with your own ES port and index.

3 Comments

OK, so in the n-gram analyzer we can configure the length of the tokens that are built? This is very good info.
@User3518958, yes, of course. You can see the "min_gram": 3 and "max_gram": 3 attributes in my settings, and there is more information in the link I provided for n-gram.
@User3518958, can you please let me know whether this answered your question?
