
So I have a field that stores a value in the format number/year, for example 23/2014, 24/2014, 12/2015, etc.

If this field is mapped as not_analyzed, I can do exact-value searches with a term filter. Searching for a value in that exact structure (something like 1/2014 or 15/2014) works, like the SQL equals (=).

{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "processNumber": "11/2014"
        }
      }
    }
  }
}

So searching with something different, like 11/ or /2014, wouldn't return hits. This is fine.

But if I define the field as not_analyzed, I can't do SQL LIKE-style searches with the match_phrase query.

{
  "query": {
    "match_phrase": {
      "processNumber": "11/201"
    }
  }
}

In this case, searching for 11, 11/, /2014 or 2014 should return hits, but it doesn't. The thing is, this query works if the field is not mapped as not_analyzed. So it seems I have to use one or the other. The problem is that the field should support both options for different queries. Am I missing something here?

1 Answer


You can analyze the same field, processNumber, in different ways by using the fields property in the mapping.

For example, if you want both an analyzed and an unanalyzed version of processNumber, the mapping would be:

 {
   "type_name": {
      "properties": {
         "processNumber": {
            "type": "string",
            "index": "not_analyzed",
            "fields": {
               "analyzed": {
                  "type": "string",
                  "index": "analyzed"
               }
            }
         }
      }
   }
}

The not_analyzed field is referred to in queries as processNumber.

To refer to the analyzed view of the field, use processNumber.analyzed.

The queries for terms such as 11/201, 11, etc. would be:

Example Filter:

 { "query" : { "filtered" : { "filter" : { "term" : { "processNumber" : "11/2014" } } } } }

The term filter does not analyze the search string, so the input is matched as-is against the field's inverted index. In this case, 11/2014 is matched against the unanalyzed field.

Example match_phrase_prefix:

{ "query": { "match_phrase_prefix": { "processNumber": "11/201" } } }

match_phrase_prefix checks whether the last term in the phrase is a prefix of terms in the index. It analyzes the search string if an analyzer is specified for the field. This is the reason you need to use the unanalyzed version of the field here. If we used processNumber.analyzed, search strings such as 11-201 or 11|201 would also match.

Example match:
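To make the difference concrete, here is a small Python sketch of the idea. It is not Elasticsearch code: `keyword_tokens`, `standard_like_tokens` and `phrase_prefix_match` are hypothetical helpers, and the "standard-like" tokenizer is only a rough approximation (lowercase, split on anything that is not an ASCII letter or digit). It ignores details of the real query such as the limit on prefix expansions.

```python
import re

def keyword_tokens(text):
    # Stand-in for a not_analyzed field: the whole value is one term.
    return [text]

def standard_like_tokens(text):
    # Rough stand-in for the standard analyzer (an approximation, not the
    # real implementation): lowercase, then split on non-alphanumerics.
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def phrase_prefix_match(stored, query, tokenize):
    # The same tokenizer is applied to the stored value and the query string.
    field = tokenize(stored)
    wanted = tokenize(query)
    if not wanted:
        return False
    n = len(wanted)
    for start in range(len(field) - n + 1):
        window = field[start:start + n]
        # All terms but the last must match exactly and in order;
        # the last query term only has to be a prefix.
        if window[:-1] == wanted[:-1] and window[-1].startswith(wanted[-1]):
            return True
    return False

print(phrase_prefix_match("11/2014", "11/201", keyword_tokens))        # True
print(phrase_prefix_match("11/2014", "11-201", keyword_tokens))        # False
print(phrase_prefix_match("11/2014", "11-201", standard_like_tokens))  # True
print(phrase_prefix_match("11/2014", "11|201", standard_like_tokens))  # True
```

With the single-term view, only a literal prefix of the stored value matches. With the tokenized view, the separator is thrown away, which is why 11-201 and 11|201 would match as well.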

  { "query": { "match": { "processNumber.analyzed": "11" } } }

This is a straightforward match, since the default analyzer (usually the standard analyzer) tokenizes 11/2014 into the terms 11 and 2014.

You can use the analyze API to see how a particular text gets analyzed by the default analyzer.
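As an illustration of why the two views of the field behave differently, the following Python sketch builds two toy inverted indexes over the sample values from the question. It is a simplified model, not how Elasticsearch is implemented: `build_index`, `lookup`, `whole_value` and `split_value` are hypothetical names, and `split_value` only approximates the standard analyzer.

```python
import re

def whole_value(text):
    # not_analyzed view: index the value as a single term.
    return [text]

def split_value(text):
    # Approximation of the standard analyzer: lowercase, then split on
    # anything that is not an ASCII letter or digit.
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def build_index(values, tokenize):
    # Map each term to the set of document ids that contain it.
    index = {}
    for doc_id, value in enumerate(values):
        for term in tokenize(value):
            index.setdefault(term, set()).add(doc_id)
    return index

def lookup(index, term):
    # Exact term lookup, with no analysis of the search string.
    return sorted(index.get(term, set()))

values = ["23/2014", "11/2014", "12/2015"]
exact_index = build_index(values, whole_value)      # like processNumber
analyzed_index = build_index(values, split_value)   # like processNumber.analyzed

print(lookup(exact_index, "11/2014"))   # [1]
print(lookup(exact_index, "11"))        # []
print(lookup(analyzed_index, "11"))     # [1]
print(lookup(analyzed_index, "2014"))   # [0, 1]
```

The exact index only knows the full strings, so partial values find nothing there, while the analyzed index knows the individual pieces but no longer knows that they were joined by a slash.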

curl -XPOST "http://<machine>/_analyze?text=11/2014"

5 Comments

Thanks. I'm currently using the Spring integration with Elasticsearch, and here's how the field is currently mapped: @Field(type = String,index = FieldIndex.not_analyzed) public String processNumber; I don't know if there's a way to set the fields property here.
OK, I think I got this to work: @MultiField( mainField =@Field(type=String , index =FieldIndex.not_analyzed), otherFields = @NestedField(dotSuffix = "analyzed" , type = String,index=FieldIndex.analyzed) ) But can you tell me the difference between all three match queries?
@Maxrunner edited the answer to give a brief explanation of the queries.
One last quick question: while using the term filter against a field that has the string "Master", searching for "master" works, but not "Master". Any reason?
Check your answer for my edit regarding the match_phrase_prefix.
