
My Elasticsearch (v5.4.1) documents have a _patents field like this:

{
    // (Other fields : title, text, date, etc.)
    ,
    "_patents": [
        {"cc": "US"},
        {"cc": "MX"},
        {"cc": "KR"},
        {"cc": "JP"},
        {"cc": "CN"},
        {"cc": "CA"},
        {"cc": "AU"},
        {"cc": "AR"}
    ]
}

I'm trying to build a query that returns only documents whose patents match every country code in an array. For instance, if my filter is ["US","AU"], I need all documents that have patents in both US and AU. Documents that have US but not AU should be excluded.

So far I have tried adding a "terms" clause to my current working query:

{
    "query": {
        "bool": {
            "must": [
                // (Other conditions here : title match, text match, date range, etc.) These work
                 ,
                {
                    "terms": {
                        "_patents.cc": [ // I tried just "_patents"
                            "US",
                            "AU"
                        ]
                    }
                }
            ]
        }
    }
}

Or this, as a filter:

{
    "query": {
        "bool": {
            "must": [...],
            "filter": {
                "terms": {
                    "_patents": [
                        "US",
                        "AU"
                    ]
                }
            }
        }
    }
}

These queries and the variants I've tried don't produce an error, but they return 0 results.

I can't change my ES document model to something easier to match, like "_patents": [ "US", "CA", "AU", "CN", "JP" ], because this is a populated field. At indexing time, I populate and reference Patent documents that have many fields, including cc.

3 Answers

13

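I found the solution. The country codes in the query have to be lowercase.

"US" returns no results, but "us" works, even though the stored value is "US". This is presumably because the field is mapped as analyzed text: the analyzer lowercases tokens at index time, while a term query is not analyzed and looks up the exact value it is given.

I also rewrote the query with one term clause per country code, so that both must match: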

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "_patents.cc": "us"
          }
        },
        {
          "term": {
            "_patents.cc": "ca"
          }
        }
      ]
    }
  }
}  
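
To make the lowercase behaviour less mysterious, here is a small toy model in Python. It is not Elasticsearch code; it only assumes that _patents.cc is an analyzed text field whose analyzer lowercases tokens, and the helper names (analyze, term_matches, match_matches) are made up for illustration.

```python
# Toy model, not Elasticsearch itself: it mimics how an analyzed text field
# stores lowercased tokens while a term query looks up its input verbatim.
# Assumption: the analyzer's only relevant effect on a two-letter code is lowercasing.

def analyze(value):
    """Rough stand-in for the analyzer on a short code: lowercase it."""
    return value.lower()

# Values as they appear in the document source.
source_values = ["US", "MX", "KR", "JP", "CN", "CA", "AU", "AR"]

# Tokens as they would land in the index for an analyzed field.
indexed_tokens = {analyze(v) for v in source_values}

def term_matches(query_value):
    """A term query is not analyzed: the value is compared as-is."""
    return query_value in indexed_tokens

def match_matches(query_value):
    """A match query analyzes its input first, so case no longer matters."""
    return analyze(query_value) in indexed_tokens

print(term_matches("US"))   # False
print(term_matches("us"))   # True
print(match_matches("US"))  # True
```

If the index uses the default dynamic mapping for strings, there may also be a _patents.cc.keyword sub-field that keeps the original case, but that depends on the mapping and is worth checking before relying on it.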

1 Comment

I couldn't figure out why querying against an array of ints was working fine but with an array of strings it returned 0 results. This seems to be true when using "term/terms" but not when using "query." I guess it makes sense to facilitate exact matches but why not transform the query then? I'm missing something, obviously.
8

This works with both uppercase and lowercase values, since a match query analyzes its input:

 {
  "query": {
    "bool": {
      "must": [ 
        {
          "match": {
            "_patents.cc": "au"
          }
        },
        {
          "match": {
            "_patents.cc": "us"
          }
        }
      ]
    }
  }
}

2 Comments

Cool, that's right, thanks :) I didn't know that "term" worked only with lowercase.
This worked for me, thanks. Do you know if there's a "cleaner" way of doing it where we don't have to repeat the "match" clause?
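
On avoiding the repeated "match" clause: one option is to generate the clauses from the list of codes on the client side, and another is a single match query with "operator": "and", which requires every term in the query text to be present. A minimal Python sketch of both follows; the function names and the default field path are assumptions for illustration, and it only builds the request body without calling Elasticsearch.

```python
import json

def all_codes_query(codes, field="_patents.cc"):
    """Build a bool query with one match clause per code (all must match)."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: code}} for code in codes]
            }
        }
    }

def all_codes_single_match(codes, field="_patents.cc"):
    """Same intent in one clause: "operator": "and" requires every term."""
    return {
        "query": {
            "match": {
                field: {"query": " ".join(codes), "operator": "and"}
            }
        }
    }

# Print the generated bodies so they can be pasted into a search request.
print(json.dumps(all_codes_query(["US", "AU"]), indent=2))
print(json.dumps(all_codes_single_match(["US", "AU"]), indent=2))
```

The single-clause form relies on the analyzer splitting the query text on whitespace, which should hold for simple two-letter codes but is worth verifying against your own mapping.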
6

I'm on Elasticsearch 6.0.1 and use this approach. Note that OR matches documents with either code; use AND to require both:

GET <your index>/_search
{
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "cc:us OR cc:ca"
        }
      }]
    }    
  }
}
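
For the original requirement (every code must be present), the query string can be generated from the list of codes. This is only a sketch: the function name is hypothetical, and the field path _patents.cc is taken from the question rather than from this answer's index.

```python
def query_string_all(codes, field="_patents.cc"):
    """Build a query_string body that requires every code via AND."""
    # Each code becomes "field:code"; AND means all of them must match.
    expression = " AND ".join(f"{field}:{code}" for code in codes)
    return {"query": {"query_string": {"query": expression}}}

body = query_string_all(["us", "ca"])
print(body["query"]["query_string"]["query"])
# _patents.cc:us AND _patents.cc:ca
```

Codes containing characters that are special in the query_string syntax would need escaping, which this sketch does not handle.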
