
I have this mapping in ES 7.9:

{
  "mappings": {
    "properties": {
      "cid": {
        "type": "keyword",
        "store": true
      },
      "id": {
        "type": "keyword",
        "store": true
      },
      "a": {
        "type": "nested",
        "properties": {
          "attribute":{
            "type": "keyword"      
          },  
          "key": {
            "type": "keyword"
          },
          "num": {
            "type": "float"
          }
        }
      }
    }
  }
}

And some documents are indexed like this:


{
    "cid": "177",
    "id": "1",
    "a": [
        {
            "attribute": "tags",
            "key": [
                "heel",
                "thong",
                "low_heel",
                "economic"
            ]
        },
        {
            "attribute": "weight",
            "num": 15
        }
    ]
}

Basically, an object can have multiple attributes (the "a" property is an array).

Those attributes can be different for each client. In this example, I have 2 types of attributes: tags and weight. However, other documents could have other attributes like vendor, size, power, etc., so the model has to be generic enough to support attributes that are not known in advance.

An attribute can be a list of keywords (like tags) or a numeric value (like weight).
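
To make the data model concrete, here is a minimal Python sketch of how a client's attributes could be turned into the "a" array shown above. The function name `to_nested_attributes` and the input dict are hypothetical, used only for illustration; the rule it assumes is the one described here: numeric values go into "num", keyword values go into "key".

```python
def to_nested_attributes(attrs):
    """Turn {"tags": [...], "weight": 15} into the list stored under "a".

    Hypothetical helper: numbers are stored in "num", everything else is
    treated as one keyword or a list of keywords and stored in "key".
    """
    nested = []
    for name, value in attrs.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number:
            nested.append({"attribute": name, "num": value})
        else:
            # Accept a single keyword or an iterable of keywords.
            keys = [value] if isinstance(value, str) else list(value)
            nested.append({"attribute": name, "key": keys})
    return nested


doc = {
    "cid": "177",
    "id": "1",
    "a": to_nested_attributes(
        {"tags": ["heel", "thong", "low_heel", "economic"], "weight": 15}
    ),
}
```

Each entry of `doc["a"]` becomes its own nested object, so an attribute name and its value always stay together.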

I need an ES query to fetch the document IDs matching this pseudo-query:

cid="177" and (tag="flat" or tag="heel") and tag="economic" and weight<20

I arrived at this query, which seems to work as expected:

{
    "_source": ["id"],
    "query": {
        "bool": {
            "must" : [
                {"term" : { "cid" : "177" }},
                {
                    "nested": {
                        "path": "a",
                        "query": {
                            "bool":{
                                "must":[
                                    {"term" : { "a.attribute": "tags"}},
                                    {"terms" : { "a.key": ["flat","heel"]}}       
                                ]
                            }
                        }
                    }
                },
                {
                    "nested": {
                        "path": "a",
                        "query": {
                            "bool":{
                                "must":[
                                    {"term" : { "a.attribute": "tags"}},
                                    {"term" : { "a.key": "economic"}}       
                                ]
                            }
                        }
                    }
                },
                {
                    "nested": {
                        "path": "a",
                        "query": {
                            "bool":{
                                "must":[
                                    {"term" : { "a.attribute": "weight" } },
                                    {"range": { "a.num": {"lt": 20} } }
                                ]
                            }
                        }
                    }
                }                 
            ]            
        }
    }
}
  1. Is this query correct, or am I getting the correct results by chance?
  2. Is the query (or mapping) optimal, or should I rethink something?
  3. Can the query be simplified?

1 Answer

  1. The query is correct.
  2. The mapping is great and the query is optimal.
  3. The query can be shortened, for example with query_string:
{
  "_source": [
    "id"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "cid": "177"
          }
        },
        {
          "nested": {
            "path": "a",
            "query": {
              "query_string": {
                "query": "a.attribute:tags AND ((a.key:flat OR a.key:heel) AND a.key:economic)"
              }
            }
          }
        },
        {
          "nested": {
            "path": "a",
            "query": {
              "query_string": {
                "query": "a.attribute:weight AND a.num:<20"
              }
            }
          }
        }
      ]
    }
  }
}

That said, this form would be less optimal, because the query_string queries still have to be parsed internally into essentially the same query DSL that you already have. You'd also still need the two separate nested groups, so you're good to go with what you've got.
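
If the repetition is what bothers you, one option is to keep the same DSL and generate it in client code. This is only a sketch, assuming Python on the client side; the helper names `nested_attr`, `key_in` and `num_range` are hypothetical and not part of any Elasticsearch client. It builds the request body as a plain dict, and it uses `terms` with a single value where your query uses `term`, which matches the same documents.

```python
import json


def nested_attr(attribute, condition):
    """Wrap one condition in a nested query on path "a" for one attribute."""
    return {
        "nested": {
            "path": "a",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"a.attribute": attribute}},
                        condition,
                    ]
                }
            },
        }
    }


def key_in(attribute, values):
    """Attribute has at least one of the given keywords (OR semantics)."""
    return nested_attr(attribute, {"terms": {"a.key": list(values)}})


def num_range(attribute, **bounds):
    """Numeric attribute within bounds, e.g. num_range("weight", lt=20)."""
    return nested_attr(attribute, {"range": {"a.num": bounds}})


# cid="177" and (tag="flat" or tag="heel") and tag="economic" and weight<20
body = {
    "_source": ["id"],
    "query": {
        "bool": {
            "must": [
                {"term": {"cid": "177"}},
                key_in("tags", ["flat", "heel"]),
                key_in("tags", ["economic"]),
                num_range("weight", lt=20),
            ]
        }
    },
}

# JSON request body to send to the _search endpoint.
payload = json.dumps(body, indent=2)
```

Each AND-ed condition still gets its own nested clause, exactly as in your query, so the generated body has the same structure and only the boilerplate moves into the helpers.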
