For a document indexed in Elasticsearch like this:

{
  "a": [
    {
      "b": [1,2,3],
      "c": "abcd1"
    },
    {
      "b": [4,5,6,7],
      "c": "abcd2"
    }
  ]
}

Can we apply source filtering so that the query returns only the b fields from every object in a?

I've tried something like this:

{
  "_source": {
    "excludes": [
      "a[*].c"
    ]
  },
  "query": {
    "match_all": {}
  }
}

But it didn't work.

1 Answer

Since "a" is an array of objects, you need to define "a" as a nested datatype to accomplish what you want. Please read the "Arrays of objects" note here: https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html

So you have to define the "a" property as a nested type in the mapping. Following your example, the steps are:

1.- Define the mapping

curl -XPUT 'localhost:9200/my_index?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "_doc": {
      "properties": {
        "a": {
          "type": "nested" 
        }
      }
    }
  }
}
'

2.- Create document 1 with your sample data:

curl -XPUT 'localhost:9200/my_index/_doc/1?pretty' -H 'Content-Type: application/json' -d'
{
  "a" : [
    {
      "b" : [1,2,3],
      "c" : "abcd1"
    },
    {
      "b" : [4,5,6,7],
      "c" : "abcd2"
    }
  ]
}
'

3.- Here is how your query should look. Note nested.path, which specifies the path where the nested query starts, followed by the normal query:

curl -XGET 'localhost:9200/my_index/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "_source": "a.b",
  "query": {
    "nested": {
      "path": "a",
      "query": {
        "match_all": {}
      }
    }
  }
}
'

And this is the result, with only the b field in each object:

"took" : 4,
"timed_out" : false,
"_shards" : {
  "total" : 5,
  "successful" : 5,
  "skipped" : 0,
  "failed" : 0
},
"hits" : {
  "total" : 1,
  "max_score" : 1.0,
  "hits" : [
    {
      "_index" : "my_index",
      "_type" : "_doc",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : {
        "a" : [
          {
            "b" : [1, 2, 3]
          },
          {
            "b" : [4, 5, 6, 7]
          }
        ]
      }
    }
  ]
}
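
The question tried to exclude c rather than include b. As far as I know, source filtering paths use dot notation (optionally with * wildcards), not bracket syntax such as a[*].c, so the exclude form of the same request would be written with "a.c". The sketch below assumes the same local node and the my_index index from the steps above; the two function names are hypothetical helpers, not part of any Elasticsearch tooling.

```shell
# Hypothetical helper: print a search body that excludes a.c
# instead of including a.b. Paths use dot notation, not a[*].c.
build_exclude_body() {
  cat <<'EOF'
{
  "_source": {
    "excludes": ["a.c"]
  },
  "query": {
    "nested": {
      "path": "a",
      "query": {
        "match_all": {}
      }
    }
  }
}
EOF
}

# Hypothetical helper: send the body to the index created in step 1.
# Assumes Elasticsearch is listening on localhost:9200.
search_without_c() {
  curl -XGET 'localhost:9200/my_index/_search?pretty' \
    -H 'Content-Type: application/json' \
    -d "$(build_exclude_body)"
}
```

Calling search_without_c against the sample index should return hits whose _source omits c from each object in a; I have not included the response here because it depends on your cluster.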

Here is the Elasticsearch reference for the nested datatype: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html

1 Comment

If a[...] is the response from an aggregation query (e.g. an avg aggregation inside a terms aggregation) and I want only b (excluding c), how do I apply source filtering?
