
How can I count the number of objects in a nested field in Elasticsearch? Sample mapping:

"base_keywords": {
  "type": "nested",
  "properties": {
    "base_key": {
      "type": "text"
    },
    "category": {
      "type": "text"
    },
    "created_at": {
      "type": "date"
    },
    "date": {
      "type": "date"
    },
    "rank": {
      "type": "integer"
    }
  }
}

I would like to count the number of objects in the nested field 'base_keywords'.

3 Answers

2

You would need to do this with an inline script. This is what worked for me (using ES 6.x):

GET your-indices/_search
{
  "aggs": {
    "whatever": {
      "sum": {
        "script": {
          "source": "params._source.base_keywords.size()"
        }
      }
    }
  }
}
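
To make clear what that sum computes, here is a minimal Python sketch that mirrors the script on a few in-memory documents. The sample documents and the helper name `total_nested_objects` are made up for illustration and are not part of the original answer. One difference to keep in mind: the sketch treats a missing `base_keywords` as an empty list, whereas the script above has no such fallback and would likely fail on documents that lack the field.

```python
# Hypothetical sample documents; only the length of "base_keywords" matters here.
docs = [
    {"base_keywords": [{"base_key": "shoes", "rank": 1},
                       {"base_key": "boots", "rank": 2}]},
    {"base_keywords": [{"base_key": "hats", "rank": 5}]},
    {"base_keywords": []},
]


def total_nested_objects(documents):
    """Mirror the sum aggregation: add up the array length of every document."""
    # Assumption: a document without the field counts as zero nested objects.
    return sum(len(doc.get("base_keywords", [])) for doc in documents)


total = total_nested_objects(docs)
print(total)  # 3
```

The result is a single number for the whole result set, not a count per document.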

1

Aggs are normally good for counting and grouping. For nested documents you can use nested aggs:

  "aggs": {
    "MyAggregation1": {
      "terms": {
        "field": "FieldA",
        "size": 0
      },
      "aggs": {
        "BaseKeyWords": {
          "nested": { "path": "base_keywords" },
          "aggs": {
            "BaseKeys": {
              "terms": {
                "field": "base_keywords.base_key.keyword",
                "size": 0
              }
            }
          }
        }
      }
    }
  }

You don't specify what you want to count, but aggs are quite flexible for grouping and counting data. The "key" and "doc_count" of each bucket behave much like a SQL GROUP BY with COUNT().
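
As a rough illustration of that GROUP BY analogy, the following Python sketch groups a list of values and emits them in the same key/doc_count shape that a terms aggregation returns. The sample values and the helper name `terms_buckets` are hypothetical; this runs locally and does not talk to Elasticsearch.

```python
from collections import Counter

# Hypothetical "FieldA" values, one per document.
field_a_values = ["red", "blue", "red", "green", "red", "blue"]


def terms_buckets(values):
    """Group equal values and count them, most frequent first."""
    return [{"key": key, "doc_count": count}
            for key, count in Counter(values).most_common()]


buckets = terms_buckets(field_a_values)
for bucket in buckets:
    print(bucket["key"], bucket["doc_count"])
```

Each distinct value becomes one bucket, and the doc_count values add up to the number of input documents.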

Updated (this assumes you have a .keyword sub-field for the "keys" values, since a property of type "text" can't be aggregated or counted by default):

{
  "aggs": {
    "MyKeywords1Agg": {
      "nested": { "path": "keywords1" },
      "aggs": {
        "NestedKeywords": {
          "terms": {
            "field": "keywords1.keys.keyword",
            "size": 0
          }
        }
      }
    }
  }
}

To simply count the number of nested objects, a bare nested aggregation is enough; its "doc_count" in the response is the number of nested objects under that path:

{
  "aggs": {
    "MyKeywords1Agg": {
      "nested": { "path": "keywords1" }
    }
  }
}
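
Reading the count out of the response could look like the sketch below. The response dict is a trimmed, made-up example (a real response also contains "took", "hits" and so on), and `nested_count` is a hypothetical helper name.

```python
# Hypothetical, trimmed search response for the nested aggregation above.
response = {
    "aggregations": {
        "MyKeywords1Agg": {"doc_count": 42}
    }
}


def nested_count(resp, agg_name="MyKeywords1Agg"):
    """Return the number of nested objects reported by a nested aggregation."""
    return resp["aggregations"][agg_name]["doc_count"]


print(nested_count(response))  # 42
```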

If you want to group by field values on the "main" document or on the nested documents, you will have to extend your mapping / data model to include fields that are aggregatable. That covers most data types in Elasticsearch except "text", e.g. dates, numbers, geolocations and keywords.

Edit: an example that aggregates on a unique identifier for each top-level document, assuming it has an integer property called "WordMappingId":

{
  "aggs": {
    "word_maping_agg": {
      "terms": {
        "field": "WordMappingId",
        "size": 0,
        "missing": -1
      },
      "aggs": {
        "Keywords1Agg": {
          "nested": { "path": "keywords1" }
        }
      }
    }
  }
}
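
With a terms aggregation on "WordMappingId" and a nested sub-aggregation named "Keywords1Agg", each bucket should carry the nested object count for one identifier. A small Python sketch for turning such a response into a lookup table follows; the response contents and the helper name `nested_counts_by_id` are invented for illustration.

```python
# Hypothetical, trimmed response: one bucket per WordMappingId value.
response = {
    "aggregations": {
        "word_maping_agg": {
            "buckets": [
                {"key": 101, "doc_count": 1, "Keywords1Agg": {"doc_count": 4}},
                {"key": 102, "doc_count": 1, "Keywords1Agg": {"doc_count": 0}},
                # Bucket created by "missing": -1 for documents without an id.
                {"key": -1, "doc_count": 2, "Keywords1Agg": {"doc_count": 7}},
            ]
        }
    }
}


def nested_counts_by_id(resp):
    """Map each WordMappingId bucket key to its nested object count."""
    buckets = resp["aggregations"]["word_maping_agg"]["buckets"]
    return {b["key"]: b["Keywords1Agg"]["doc_count"] for b in buckets}


counts = nested_counts_by_id(response)
print(counts[101])  # 4
```

Note that the bucket's own doc_count is the number of top-level documents, while the count of nested objects sits one level down in the sub-aggregation.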

If you don't add any properties to the "word_maping" document on the top level, there is no way to aggregate per unique document. The built-in _id field is not aggregatable by default, so I suggest you include a unique identifier from the source data on the top level and aggregate on that.

Note: the "missing" parameter puts all documents that don't have the WordMappingId property set into a bucket with the supplied value. This makes sure you're not missing any documents in the search results.

Aggs can support behaviour similar to GROUP BY in SQL, but you need something to actually group by, and according to the mapping you supplied there are currently no such fields in your index.

9 Comments

I would like to count number of objects in nested filed 'base_keywords'.
I tried to run the same, but got the following error: { "error": { "root_cause": [ { "type": "illegal_argument_exception", "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [base_keywords.base_key] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead." } ], "type": "search_phase_execution_exception", ...... .... } } ] }, "status": 400 }
That is because the fields are of type "text", which means they are analyzed and stored for free-text search and cannot be used for aggregation. If you're using ES 5.x, these should be available as [fieldname].keyword if you use the default mapping. .keyword fields can be used for terms aggregations. I updated the example on the assumption that a base_key.keyword field exists.
Now I am getting the following result, which contains empty buckets: ..... , "aggregations": { "MyAggregation1": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [] } }
@SubairKB Can you post the mapping from elasticsearch and the example query? Can be found at http://(hostname:port)/(indexname)/_mapping?pretty for the entire index or http://(hostname:port)/(indexname)/(documenttypename)/_mapping?pretty for a specific document type
1

I was trying to do something similar to understand the distribution of our production data. The following query helped me find the top 5:

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "n_base_keywords": {
            "nested": { "path": "base_keywords" },
            "aggs": {
                "top_count": { "terms": { "field": "_id", "size" : 5 } }
            }
        }
    }
}

1 Comment

This one really works
