For simplicity, suppose I have an Elasticsearch index with documents like these:

{"id": 1, "tags": ["t1", "t2", "t3"]}, 
{"id": 2, "tags": ["t1", "t4", "t5"]}

I need to aggregate on a few specific tags only, without getting buckets for the other tags that appear in the matching documents:

{
  "aggs": {
    "tags": {
      "terms": {"field": "tags"}
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {"tags": ["t1", "t2"]}
        }
      ]
    }
  }
}

# RESULT
{
    "aggregations": {
        "tags": {
            "buckets": [
                {"doc_count": 2, "key": "t1"},
                {"doc_count": 1, "key": "t2"},
                {"doc_count": 1, "key": "t3"},  # should be removed by filter
                {"doc_count": 1, "key": "t4"},  # should be removed by filter
                {"doc_count": 1, "key": "t5"},  # should be removed by filter
            ],
        }
    },
    "hits": {
        "hits": [],
        "max_score": 0.0,
        "total": 2
    },
}

How can I filter (maybe post-filter) this result?

With this tiny index that means only 3 extra buckets (t3, t4, t5). But my real index has more than 200K documents, and there it's horrible: I need to aggregate by 50 tags, but I get a result with more than 1K tags.

1 Answer


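Assuming that your version of Elasticsearch supports it, I would use the "include" parameter of the terms aggregation. The query filter only selects which documents are counted, while "include" restricts which terms become buckets. Your query would look like this: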

POST /test/_search
{
  "aggs": {
    "tags": {
      "terms": {"field": "tags",  "include": ["t1", "t2"]}
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {"tags": ["t1", "t2"]}
        }
      ]
    }
  }
}
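
To see why the query filter alone is not enough, here is a rough Python sketch that builds the same request body and then imitates the aggregation on the two sample documents in memory. The helper names build_search_body and simulate_terms_agg are made up for this illustration, and the simulation is only a stand-in for what Elasticsearch does, not its real implementation. One thing I added on my own: the terms aggregation returns only 10 buckets by default, so with 50 tags you will probably also want to set "size".

```python
from collections import Counter

# The two sample documents from the question.
DOCS = [
    {"id": 1, "tags": ["t1", "t2", "t3"]},
    {"id": 2, "tags": ["t1", "t4", "t5"]},
]


def build_search_body(wanted_tags):
    """Hypothetical helper: build the request body for the given tags."""
    wanted = list(wanted_tags)
    return {
        "query": {"bool": {"filter": [{"terms": {"tags": wanted}}]}},
        "aggs": {
            "tags": {
                "terms": {
                    "field": "tags",
                    "include": wanted,    # only these exact terms become buckets
                    "size": len(wanted),  # default is 10 buckets, too few for 50 tags
                }
            }
        },
    }


def simulate_terms_agg(docs, body):
    """Local stand-in for Elasticsearch: filter documents, then count terms."""
    filter_tags = set(body["query"]["bool"]["filter"][0]["terms"]["tags"])
    terms = body["aggs"]["tags"]["terms"]
    include = terms.get("include")

    # Step 1: the query filter keeps every document that has at least one wanted tag.
    matched = [d for d in docs if filter_tags & set(d["tags"])]

    # Step 2: the aggregation counts ALL tags of the matched documents,
    # unless "include" limits which terms may form a bucket.
    counts = Counter(
        tag
        for d in matched
        for tag in dict.fromkeys(d["tags"])  # count each tag once per document
        if include is None or tag in include
    )

    # Order by doc_count descending, then by key, and cut off at "size".
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": k, "doc_count": c} for k, c in ordered[: terms.get("size", 10)]]


body = build_search_body(["t1", "t2"])
buckets = simulate_terms_agg(DOCS, body)
print(buckets)
# [{'key': 't1', 'doc_count': 2}, {'key': 't2', 'doc_count': 1}]
```

If you drop "include" from the body, the same simulation gives back t1 to t5, which is exactly the unwanted result from the question.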
