Elasticsearch - count per index for one query

Question

In an Elasticsearch cluster I have about 30 indices with the same structure.

I need to find out which of the indices would return at least 1 result for my query.

The results themselves do not matter. I will make the business logic decisions based on the names of the indices that contain at least 1 document satisfying the search criteria.

Depending on the input, the search might return anywhere from 0 to ~10 000 000 hits across all indices. The search will be performed ~50 000 times with different inputs.

I see the following solutions:

1. Use the search API with scrolling and inspect every hit to see which index it came from. This is what is currently implemented, and I'm looking for a faster solution.
2. Use the count API and run one count per index. This means more requests; might it still be faster?
3. Is there another possibility or API available?

Also try _search?size=0 instead of search_type=count.
– A l w a y s S u n n y, commented Apr 12, 2020 at 5:20

glenacota (edited by D. Schmidt) · Accepted Answer · 2020-04-18 02:02:04Z

3

I would use a terms bucket aggregation (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) over the _index metadata field. Each bucket in the response is then an index with at least 1 hit, and its doc_count is the number of matching documents in that index.

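E.g., with match_all standing in for your own query and a top-level "size": 0 so that no hits are returned, only the buckets:

{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "group_by_index": {
      "terms": {
        "field": "_index",
        "size": 30
      }
    }
  }
}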
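
For what it's worth, here is a minimal Python sketch of how the response to a request like the one above could be reduced to a list of index names. It assumes the usual terms aggregation response shape (aggregations, then the aggregation name, then buckets with key and doc_count). The helper names build_body and indices_with_hits are hypothetical, and sample_response is made-up data for illustration, not real cluster output.

```python
def build_body(query, max_indices=30):
    """Request body: no hits returned, one bucket per matching index."""
    return {
        "size": 0,  # only the aggregation matters, so skip the hits
        "query": query,
        "aggs": {
            "group_by_index": {
                # size must cover the number of indices being queried
                "terms": {"field": "_index", "size": max_indices}
            }
        },
    }


def indices_with_hits(response):
    """Names of the indices holding at least one matching document."""
    buckets = response["aggregations"]["group_by_index"]["buckets"]
    return [b["key"] for b in buckets if b["doc_count"] >= 1]


# Made-up response, for illustration only
sample_response = {
    "aggregations": {
        "group_by_index": {
            "buckets": [
                {"key": "my_index_1", "doc_count": 1200},
                {"key": "my_index_7", "doc_count": 3},
            ]
        }
    }
}

body = build_body({"match_all": {}})  # placeholder query
matching = indices_with_hits(sample_response)
print(matching)
```

Keeping the body builder and the response parser as plain functions means they can be used with whatever HTTP or Elasticsearch client is already in place.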

edited Apr 18, 2020 at 2:02 by D. Schmidt

answered Apr 11, 2020 at 20:46 by glenacota


1 Comment

D. Schmidt Over a year ago

I just had to set size to the number of indices that I query. Otherwise some indices were missing from the result.
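
Some background on the comment above: the terms aggregation returns only the top 10 buckets unless size is set, so with about 30 indices some can be cut off. One possible way to detect that, sketched in Python, is to look at the sum_other_doc_count field of the aggregation response, which counts the documents that fell into buckets not returned. The helper name buckets_truncated and both sample responses are hypothetical.

```python
def buckets_truncated(response, agg_name="group_by_index"):
    """True if some matching indices were cut off by the terms size."""
    agg = response["aggregations"][agg_name]
    # Documents counted here belong to buckets that were not returned
    return agg.get("sum_other_doc_count", 0) > 0


# Made-up responses, for illustration only
truncated = {"aggregations": {"group_by_index": {
    "sum_other_doc_count": 42,
    "buckets": [{"key": "my_index_1", "doc_count": 5}],
}}}
complete = {"aggregations": {"group_by_index": {
    "sum_other_doc_count": 0,
    "buckets": [{"key": "my_index_1", "doc_count": 5}],
}}}

print(buckets_truncated(truncated), buckets_truncated(complete))
```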

pkp9999 · Answer · 2020-04-11 22:10:11Z

0

I would use the aggs as @glenacota mentioned. In addition, you can run the request over multiple indices, or against an alias pointing to all 30 of your indices, like this (note that there is no space after the comma):

GET my_index_1,another_index_*/_search?size=0

That said, I would also recommend profiling the query to see how it fares against your cluster, given the large number of indices, their document counts, and the number of requests.
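
A rough Python sketch of both points, under stated assumptions: search_path and with_profiling are hypothetical helper names, and "profile" is taken here to mean Elasticsearch's built-in search profiler, which is switched on with "profile": true in the request body. Timing the requests from the client side would be another reasonable reading.

```python
def search_path(targets):
    """Build the _search path for several index names or wildcard patterns."""
    cleaned = [t.strip() for t in targets if t.strip()]
    if not cleaned:
        raise ValueError("at least one index name or pattern is required")
    # Targets are joined with commas and no spaces in between
    return "/" + ",".join(cleaned) + "/_search?size=0"


def with_profiling(body):
    """Copy of the request body with the search profiler switched on."""
    profiled = dict(body)
    profiled["profile"] = True
    return profiled


path = search_path(["my_index_1", " another_index_*"])
body = with_profiling({"query": {"match_all": {}}})  # placeholder query
print(path)
print(body)
```

Profiling adds overhead of its own, so it is probably best enabled on a handful of representative inputs and left off for the full ~50 000 runs.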

answered Apr 11, 2020 at 22:10 by pkp9999
