I have an Elasticsearch cluster with hundreds of indices. Is there any way to list (search) indices using a boolean query? e.g.

( index.alias:*read_index* AND doc.count:<1000 ) OR ( index.name* ) OR (index.size:<2gb) OR (index.replica:>2)

I need to filter out the required indices from a list of hundreds of indices.

Kindly suggest.

  • Don't think this is possible unless you create a new index containing this information and search that. Commented Jul 21, 2018 at 6:08
  • Using a clever mix of the _cat APIs + jq + awk you should be able to achieve what you want. Commented Jul 21, 2018 at 6:56
  • What are jq and awk? Commented Jul 21, 2018 at 7:05
  • @MohammadShahid are you using X-Pack Monitoring? If so, it would be easy to run that query on the .monitoring indices that do keep info about the monitored indices. Commented Jul 25, 2018 at 5:45
  • @AndreiStefan I am making a gui tool to manage elasticsearch indices. Commented Jul 25, 2018 at 12:47

1 Answer


You can do this with plain Elasticsearch bool queries :). Just store the JSON-formatted _cat output into an index, then run the queries you need; automate the collection with a cronjob that gathers this every X minutes. My Python script looks like this:

# install dependency: pip install requests
import requests
import json

ES_URL = "http://localhost:9200"

# fetch index stats from the _cat API as JSON, with sizes in megabytes
res = requests.get("{}{}".format(ES_URL, "/_cat/indices"),
                   params={"format": "json", "bytes": "m"})
res.raise_for_status()

for index_info in res.json():
    # use the index name as the document ID, so each run overwrites
    # the previous values instead of accumulating duplicates
    index_url = "{}/{}/{}/{}".format(
        ES_URL, "cat_to_index", "doc", index_info["index"]
    )

    requests.put(
        index_url,
        data=json.dumps(index_info),
        headers={'Content-type': 'application/json'}
    )

# ready to query http://localhost:9200/cat_to_index/_search
# ready to keep up-to-date with a cronjob; as the index name is the
# document ID, new values will overwrite the old ones
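Once cat_to_index is populated, the boolean query from the question can be expressed as a bool/should query. This is a sketch: the field names (index, rep, docs.count, store.size) come from the _cat/indices JSON output, which returns the numbers as strings, so an explicit mapping (or a conversion step) would be needed for the range clauses to compare numerically. _cat/indices also has no alias column, so the wildcard below matches the index name; alias filtering would need _cat/aliases indexed as well.

```python
import requests

# bool query mirroring the question's example:
# (name matches *read_index*) OR (doc count < 1000)
# OR (size < 2 GB) OR (replicas > 2)
QUERY = {
    "query": {
        "bool": {
            "should": [
                {"wildcard": {"index": "*read_index*"}},
                {"range": {"docs.count": {"lt": 1000}}},
                {"range": {"store.size": {"lt": 2048}}},  # MB, since bytes=m
                {"range": {"rep": {"gt": 2}}},
            ],
            "minimum_should_match": 1,
        }
    }
}


def matching_indices(es_url="http://localhost:9200"):
    """Return the names of the indices matching QUERY."""
    res = requests.get("{}/cat_to_index/_search".format(es_url), json=QUERY)
    res.raise_for_status()
    return [hit["_source"]["index"] for hit in res.json()["hits"]["hits"]]
```

minimum_should_match: 1 makes the should clauses behave like the OR in the question.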

Hope it helps.


3 Comments

This is not realtime; I would have to index the data first before using it. In my case, I have to hit the search API every second and the data must be correct every time (e.g. doc count etc.)
It's not possible directly with the _cat API; the only option you have is to use the res = requests.get line and do an in-memory search directly from the script.
This is not the only option; the other one by @AndreiStefan is also viable, i.e. use X-Pack Monitoring and query the .monitoring indices.
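For the realtime case, the in-memory search suggested in the comments can be sketched like this: fetch /_cat/indices?format=json&bytes=m on every request and filter the parsed rows in Python. Field names (index, rep, docs.count, store.size) follow the _cat JSON output; since _cat/indices does not list aliases, the wildcard here is applied to the index name (alias matching would need an extra _cat/aliases call):

```python
from fnmatch import fnmatch


def filter_indices(cat_rows):
    """Apply the question's boolean filter to parsed _cat/indices JSON rows."""
    matched = []
    for row in cat_rows:
        if (fnmatch(row["index"], "*read_index*")   # name wildcard
                or int(row["docs.count"]) < 1000    # doc count
                or int(row["store.size"]) < 2048    # size in MB (bytes=m)
                or int(row["rep"]) > 2):            # replica count
            matched.append(row["index"])
    return matched
```

This avoids the extra index entirely, at the cost of re-fetching the _cat output on every search.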
