
I'm trying to improve query performance. It takes an average of about 3 seconds for simple queries which don't even touch a nested document, and it's sometimes longer.

curl "http://searchbox:9200/global/user/_search?n=0&sort=influence:asc&q=user.name:Bill%20Smith"

Even without the sort it takes seconds. Here are the details of the cluster:

1.4 TB index size.
210m non-nested documents (about 10 KB each).
500m documents in total (nested documents are small: 2-5 fields).
About 128 segments per node.
3 nodes, m2.4xlarge (-Xmx set to 40g; machine memory is 60g).
3 shards.
Index is on Amazon EBS volumes.
Replication 0 (have tried replication 2 with only a little improvement).

I don't see any noticeable spikes in CPU/memory etc. Any ideas how this could be improved?

2 Answers


Garry's points about heap space are true, but it's probably not heap space that's the issue here.

With your current configuration, you'll have less than 60 GB of page cache available across the cluster for a 1.4 TB index. With less than 4.2% of your index in the page cache, there's a high probability you'll need to hit disk for most of your searches.

You probably want to add more memory to your cluster, and you'll want to think carefully about the number of shards as well. Just sticking to the default can cause skewed distribution. If you had five shards in this case, you'd have two machines with 40% of the data each, and a third with just 20%. In either case, you'll always be waiting for the slowest machine or disk when doing distributed searches. This article on Elasticsearch in Production goes a bit more in depth on determining the right amount of memory.

For this exact search example, though, you can probably use filters. Since you're sorting, you're ignoring the score calculated by the query. With a filter, the result is cached after the first run, and subsequent searches will be quick.
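As a sketch of that suggestion (assuming a 1.x-era cluster, where the filtered query and filter caching are available), the original query-string search could be rewritten as a term filter:

```shell
# Hypothetical rewrite of the search from the question: move the name match
# into a cached term filter and keep the sort. Host, index, and field names
# are taken from the question.
curl -XPOST "http://searchbox:9200/global/user/_search" -d '{
  "size": 0,
  "sort": [ { "influence": "asc" } ],
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "term": { "user.name": "bill smith" }
      }
    }
  }
}'
```

Note that a term filter is not analyzed, so the value must match the indexed tokens exactly; the lowercased value above assumes the field went through the standard analyzer.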


9 Comments

Thanks for this. I've changed some of our source to use filters, but they're still not quick enough. The strange thing is that we had a cluster with a similar number of documents (minus the nested ones), but with far fewer fields, and queries were much faster (ms). This was on half as much hardware, and the index was still too big to fit in RAM. I'm currently reindexing the data with index: no, store: false on a whole load of fields. Any idea if that could help?
Number of fields isn't really the issue; it's whether the index pages necessary to answer the majority of your searches are in the page cache. While it's nice to bring the index size down, cutting away data that wasn't really used anyway won't improve things much.
Removing fields won't help, no. However, if you have all of your fields set to "stored=true" then they will form part of the index, which resides in RAM. All fields are stored as part of the document on disk; just make sure you don't have unnecessary fields set to be stored as part of the index.
By default, Elasticsearch stores the entire original document as _source, and no other fields. Adding more fields will obviously make the on-disk "result object" that is fetched for the returned hits larger, but these things don't reside persistently in RAM.

Ok, a few things here:

  1. Decrease your heap size. You have over 32gb of heap dedicated to each Elasticsearch instance, and the JVM can't use compressed pointers above roughly 32gb. Drop your nodes to 32gb and, if you need to, spin up another instance.
  2. If spinning up another instance isn't an option and 32gb on 3 nodes isn't enough to run ES, then you'll have to bump your heap memory to somewhere over 48gb!
  3. I would probably stick with the default settings for shards and replicas: 5 shards, 1 replica. However, you can tweak the shard settings to suit. What I would do is reindex the data into several indices under several different conditions: the first index with only 1 shard, the second with 2 shards, and so on up to 10 shards. Query each index and see which performs best. If the 10-shard index is the best performer, keep increasing the shard count until performance gets worse; then you've hit your shard limit.
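A rough sketch of that experiment (the test-index names are hypothetical; on a 1.x-era cluster the shard count is fixed at index-creation time, so each candidate needs its own index):

```shell
# Create one test index per shard count, reindex a sample of the data into
# each, then compare query latency across them. Index names are made up here.
for shards in 1 2 3 4 5 6 7 8 9 10; do
  curl -XPUT "http://searchbox:9200/global_test_${shards}" -d '{
    "settings": {
      "number_of_shards": '"${shards}"',
      "number_of_replicas": 0
    }
  }'
done
```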

One thing to think about though, sharding might increase search performance but it also has a massive effect on index time. The more shards the longer it takes to index a document...

You also have quite a bit of data stored; maybe you should look at custom routing too.
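For illustration, custom routing sends all documents with the same routing value to a single shard, so a search supplying that value only touches one shard instead of fanning out to all of them. The routing value and document below are assumptions, not from the question:

```shell
# Index a document with an explicit routing value...
curl -XPUT "http://searchbox:9200/global/user/1?routing=bill_smith" -d '{
  "name": "Bill Smith",
  "influence": 42
}'

# ...then search with the same routing value, so only the shard holding
# that routing key is queried.
curl "http://searchbox:9200/global/user/_search?routing=bill_smith&q=name:Bill%20Smith"
```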

