Due to low disk space and a large number of deleted documents in one of my indices, I need to run the optimize command (Elasticsearch 1.7).

Right now, the index has the following stats:

shards: 15 * 1 | docs: 23,165,760 | size: 1.25TB

  • Will the optimize API block indexing/query operations until the optimization is done?
  • Will the optimize API affect operations on the other indices?
  • Is it possible to get an approximate estimate of how long it will take?

Sorry for my bad English :)

And let me know if you need any further stats

1 Answer

Will the optimize API block indexing/query operations until the optimization is done?

No, it can run in parallel with indexing and searching, but ongoing indexing will affect the optimization: new segments keep being created, and those are subject to merging as well.

Will the optimize API affect operations on the other indices?

Not directly, but indirectly, because it uses additional CPU, memory and disk.

Is it possible to have an approximate time to know how long it will take?

No :-), other than by testing upfront and extrapolating to the number of documents/segments.
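As a rough illustration of that extrapolation, here is a minimal sketch. It assumes merge time scales about linearly with index size on disk, which is only an approximation (deleted-document ratio, segment layout and concurrent load all shift it). The function name and the test measurement (a 50 GB copy taking 20 minutes) are hypothetical.

```python
def estimate_optimize_seconds(sample_bytes, sample_seconds, target_bytes):
    """Linearly scale a measured optimize duration to a larger index.

    Assumption: duration is proportional to bytes on disk. Treat the result
    as an order-of-magnitude guess, not a prediction.
    """
    if sample_bytes <= 0:
        raise ValueError("sample_bytes must be positive")
    return sample_seconds * (target_bytes / sample_bytes)


GB = 1024 ** 3

# Hypothetical measurement: optimizing a 50 GB test index took 20 minutes.
# Target: the 1.25 TB (1280 GB) index from the question.
estimate = estimate_optimize_seconds(50 * GB, 20 * 60, 1280 * GB)
print(f"rough estimate: {estimate / 3600:.1f} hours")
```

Running the test on a copy of a single shard, under a load similar to production, should make the extrapolation less misleading.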

Be aware that optimization requires additional disk space. If you optimize down to a very low number of segments, the final merges will most likely combine a set of very large segments, which means they need up to an additional (largeSegment1_size + largeSegment2_size + ...) of disk space. The old segments are deleted only once the resulting merged segment is complete.
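The disk-space rule above can be turned into a quick pre-flight check. This is a sketch only: the segment sizes, the free-space figure and the 10% safety margin are made-up values, and the helper names are not part of any Elasticsearch API. It treats the sum of the input segments as a worst case, since the merged segment cannot be larger than its inputs combined.

```python
def merge_headroom_bytes(segment_sizes):
    """Worst-case temporary space needed to merge these segments into one.

    The merged segment is fully written before the old segments are
    deleted, so up to the combined size of the inputs is needed on top
    of what is already on disk.
    """
    return sum(segment_sizes)


def can_merge_safely(free_bytes, segment_sizes, safety_factor=1.1):
    """True if free space covers the worst case plus a safety margin."""
    return free_bytes >= merge_headroom_bytes(segment_sizes) * safety_factor


GB = 1024 ** 3

# Hypothetical numbers for one shard: three large segments, 12 GB free.
largest_segments = [5 * GB, 5 * GB, 4 * GB]
free = 12 * GB

print(can_merge_safely(free, largest_segments))
```

Since merges run per shard, the check would need to be applied on each data node with the segments of the shards it hosts.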

Also, look at the only_expunge_deletes option as an alternative.
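For reference, a small sketch of how such a request could be built against the 1.x _optimize endpoint. The host `http://localhost:9200` and the index name `my_index` are placeholders, and the helper names are hypothetical. `send_optimize` is defined but deliberately not called here, because the request would start a real merge.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def optimize_url(base_url, index, only_expunge_deletes=False, max_num_segments=None):
    """Build the URL for POST /<index>/_optimize (Elasticsearch 1.x)."""
    params = {}
    if only_expunge_deletes:
        params["only_expunge_deletes"] = "true"
    if max_num_segments is not None:
        params["max_num_segments"] = str(max_num_segments)
    url = f"{base_url.rstrip('/')}/{index}/_optimize"
    return f"{url}?{urlencode(params)}" if params else url


def send_optimize(url):
    """Send the POST request. Not invoked in this example."""
    with urlopen(Request(url, data=b"", method="POST")) as response:
        return response.read().decode("utf-8")


# Placeholder host and index name.
url = optimize_url("http://localhost:9200", "my_index", only_expunge_deletes=True)
print(url)  # http://localhost:9200/my_index/_optimize?only_expunge_deletes=true
```

With only_expunge_deletes the goal is to rewrite only the segments that contain deleted documents instead of merging down to a target segment count, which is typically cheaper in temporary disk space.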

Another piece of advice: run the optimization when there is less load on the cluster. As mentioned, it requires additional CPU, memory and disk resources.

6 Comments

Thank you Andrei :) Is it possible to stop the optimization process once it has started? Maybe by stopping the index? Is there any flag that tells me whether the optimization has finished?
I don't think it's possible to stop the optimization. You can check whether there are active optimizations with GET /_nodes/stats/thread_pool and look for the optimize section.
It seems the optimize has finished, but no disk space has been released. I just called the optimize command from the kopf plugin, so I suppose the only_expunge_deletes parameter was not set. Do I have to send the command POST /my_index/_optimize?only_expunge_deletes=true? I thought the optimize API would free disk space even without that parameter.
Are you sure you had documents marked as deleted in that index?
With the only_expunge_deletes parameter everything is working! :) Now I have much more free disk space. The optimization is still in progress. Right now, in the optimize thread pool stats, I see queue: 75, completed: 29 and threads: 1. Is it possible to set a larger number of optimize threads so that it finishes faster?
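The thread pool check discussed in these comments could be scripted roughly as follows. This is a sketch that parses a response shaped like the output of GET /_nodes/stats/thread_pool. The sample JSON is hand-written: the queue, completed and threads values mirror the numbers quoted above, while the node id, node name and the active count are assumptions.

```python
import json


def optimize_pool_summary(stats):
    """Map node name -> optimize thread pool counters from a nodes-stats response."""
    summary = {}
    for node_id, node in stats.get("nodes", {}).items():
        pool = node.get("thread_pool", {}).get("optimize")
        if pool is None:
            continue  # node reports no optimize pool
        summary[node.get("name", node_id)] = {
            key: pool.get(key, 0)
            for key in ("threads", "active", "queue", "completed")
        }
    return summary


def optimize_in_progress(stats):
    """True if any node still has active or queued optimize tasks."""
    pools = optimize_pool_summary(stats).values()
    return any(p["active"] > 0 or p["queue"] > 0 for p in pools)


# Hand-written sample response. Node id, name and "active" are assumed values.
sample = json.loads("""
{
  "nodes": {
    "node-id-1": {
      "name": "data-node-1",
      "thread_pool": {
        "optimize": {"threads": 1, "queue": 75, "active": 1, "completed": 29}
      }
    }
  }
}
""")

print(optimize_pool_summary(sample))
print(optimize_in_progress(sample))
```

Polling this until both active and queue drop to zero on every node gives a simple "has it finished" signal.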