I am using the following code to scroll over all documents in my Elasticsearch index:

const string indexName = "bla";
var client = GetClient(indexName);
const int scrollTimeout = 1000;

var initialResponse = client.Search<Document>
    (scr => scr.Index(indexName)
    .From(0)
    .Take(100)
    .MatchAll()
    .Scroll(scrollTimeout))
;

List<XYZ> results;
results = new List<XYZ>();

if (!initialResponse.IsValid || string.IsNullOrEmpty(initialResponse.ScrollId))
    throw new Exception(initialResponse.ServerError.Error.Reason);

if (initialResponse.Documents.Any())
    results.AddRange(initialResponse.Documents);

var scrollid = initialResponse.ScrollId;
bool isScrollSetHasData = true;
while (isScrollSetHasData)
{
    var loopingResponse = client.Scroll<XYZ>(scrollTimeout, scrollid);

    if (loopingResponse.IsValid)
    {
        results.AddRange(loopingResponse.Documents);
        scrollid = loopingResponse.ScrollId;
    }
    isScrollSetHasData = loopingResponse.Documents.Any();

    // do some amazing stuff
}

client.ClearScroll(new ClearScrollRequest(scrollid));

For some reason, loopingResponse comes back empty much sooner than expected, i.e. the scroll finishes early. Can anyone see something fundamentally wrong with my code? Thanks!

  • Which versions of NEST and Elasticsearch do you use? Commented Dec 10, 2019 at 17:56
  • 7.0. Thanks for the comment. Commented Dec 10, 2019 at 18:00
  • Are you sure that scrollTimeout is long enough to keep the search context alive? If it is not, the response will come back without documents and the loop will stop there. Commented Dec 10, 2019 at 20:41
  • Thanks. I have to say that I do not fully understand this parameter. Where exactly is the documentation? There are so many removed pages and different pages for different versions. Is this in milliseconds, by the way? Could I set it to some maximum? Thanks! Commented Dec 10, 2019 at 20:54
  • To avoid the confusion, I usually do something like .Scroll(new Time(TimeSpan.FromMinutes(3))). Commented Dec 10, 2019 at 21:36

1 Answer

Looking at your code, I think scrollTimeout could be the problem. Scroll is usually used to return large amounts of data, and 1000 ms may not be enough to keep the search context alive between requests. Try increasing it to several minutes, then tune it to find the best value for your case:

var scrollTimeout = new Time(TimeSpan.FromMinutes(3));

Alternatively, according to the source code, you can pass a string that uses one of the Time units (micros, nanos, ms, s, m, h, or d):

var response = client.Search<Document>(scr => scr.Index(indexName)
    ...
    .Scroll("3m")
    );
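
Putting the two pieces together, a full scroll loop with a longer timeout might look like the sketch below. It is written against NEST 7.x as I understand it and has not been run against a cluster. XYZ is a placeholder for your document type, and ScrollExample and ReadAll are hypothetical names. As far as I know, the timeout only has to cover the gap between two consecutive scroll requests, because each request renews it, but any processing you do inside the loop counts against it. The sketch also throws on an invalid response instead of ending the loop quietly, since an expired search context would otherwise look just like "no more documents".

```csharp
using System;
using System.Collections.Generic;
using Nest;

// Placeholder document type (hypothetical); replace with your own mapping class.
public class XYZ { }

public static class ScrollExample
{
    public static List<XYZ> ReadAll(IElasticClient client, string indexName)
    {
        // Three minutes is an arbitrary starting point; tune it for your workload.
        var scrollTimeout = new Time(TimeSpan.FromMinutes(3));
        var results = new List<XYZ>();

        // The initial search opens the scroll context and returns the first page.
        var response = client.Search<XYZ>(s => s
            .Index(indexName)
            .Size(100)
            .MatchAll()
            .Scroll(scrollTimeout));

        if (!response.IsValid)
            throw new Exception(response.DebugInformation);

        var scrollId = response.ScrollId;
        try
        {
            // An empty page from a valid response means the scroll is exhausted.
            while (response.Documents.Count > 0)
            {
                results.AddRange(response.Documents);

                // Each scroll request renews the context for another scrollTimeout.
                response = client.Scroll<XYZ>(scrollTimeout, scrollId);

                // Surface errors (for example an expired context) instead of
                // treating them as the end of the data.
                if (!response.IsValid)
                    throw new Exception(response.DebugInformation);

                scrollId = response.ScrollId;
            }
        }
        finally
        {
            // Release the server-side search context as soon as we are done.
            client.ClearScroll(new ClearScrollRequest(scrollId));
        }

        return results;
    }
}
```

If the exception fires partway through, the message in DebugInformation should show whether the search context expired, which would confirm that the timeout was the cause.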