2

I have an ES DB storing history records from a process I run every day. Because I want to show only 20 records per page in the history (ordered by date), I was using pagination (size + from_) combined with scroll, which worked just fine. But when I wanted to use sort in the query, it didn't work, so I concluded that scroll and sort don't work together. Looking for an alternative, I tried the ES helper scan, which works fine for scrolling and sorting the results, but with this solution pagination doesn't seem to work. I don't understand why, since the API says that scan sends all the parameters to the underlying search function. So my question is whether there is any method to combine the three options.

Thanks,

Ruben

2 Answers

5

When using the elasticsearch.helpers.scan function, you need to pass preserve_order=True to enable sorting.

(Tested using elasticsearch==7.5.1)
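
As a rough sketch of how this might fit the 20-records-per-page history view, the snippet below streams sorted hits from scan and slices a single page out of that stream. The index name `history`, the `date` field, and the helper names `build_history_query`, `page_of`, and `fetch_page` are assumptions made for illustration; they do not come from the question.

```python
from itertools import islice


def build_history_query(date_field="date"):
    # Hypothetical query body: match everything, newest records first.
    # The field name "date" is an assumption about the index mapping.
    return {
        "query": {"match_all": {}},
        "sort": [{date_field: {"order": "desc"}}],
    }


def page_of(hits, page, page_size=20):
    # Take one zero-based page out of any iterable of hits.
    # Earlier hits are consumed and discarded to reach the requested page.
    start = page * page_size
    return list(islice(hits, start, start + page_size))


def fetch_page(client, index="history", page=0, page_size=20):
    # Imported here so the helpers above work without the client installed.
    from elasticsearch.helpers import scan

    hits = scan(
        client,
        query=build_history_query(),
        index=index,
        preserve_order=True,  # without this, scan does not keep the sort order
    )
    return page_of(hits, page, page_size)
```

Because the slice has to read every hit before the requested page, this is only reasonable for shallow pages; for deep pages the cost grows with the page number.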


1 Comment

The documentation for this argument says that "this can be an extremely expensive operation and can easily lead to unpredictable results".
1

Yes, you can combine scroll with sort, but when you sort on a string field you need to change the mapping for it to work properly. From the documentation:

In order to sort on a string field, that field should contain one term only: the whole not_analyzed string. But of course we still need the field to be analyzed in order to be able to query it as full text.

The naive approach to indexing the same string in two ways would be to include two separate fields in the document: one that is analyzed for searching, and one that is not_analyzed for sorting.

"tweet": { 
    "type":     "string",
    "analyzer": "english",
    "fields": {
        "raw": { 
            "type":  "string",
            "index": "not_analyzed"
        }
    }
}
  • The main tweet field is just the same as before: an analyzed full-text field.
  • The new tweet.raw subfield is not_analyzed.

Now, or at least as soon as we have reindexed our data, we can use the tweet field for search and the tweet.raw field for sorting:

GET /_search
{
    "query": {
        "match": {
            "tweet": "elasticsearch"
        }
    },
    "sort": "tweet.raw"
}
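
For comparison, here is a minimal sketch of the same sorted search combined with plain from/size pagination and no scroll, built as a request body in Python. The helper name `sorted_page_body` and the 20-per-page default are assumptions for illustration; the `tweet` and `tweet.raw` fields are the ones from the mapping above.

```python
def sorted_page_body(text, page, page_size=20):
    # Build a search body that matches on the analyzed field, sorts on the
    # not_analyzed subfield, and asks for one zero-based page of results.
    return {
        "query": {"match": {"tweet": text}},
        "sort": ["tweet.raw"],
        "from": page * page_size,
        "size": page_size,
    }


# Possible usage with the 7.x Python client (index name is hypothetical):
# client.search(index="my_index", body=sorted_page_body("elasticsearch", page=0))
```

Note that from + size is capped by the index's `index.max_result_window` setting (10,000 by default), so this approach only covers the first few hundred pages at 20 records each.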

6 Comments

No I can't. If I use the scroll command the results I get are not sorted. Also I am not sorting on a string field.
Did you try changing the field mapping to not_analyzed? What type of field are you sorting on?
Yes, I tried changing it to not_analyzed, but it still doesn't work. I am filtering by date.
What do you see in the mapping? Is it {"type":"date","format":"dateOptionalTime"}?
Yes it is. But I don't think the problem is with the format of the field I am sorting with but the combination of sort, scroll and pagination. I can sort if I don't use scroll and I can use pagination with search but not when ordering...