Handling deep pagination in Elasticsearch with search_after: an introduction
Original source: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-request-search-after.html
Search After
Pagination of results can be done by using the from and size parameters, but the cost becomes prohibitive when deep pagination is reached. The index.max_result_window setting, which defaults to 10,000, is a safeguard: search requests take heap memory and time proportional to from + size. The search_after parameter circumvents this problem by providing a live cursor. The idea is to use the results from the previous page to help the retrieval of the next page.
Suppose that the query to retrieve the first page looks like this:
GET twitter/tweet/_search
{
    "size": 10,
    "query": {
        "match": {
            "title": "elasticsearch"
        }
    },
    "sort": [
        {"date": "asc"},
        {"_uid": "desc"}
    ]
}
A field with one unique value per document should be used as the tiebreaker of the sort specification. Otherwise the sort order for documents that have the same sort values would be undefined. The recommended way is to use the field _uid, which is certain to contain one unique value for each document.
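To see why the tiebreaker matters, here is a minimal Python sketch that mimics the sort specification above on a few in-memory documents. The documents and the helper sort_hits are made up for illustration; this is only a model of the resulting order, not how Elasticsearch sorts internally.

```python
# Hypothetical in-memory documents; two of them share the same date.
docs = [
    {"_uid": "tweet#1", "date": 100},
    {"_uid": "tweet#3", "date": 200},
    {"_uid": "tweet#2", "date": 200},
    {"_uid": "tweet#4", "date": 300},
]


def sort_hits(hits):
    """Order hits by date ascending, then _uid descending (the tiebreaker)."""
    # Python's sort is stable, so sort by the secondary key first,
    # then by the primary key.
    by_uid = sorted(hits, key=lambda d: d["_uid"], reverse=True)
    return sorted(by_uid, key=lambda d: d["date"])


ordered = [d["_uid"] for d in sort_hits(docs)]
print(ordered)
```

Because _uid is unique, the two documents with date 200 always come out in the same relative order, whatever order they were stored in. Without the second sort key, their order would depend on the input order.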
The result from the above request includes an array of sort values for each document. These sort values can be used in conjunction with the search_after parameter to start returning results "after" any document in the result list. For instance, we can use the sort values of the last document and pass them to search_after to retrieve the next page of results:
GET twitter/tweet/_search
{
    "size": 10,
    "query": {
        "match": {
            "title": "elasticsearch"
        }
    },
    "search_after": [1463538857, "tweet#654323"],
    "sort": [
        {"date": "asc"},
        {"_uid": "desc"}
    ]
}
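On the client side, deriving the next request from the previous response could look like the following Python sketch. The helper next_page_body and the hand-written fake_response are assumptions for illustration; the only parts taken from Elasticsearch are the request body shown above and the sort array carried by each hit under hits.hits.

```python
import copy


def next_page_body(body, response):
    """Return the request body for the page after `response`, or None if done."""
    hits = response["hits"]["hits"]
    if not hits:
        return None  # an empty page means there is nothing left to fetch
    nxt = copy.deepcopy(body)  # leave the caller's body untouched
    nxt.pop("from", None)  # drop any offset; the example requests omit it
    nxt["search_after"] = hits[-1]["sort"]  # sort values of the last hit
    return nxt


first_body = {
    "size": 10,
    "query": {"match": {"title": "elasticsearch"}},
    "sort": [{"date": "asc"}, {"_uid": "desc"}],
}
# Hand-written stand-in for a real response, trimmed to the fields used here.
fake_response = {
    "hits": {"hits": [{"_id": "654323", "sort": [1463538857, "tweet#654323"]}]}
}

print(next_page_body(first_body, fake_response))
```

The sort specification has to stay identical between pages, which is why the sketch copies the whole body and only adds search_after.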
The parameter from must be set to 0 (or -1) when search_after is used.
search_after is not a solution to jump freely to a random page but rather to scroll many queries in parallel. It is very similar to the scroll API, but unlike it, the search_after parameter is stateless: it is always resolved against the latest version of the searcher. For this reason the sort order may change during a walk depending on the updates and deletes of your index.
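The stateless behaviour can be illustrated with a toy in-memory model in Python. Everything here (the search function, the index list, the tweet identifiers) is hypothetical and only imitates a search sorted by date ascending and _uid descending; it is a sketch of the cursor semantics, not of Elasticsearch itself.

```python
def is_after(doc, cursor):
    """True if doc sorts strictly after the cursor (date asc, _uid desc)."""
    date, uid = cursor
    return doc["date"] > date or (doc["date"] == date and doc["_uid"] < uid)


def search(docs, size, search_after=None):
    """Toy stand-in for a sorted search; re-evaluated on every call."""
    by_uid = sorted(docs, key=lambda d: d["_uid"], reverse=True)
    ordered = sorted(by_uid, key=lambda d: d["date"])  # stable sort
    if search_after is not None:
        ordered = [d for d in ordered if is_after(d, search_after)]
    return [{"_uid": d["_uid"], "sort": [d["date"], d["_uid"]]}
            for d in ordered[:size]]


# Five hypothetical documents; dates are 100, 100, 101, 101, 102.
index = [{"_uid": f"tweet#{i}", "date": 100 + i // 2} for i in range(5)]

page1 = search(index, 2)

# The index changes between two page requests.
index.append({"_uid": "tweet#9", "date": 99})   # sorts before the cursor
index.append({"_uid": "tweet#8", "date": 101})  # sorts after the cursor

page2 = search(index, 3, search_after=page1[-1]["sort"])
print([h["_uid"] for h in page1])
print([h["_uid"] for h in page2])
```

In this model the document added before the cursor position is never returned during the walk, while the one added after it shows up on the next page. A scroll, by contrast, would keep serving the snapshot taken when it was opened.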