[Paginating ClusterManager Read APIs] Paginate _cat/shards API. #14257

Closed
gargharsh3134 opened this issue Jun 13, 2024 · 2 comments · Fixed by #14641
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request v2.17.0


gargharsh3134 commented Jun 13, 2024

Is your feature request related to a problem? Please describe

As the number of shards in an OpenSearch cluster grows, the response size and latency of the default _cat/shards API increase. This makes such large responses difficult for clients to consume, and it also stresses the cluster, which has to accumulate stats across all the shards.

Pagination would limit the size of the response for a single query and also spare the cluster from accumulating shard stats for all the shards at once. This issue tracks the approaches that can be used to paginate the response.

Describe the solution you'd like

Drawing inspiration and extending the approach called out in: #14258

A new _list/shards API would be introduced.

Paginating the response requires a pagination key for which a deterministic order is maintained, or can be generated, in the cluster. A deterministic order is needed so that a new page can start from the point where the last page left off. Index creation timestamps will therefore be used as pagination keys.

Overview

Each index has a creation timestamp stored in IndexMetadata, which is part of the Metadata object of ClusterState. These creation timestamps can act as the sort/pagination key: a list of indices sorted by creation timestamp is generated, and that sorted list is then used to prepare the list of shards sent in each response, as per the page size.
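
The ordering step described above can be sketched as follows. This is only an illustration in Python, not the proposed Java implementation: `IndexInfo` is a hypothetical stand-in for IndexMetadata, and breaking timestamp ties by index name is an assumption that the issue does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical stand-in for IndexMetadata (not an OpenSearch class)."""
    name: str
    creation_ts: int      # index creation timestamp, epoch millis
    shard_copies: tuple   # one label per shard copy, e.g. "0p", "0r"


def shards_in_creation_order(indices, descending=False):
    """List (index name, shard copy) pairs ordered by index creation time.

    Ties on the timestamp are broken by index name (an assumption) so that
    the order stays deterministic.
    """
    ordered = sorted(
        indices,
        key=lambda idx: (idx.creation_ts, idx.name),
        reverse=descending,
    )
    return [(idx.name, copy) for idx in ordered for copy in idx.shard_copies]
```

Because the sort key comes from cluster metadata rather than from a position in a list, the same order can be regenerated on every request.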

Proposed User Experience

New API Path/URL:

curl "localhost:9200/_list/shards?___"
curl "localhost:9200/_list/shards/{indices}?___" where {indices} is a comma-separated list of indices.

New Query Parameters:

| Parameter Name | Type | Default Value | Description |
|---|---|---|---|
| next_token | String | null | Token to be used for fetching the next page. |
| size | Integer | 2000 | Maximum number of shards that can be returned in a single page. The default value also serves as the minimum value that users can define. Specifying a value less than the default value would result in an Illegal_Argument_Exception. |
| sort | String/ENUM | asc | Order of shards in a page. Allowed values will be "asc"/"desc". If "desc", the most recently created shards would be displayed first. |

Sample Query -> curl "localhost:9200/_list/shards?next_token=<nextToken>&size=20000&sort=asc"
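
As a rough client-side illustration of these query parameters, the sketch below assembles a request URL. The helper name `list_shards_url` and the base URL are hypothetical. The minimum of 2000 follows the parameter table, and the check is done on the client only for illustration.

```python
from urllib.parse import urlencode

MIN_PAGE_SIZE = 2000  # default and minimum size, as given in the parameter table


def list_shards_url(base="http://localhost:9200", indices=None,
                    next_token=None, size=MIN_PAGE_SIZE, sort="asc"):
    """Build a _list/shards URL (hypothetical helper, not part of any client)."""
    if size < MIN_PAGE_SIZE:
        raise ValueError(f"size must be at least {MIN_PAGE_SIZE}")
    if sort not in ("asc", "desc"):
        raise ValueError('sort must be "asc" or "desc"')
    path = "/_list/shards"
    if indices:
        path += "/" + ",".join(indices)
    params = {"size": size, "sort": sort}
    if next_token is not None:
        # Only sent when fetching a page after the first one.
        params["next_token"] = next_token
    return f"{base}{path}?{urlencode(params)}"
```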

New Response Parameters:

| Parameter Name | Type | Description |
|---|---|---|
| next_token | String | to be used for fetching the next page |

Note: The next_token would be Base64 encoded.
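
For illustration, Base64 encoding and decoding of a token could look like the sketch below. The issue does not specify what the decoded token contains, so the helper names are hypothetical and the payload is treated as an opaque string. The sample token `MCQw` from the plain text example in this issue decodes to `0$0`.

```python
import base64


def encode_token(raw: str) -> str:
    """Base64-encode an opaque token payload."""
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


def decode_token(token: str) -> str:
    """Decode a Base64 token, rejecting characters outside the alphabet."""
    return base64.b64decode(token, validate=True).decode("utf-8")


print(decode_token("MCQw"))  # prints 0$0
```

Clients should not depend on the decoded contents. The token is meant to be passed back unchanged to fetch the next page.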

New Response Formats:

format=JSON: next_token and shards will be new keys of the JSON response object.


{
  "next_token" : "nextToken",
  "shards" : [{
      "index" : "test-ind",
      "shard" : "0",
      "prirep" : "r",
      "state" : "STARTED",
      "docs" : "0",
      "store" : "208b",
      "ip" : "127.0.0.1",
      "node" : "data4"
    },
    {
      "index" : "test-ind",
      "shard" : "0",
      "prirep" : "p",
      "state" : "STARTED",
      "docs" : "0",
      "store" : "208b",
      "ip" : "127.0.0.1",
      "node" : "data1"
    }]
}
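
A client could consume this response shape by following next_token from page to page. The sketch below assumes a hypothetical `fetch_page` callable that performs the HTTP request and returns the parsed JSON object. It also assumes that the last page carries a null or missing next_token, which the issue does not state explicitly.

```python
def iter_shards(fetch_page, size=2000, sort="asc"):
    """Yield every shard entry, following next_token across pages.

    fetch_page(next_token=..., size=..., sort=...) is a hypothetical callable
    returning a dict with "shards" and "next_token" keys.
    """
    token = None
    while True:
        page = fetch_page(next_token=token, size=size, sort=sort)
        yield from page.get("shards", [])
        token = page.get("next_token")
        if not token:
            # Assumption: no token means there are no further pages.
            break
```

Injecting `fetch_page` keeps the loop independent of any particular HTTP client.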

Plain text format (or table format): next_token will be the last row of the table.

test-index 1 p STARTED 0 208b 127.0.0.1 data1
test-index 1 r STARTED 0 208b 127.0.0.1 data5
test-index 1 r STARTED 0 208b 127.0.0.1 data4
test-index 1 r STARTED 0 208b 127.0.0.1 data3
next_token MCQw
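
Since the token arrives as the last row in this format, a client has to separate it from the shard rows. A minimal parsing sketch, assuming whitespace-separated columns and a final row of the form `next_token <value>` (the function name is hypothetical):

```python
def parse_plain_page(text):
    """Split a plain text page into shard rows and the trailing next_token."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    token = None
    # The token row has at most two fields, unlike a full shard row.
    if lines and lines[-1][0] == "next_token" and len(lines[-1]) <= 2:
        last = lines.pop()
        token = last[1] if len(last) == 2 else None
    return lines, token
```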

Proposed Pagination Behaviour

Note: In the following points, "newly created indices" refers to indices that get created while the paginated queries are being executed.

  1. The number of shards in a page will always be less than or equal to the user-provided size query parameter iff the user-provided value is greater than the default value (10k), i.e. page_size = max(userProvidedMaxPageSize, defaultMaxPageSize).
    Given that the shardRoutings for a shardID do NOT have any unique identifiers, it is difficult to define a strategy for starting the next page from the point where the last page left off in case the shards corresponding to a shardID get split across pages. So, the shards for a shardID should NOT split/span across pages and need to be displayed in a single page. This limitation inherently imposes a restriction on pageSize, i.e. the minimum pageSize should always be greater than the maximum number of replicas across all the indices in the cluster.
    min(max_page_size) = max(#replicasOfIndex1, #replicasOfIndex2, #replicasOfIndex3, ....).
    With this restriction on max_page_size, it is proposed to set a high default max_page_size and to use it in case the user-provided value is less than it.

  2. Whether shards of newly created indices are displayed will depend on the requested sort type.
    If sort is specified as "asc", the newly created shards will be shown at the rear end of the response pages. However, if sort is specified as "desc", the subsequent pages will only contain shards that are older than the ones already displayed, so newly created shards will be filtered out.

  3. Any shard of an index yet to be displayed will NOT be part of the response if it is deleted.
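
The three behaviours above can be sketched together. This is a simplified model under stated assumptions, not the proposed ShardPaginationStrategy: `ShardCopy`, `group_key` and `next_page` are hypothetical names, the token is modelled as the (creation timestamp, index name, shard ID) key of the last shard ID shown, and a page size too small to hold all copies of one shard ID is rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardCopy:
    """Hypothetical model of one shard copy (not an OpenSearch class)."""
    index: str
    creation_ts: int  # creation timestamp of the owning index
    shard_id: int
    prirep: str       # "p" for primary, "r" for replica


def group_key(copy):
    """Key shared by all copies of one shard ID."""
    return (copy.creation_ts, copy.index, copy.shard_id)


def next_page(copies, size, after=None, descending=False):
    """Return (page, token); token is None once nothing is left.

    `after` is the key of the last shard ID already displayed. All copies
    of a shard ID always land in the same page.
    """
    groups = {}
    for copy in copies:
        groups.setdefault(group_key(copy), []).append(copy)
    keys = sorted(groups, reverse=descending)
    if after is not None:
        # Resume strictly past the last displayed key in the sort direction.
        keys = [k for k in keys if (k < after if descending else k > after)]
    page, last = [], None
    for key in keys:
        group = groups[key]
        if len(group) > size:
            raise ValueError("size is smaller than the copies of one shard ID")
        if len(page) + len(group) > size:
            return page, last  # page is full and more shard IDs remain
        page.extend(group)
        last = key
    return page, None
```

Because the token is a key rather than an offset, an index created mid-scan sorts after every displayed key for "asc" and is excluded by the comparison for "desc", while a deleted index simply no longer contributes any keys.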

Implementation Details:

The implementation would extend and implement the classes and interfaces called out under #14258, and introduce a new ShardPaginationStrategy class that would encompass the core logic to generate pages of shards.

Related component

Cluster Manager

Describe alternatives you've considered

Pagination key is NodeID.

The idea here would be to respond with all the shards on a set of nodes in a single page.

The concern with having NodeID as the pagination key is that it is not agnostic to index creations or cluster re-balancing activities, which could happen while the queries are being executed.

Additional context

No response

@gargharsh3134 gargharsh3134 added enhancement Enhancement or improvement to existing feature or request untriaged labels Jun 13, 2024
@rwali-aws rwali-aws moved this from 🆕 New to Now(This Quarter) in Cluster Manager Project Board Jun 20, 2024
@dblock dblock removed the untriaged label Jul 1, 2024

dblock commented Jul 1, 2024

[Catch All Triage - Attendees 1, 2, 3, 4, 5]

@backslasht
Contributor

It makes sense not to have NodeId as the pagination key. But, can we introduce node id as filter in the API?

Projects
Status: ✅ Done
Status: 2.17 (First RC 09/03, Release 09/17)