Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sort and pagination support for QueryApiKey API #76144

Merged
merged 14 commits into from
Aug 13, 2021

Conversation

ywangd
Copy link
Member

@ywangd ywangd commented Aug 5, 2021

This PR adds support for sort and pagination similar to those used with
regular search API. Similar to the query field, the sort field also
supports only a subset of what is available for regular search.

Relates: #71023

This PR adds support for sort and pagination similar to those used with
regular search API. Similar to the query field, the sort field also
supports only a subset of what is available for regular search.
@ywangd ywangd added >enhancement :Security/Security Security issues without another label v7.15.0 v8.0.0-alpha1 labels Aug 5, 2021
@ywangd ywangd requested a review from tvernum August 5, 2021 07:00
@elasticmachine elasticmachine added the Team:Security Meta label for security team label Aug 5, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

private boolean filterForCurrentUser;

public QueryApiKeyRequest() {
this((QueryBuilder) null);
}

public QueryApiKeyRequest(QueryBuilder queryBuilder) {
this.queryBuilder = queryBuilder;
this(queryBuilder, -1, -1, null, null);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default size is -1, which means it will use the default value from regular search, ie. 10. We could bump this to a higher value, but I think it also has value to be consistent with the default search behaviour.

Comment on lines 60 to 62
.field("total", total)
.field("count", foundApiKeysInfo.length)
.array("api_keys", (Object[]) foundApiKeysInfo);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add a sort field here in the response because adding it would make the response look less pleasing, i.e. it would be

{ "api_key": [ {"api_key":{...}, "sort":[...]}, {} ] }

instead of just
{ "api_key": [ {...}, {...} ] }

When sort is not required, the more nested version looks even more unnecessary. Also, unlike regular search response, all sort values are present in the response (api_key) already since there will be no filtering. So the information can be readliy parsed from each api_key item.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realised there is one minor caveat of not showing sort values: They can have different formats than the original value. For example, creation_time can be formatted to be more human readable. So instead of

"sort": [ 1626237059434 ]

it will be the following more human readable format

"sort" : [ "2021-07-14T04:30:59.434Z" ]

Note this difference does not seem to affect how search_after works. That is, you can pass specify either "search_after": [1626237059434] or "search_after": ["2021-07-14T04:30:59.434Z"] and they both work.

In general, while the human readable format of sort values is nice to have, I don't think it is worth to complicate the response especially since search_after does not get impacted.

@tvernum
Copy link
Contributor

tvernum commented Aug 13, 2021

One question that probably should have been raised earlier in the process:

  • How are we ensuring that there is a tie breaker in the sort? (Or at least that caller can include one). I think the only guaranteed unique value is id and that's not a valid sort option

@ywangd
Copy link
Member Author

ywangd commented Aug 13, 2021

One question that probably should have been raised earlier in the process:

  • How are we ensuring that there is a tie breaker in the sort? (Or at least that caller can include one). I think the only guaranteed unique value is id and that's not a valid sort option

This is a good question. Thanks for bringing this up. I didn't support id for sorting because (1) It didn't feel useful to sort by id alone (I didn't think about tie breaker) and (2) The field id is not allowed for the query part.

I think we have a few options to consider:

  1. Support _id for sorting. This name (with leading underscore) is more consistent with the IdsQuery used in the Query DSL. The downsides are: (1) Inconsisent with the response where the field name id is used; (2) Needs deprecation and translation (similar to IdsQuery) once id becomes an actual field for ApiKey document.
  2. Support id for sorting (without the leading underscore). It is more consisent with the response format, but raises the question why the same field is not supported in the Query DSL.
  3. Support id for both sorting and query. This means replacing IdsQuery with TermQuery and TermsQuery. For case-insensitive TermQuery and other query types that cannot be translate into IdsQuery, we can just throw errors and document it.

My preference is Option 2 because:

  • It is simple enough
  • We can document the limitation and it is likely less to document compare to Option 3
  • It is closer to the future state than Option 1, i.e. id becomes an actual field.

Option 3 is a close call. It has the benefit of not using IdsQuery. If we value this more, I am happy go with Option 3 as well. What do you think?

@ywangd
Copy link
Member Author

ywangd commented Aug 13, 2021

Also it is unfortunate that we cannot use runtime field because it requires search.allow_expensive_queries. Otherwise a runtime mapping like

{
  "runtime_mappings": {
    "id": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['_id'].value)"
      }
    }
  },
  ...
}

can easily solve the dilemma here. I remember there were previous asks about whether it is possible to always allow expensive queries for internal actions regardless of search.allow_expensive_queries. I don't recall any followup discussions but I guess it is probably unlikely to fly.

@tvernum
Copy link
Contributor

tvernum commented Aug 13, 2021

Are we sure we want to support sorting on _id anyway?

The latest docs no longer have this section, but the pre-PIT docs for search_after explicitly advise against it
https://www.elastic.co/guide/en/elasticsearch/reference/7.9/paginate-search-results.html#search-after

@ywangd
Copy link
Member Author

ywangd commented Aug 13, 2021

Are we sure we want to support sorting on _id anyway?

The latest docs no longer have this section, but the pre-PIT docs for search_after explicitly advise against it
https://www.elastic.co/guide/en/elasticsearch/reference/7.9/paginate-search-results.html#search-after

Errr Great infomation. It is also documented here. In this case, no, we should not sort on it.

PIT always needs tiebreaker and for that it has a concept of _shard_doc, which is a combination of shard index and document number (_doc). It can only be used with PIT so not applicable here. However since .security only has a single primary shard, _doc alone should be sufficient as a tiebreaker. _doc is also the most efficient way of sorting according to the documentation.

So supporting _doc for sorting seems to be the best option. We can either choose to support it explicitly, i.e. user can specify it in request, or always implicit append it to any sort (I prefer explicilt support because the lengths of sort values and search_after match the length of sort fields). Since _doc is not part of the response, I think this means we'll have to show sort values for each API key. To avoid extra nested layer, how about:

{ 
  "api_key": [ 
    {
      "id": "xxx",
      "name": "xxx",
      "sort": [ ...]
    } 
  ]
}

@tvernum
Copy link
Contributor

tvernum commented Aug 13, 2021

Another option is to just return the sort values of the item so it would be:

{
  "total": 700,
  "count": 100,
  "api_keys": [ ... ],
  "sort": [ "name123", 100 ]
}

but it's a bit weird to be different for the sake of being different ...

@tvernum
Copy link
Contributor

tvernum commented Aug 13, 2021

My preference would probably be:

{ 
  "api_keys": [ 
    {
      "id": "xxx",
      "name": "xxx",
      "_sort": [ ...]
    } 
  ]
}

but I'm open to the other options too.

@ywangd
Copy link
Member Author

ywangd commented Aug 13, 2021

I am ok with _sort as well. I suggested sort because that's how it is named in the response of regular search. But I think it's ok to be different here since the whole response is quite different already anyway.

@ywangd
Copy link
Member Author

ywangd commented Aug 13, 2021

@tvernum Thanks for the comments. The PR is now updated based on our discussion:

  • It now support sort on _doc.
  • The sort values are included in the response as {"api_keys": [ {"id":"...", "name":"...", "_sort": [...], ...}, ...] }

final int from = randomIntBetween(0, 3);
final int size = randomIntBetween(2, 5);
final int remaining = total - from;
final String sortField = randomFrom("name", "creation");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doens't this assume that no to API keys have the same creation time?
Are we sure that's always going to be true in this test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added _doc as a 2nd sort field. Since keys are created sequential and with wait_for, I don't think it is possible to have the same creation time. But it's a good place to exercise _doc some more and is more future proof (e.g. keys may be created in parallel).

@ywangd ywangd merged commit 245ba38 into elastic:master Aug 13, 2021
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Aug 13, 2021
This PR adds support for sort and pagination similar to those used with
regular search API. Similar to the query field, the sort field also
supports only a subset of what is available for regular search.
ywangd added a commit that referenced this pull request Aug 14, 2021
This PR adds support for sort and pagination similar to those used with
regular search API. Similar to the query field, the sort field also
supports only a subset of what is available for regular search.
ywangd added a commit that referenced this pull request Aug 17, 2021
This PR adds HLRC for the new Query API key API added with #75335 and #76144

Relates: #71023
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Aug 17, 2021
This PR adds HLRC for the new Query API key API added with elastic#75335 and elastic#76144

Relates: elastic#71023
ywangd added a commit that referenced this pull request Aug 18, 2021
This PR adds HLRC for the new Query API key API added with #75335 and #76144

Relates: #71023
ywangd added a commit that referenced this pull request Aug 19, 2021
This PR adds documentation for the new QueryApiKey API added in #75335 and #76144

Relates: #71023
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Aug 19, 2021
This PR adds documentation for the new QueryApiKey API added in elastic#75335 and elastic#76144

Relates: elastic#71023
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Aug 19, 2021
This PR adds documentation for the new QueryApiKey API added in elastic#75335 and elastic#76144

Relates: elastic#71023
elasticsearchmachine pushed a commit that referenced this pull request Aug 19, 2021
This PR adds documentation for the new QueryApiKey API added in #75335 and #76144

Relates: #71023
elasticsearchmachine pushed a commit that referenced this pull request Aug 19, 2021
This PR adds documentation for the new QueryApiKey API added in #75335 and #76144

Relates: #71023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Security/Security Security issues without another label Team:Security Meta label for security team v7.15.0 v8.0.0-alpha2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants