Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support pit and search_after in server savedObjects.find #89915

Merged
merged 37 commits into from
Feb 11, 2021

Conversation

lukeelmers
Copy link
Member

@lukeelmers lukeelmers commented Feb 1, 2021

Closes #86301

Summary

Adds support for both pit and search_after to Saved Objects on the server, enabling folks to reliably page through sets of results even if exceeding the index.max_result_window (or more accurately, if exceeding Kibana's maxImportExportSize config value).

Changes

  • Adds searchAfter option to SO.find
  • Adds pit option to SO.find
  • Adds pit_id to SearchResponse
  • Adds openPointInTimeForType to SO client, which allows you to pass a type (or list of types), and have a PIT opened against the index/indices backing the type(s)
  • Also adds closePointInTime for cleaning up a PIT via SO client (without requiring you to use the Elasticsearch service)
  • Updates security/spaces/audit logging to support openPointInTimeForType and closePointInTime
  • Integrates pit/searchAfter with SO exports, so you can export objects that exceed the max_result_window default of 10000.
  • Adds a PointInTimeFinder which returns a generator to make paging through results easier. Plan is to eventually expose this as a public helper.
  • Update sort value with a tiebreaker field

Usage

With find

   const { id } = await savedObjectsClient.openPointInTimeForType(
     type: 'visualization',
     { keepAlive: '5m' },
   );

   const page1 = await savedObjectsClient.find({
     type: 'visualization',
     sortField: 'updated_at',
     sortOrder: 'asc',
     pit: { id },
   });
  
   const lastHit = page1.saved_objects[page1.saved_objects.length - 1];

   const page2 = await savedObjectsClient.find({
     type: 'visualization',
     sortField: 'updated_at',
     sortOrder: 'asc',
     pit: { id: page1.pit_id, keepAlive: '2m' },
     searchAfter: lastHit.sort,
   });
  
   await savedObjectsClient.closePointInTime(page2.pit_id);

Finder helper

 const findOptions: SavedObjectsFindOptions = {
   type: 'visualization',
   search: 'foo*',
   perPage: 100,
 };

 const finder = createPointInTimeFinder({
   logger,
   savedObjectsClient,
   findOptions,
 });

 const responses: SavedObjectFindResponse[] = [];
 for await (const response of finder.find()) {
   responses.push(...response);
   if (doneSearching) {
     await finder.close();
   }
 }

Follow-up tasks (will create a separate issue)

  • Documentation for maxImportExportSize
  • Update validation of maxImportExportSize to be number instead of bytes (which is incorrect)
  • Mention the config option or link to the docs from the error message https://github.com/elastic/kibana/blob/master/src/core/server/saved_objects/export/errors.ts#L30
  • Now that we support > 10k exports consider bumping default max to a larger size, as long as it doesn't consume so much memory that Kibana becomes unstable. It will depend a lot on the size of the specific saved objects, but maybe we can test if e.g. 20k objects would really be a problem.
  • Expose find w/ PIT generator somewhere as a public helper (perhaps on SO client directly)

Copy link
Contributor

@pgayvallet pgayvallet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments after taking a first look

@lukeelmers lukeelmers self-assigned this Feb 4, 2021
@lukeelmers lukeelmers force-pushed the feat/so-find-after branch 3 times, most recently from c3dfcf1 to 5eb5577 Compare February 7, 2021 04:55
@lukeelmers lukeelmers added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Feb 8, 2021
@lukeelmers lukeelmers marked this pull request as ready for review February 8, 2021 04:31
@lukeelmers lukeelmers requested review from a team as code owners February 8, 2021 04:31
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@lukeelmers
Copy link
Member Author

@elasticmachine merge upstream

(most of the functional test failures were due to #88737 not being merged yet)

Copy link
Member

@legrego legrego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass through, mostly questions and nits

Copy link
Contributor

@streamich streamich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AppServices change LGTM.

@lukeelmers lukeelmers changed the title [core.savedObjects] Support pit and search_after in savedObjects.find [core.savedObjects] Support pit and search_after in savedObjects.find on the server Feb 11, 2021
@lukeelmers
Copy link
Member Author

SavedObject expert ™️

BRB, need to go get a tattoo. 😄

@lukeelmers lukeelmers changed the title [core.savedObjects] Support pit and search_after in savedObjects.find on the server Support pit and search_after in server savedObjects.find Feb 11, 2021
@lukeelmers lukeelmers enabled auto-merge (squash) February 11, 2021 17:54
@lukeelmers
Copy link
Member Author

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@lukeelmers lukeelmers merged commit 7fc3d12 into elastic:master Feb 11, 2021
@lukeelmers lukeelmers deleted the feat/so-find-after branch February 11, 2021 19:58
@kibanamachine
Copy link
Contributor

Backport result

{"level":"info","message":"POST https://api.github.com/graphql (status: 200)"}
{"level":"info","message":"POST https://api.github.com/graphql (status: 200)"}
{"meta":{"labels":["Feature:Saved Objects","Team:Core","auto-backport","release_note:enhancement","v7.12.0","v8.0.0"],"branchLabelMapping":{"^v8.0.0$":"master","^v7.12.0$":"7.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"},"existingTargetPullRequests":[]},"level":"info","message":"Inputs when calculating target branches:"}
{"meta":["7.x"],"level":"info","message":"Target branches inferred from labels:"}
{"meta":{"killed":false,"code":2,"signal":null,"cmd":"git remote rm kibanamachine","stdout":"","stderr":"error: No such remote: 'kibanamachine'\n"},"level":"info","message":"exec error 'git remote rm kibanamachine':"}
{"meta":{"killed":false,"code":2,"signal":null,"cmd":"git remote rm elastic","stdout":"","stderr":"error: No such remote: 'elastic'\n"},"level":"info","message":"exec error 'git remote rm elastic':"}
{"level":"info","message":"Backporting [{\"sourceBranch\":\"master\",\"targetBranchesFromLabels\":[\"7.x\"],\"sha\":\"7fc3d125bf569636cd4cc45ac9b73ffb8c8733e3\",\"formattedMessage\":\"Support `pit` and `search_after` in server `savedObjects.find` (#89915)\",\"originalMessage\":\"Support `pit` and `search_after` in server `savedObjects.find` (#89915)\",\"pullNumber\":89915,\"existingTargetPullRequests\":[]}] to 7.x"}

Backporting to 7.x:
{"level":"info","message":"Backporting via filesystem"}
{"meta":{"killed":false,"code":1,"signal":null,"cmd":"git cherry-pick 7fc3d125bf569636cd4cc45ac9b73ffb8c8733e3","stdout":"Auto-merging test/api_integration/apis/saved_objects/resolve_import_errors.ts\nAuto-merging test/api_integration/apis/saved_objects/export.ts\nAuto-merging src/plugins/data/server/server.api.md\nAuto-merging src/core/server/server.api.md\nAuto-merging src/core/server/saved_objects/service/lib/repository.ts\nAuto-merging src/core/server/saved_objects/service/lib/repository.test.js\nAuto-merging src/core/public/public.api.md\nAuto-merging docs/development/core/public/kibana-plugin-core-public.doclinksstart.md\nCONFLICT (content): Merge conflict in docs/development/core/public/kibana-plugin-core-public.doclinksstart.md\n","stderr":"error: could not apply 7fc3d125bf5... Support `pit` and `search_after` in server `savedObjects.find` (#89915)\nhint: after resolving the conflicts, mark the corrected paths\nhint: with 'git add <paths>' or 'git rm <paths>'\nhint: and commit the result with 'git commit'\n"},"level":"info","message":"exec error 'git cherry-pick 7fc3d125bf569636cd4cc45ac9b73ffb8c8733e3':"}
{"meta":{"killed":false,"code":2,"signal":null,"cmd":"git --no-pager diff --check","stdout":"docs/development/core/public/kibana-plugin-core-public.doclinksstart.md:20: leftover conflict marker\ndocs/development/core/public/kibana-plugin-core-public.doclinksstart.md:23: leftover conflict marker\ndocs/development/core/public/kibana-plugin-core-public.doclinksstart.md:25: leftover conflict marker\n","stderr":""},"level":"info","message":"exec error 'git --no-pager diff --check':"}
Commit could not be cherrypicked due to conflicts

@bhavyarm
Copy link
Contributor

@lukeelmers how do I test this please? Thanks!

@lukeelmers
Copy link
Member Author

@bhavyarm Ah, sorry - forgot to include testing notes:

  1. From Stack Management > Saved Objects, try exporting saved objects & verify it still works (you can also export using the saved objects _export REST API directly)
  2. In kibana.yml config, set savedObjects.maxImportExportSize to a number that's >10000
  3. Create >10k saved objects (probably easiest to do this via the import api, I have a script that can help with it if you need it -- just let me know).
  4. Try exporting again. With this PR you should be able to export >10k saved objects without issues.
  5. Also verify that if you set savedObjects.maxImportExportSize to something like 10001, and attempt to export 10002 objects, you should get an error. The export size should never be allowed to exceed that setting.

Also worth mentioning that we do have some functional tests here that verify exporting >10k saved objects works.

FrankHassanabad added a commit that referenced this pull request Feb 15, 2022
… from saved objects to exception lists (#125182)

## Summary

Exposes the functionality of
* search_after
* point in time (pit)

From saved objects to the exception lists. This _DOES NOT_ expose these to the REST API just yet. Rather this exposes it at the API level to start with and changes code that had hard limits of 10k and other limited loops. I use the batching of 1k for this at a time as I thought that would be a decent batch guess and I see other parts of the code changed to it. It's easy to change the 1k if we find we need to throttle back more as we get feedback from others.

See this PR where `PIT` and `search_after` were first introduced: #89915
See these 2 issues where we should be using more paging and PIT (Point in Time) with search_after: #93770 #103944

The new methods added to the `exception_list_client.ts` client class are:
* openPointInTime
* closePointInTime
* findExceptionListItemPointInTimeFinder
* findExceptionListPointInTimeFinder
* findExceptionListsItemPointInTimeFinder
* findValueListExceptionListItemsPointInTimeFinder

The areas of functionality that have been changed:
* Exception list exports
* Deletion of lists
* Getting exception list items when generating signals

Note that currently we use our own ways of looping over the saved objects which you can see in the codebase such as this older way below which does work but had a limitation of 10k against saved objects and did not do point in time (PIT)

Older way example (deprecated):
```ts
  let page = 1;
  let ids: string[] = [];
  let foundExceptionListItems = await findExceptionListItem({
    filter: undefined,
    listId,
    namespaceType,
    page,
    perPage: PER_PAGE,
    pit: undefined,
    savedObjectsClient,
    searchAfter: undefined,
    sortField: 'tie_breaker_id',
    sortOrder: 'desc',
  });
  while (foundExceptionListItems != null && foundExceptionListItems.data.length > 0) {
    ids = [
      ...ids,
      ...foundExceptionListItems.data.map((exceptionListItem) => exceptionListItem.id),
    ];
    page += 1;
    foundExceptionListItems = await findExceptionListItem({
      filter: undefined,
      listId,
      namespaceType,
      page,
      perPage: PER_PAGE,
      pit: undefined,
      savedObjectsClient,
      searchAfter: undefined,
      sortField: 'tie_breaker_id',
      sortOrder: 'desc',
    });
  }
  return ids;
```

But now that is replaced with this newer way using PIT:
```ts
  // Stream the results from the Point In Time (PIT) finder into this array
  let ids: string[] = [];
  const executeFunctionOnStream = (response: FoundExceptionListItemSchema): void => {
    const responseIds = response.data.map((exceptionListItem) => exceptionListItem.id);
    ids = [...ids, ...responseIds];
  };

  await findExceptionListItemPointInTimeFinder({
    executeFunctionOnStream,
    filter: undefined,
    listId,
    maxSize: undefined, // NOTE: This is unbounded when it is "undefined"
    namespaceType,
    perPage: 1_000,
    savedObjectsClient,
    sortField: 'tie_breaker_id',
    sortOrder: 'desc',
  });
  return ids;
```

We also have areas of code that has perPage listed at 10k or a constant that represents 10k which this removes in most areas (but not all areas):
```ts
      const items = await client.findExceptionListsItem({
        listId: listIds,
        namespaceType: namespaceTypes,
        page: 1,
        pit: undefined,
        perPage: MAX_EXCEPTION_LIST_SIZE, // <--- Really bad to send in 10k per page at a time
        searchAfter: undefined,
        filter: [],
        sortOrder: undefined,
        sortField: undefined,
      });
```

That is now:
```ts
      // Stream the results from the Point In Time (PIT) finder into this array
      let items: ExceptionListItemSchema[] = [];
      const executeFunctionOnStream = (response: FoundExceptionListItemSchema): void => {
        items = [...items, ...response.data];
      };

      await client.findExceptionListsItemPointInTimeFinder({
        executeFunctionOnStream,
        listId: listIds,
        namespaceType: namespaceTypes,
        perPage: 1_000,
        filter: [],
        maxSize: undefined, // NOTE: This is unbounded when it is "undefined"
        sortOrder: undefined,
        sortField: undefined,
      });
```

Left over areas will be handled in separate PR's because they are in other people's code ownership areas.

### Checklist
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed Feature:Saved Objects release_note:enhancement Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.12.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add search_after support to SO find API using PIT
8 participants