Adapt es-archiver to call `_migrate` endpoint instead of creating a migrator #56971

pgayvallet · 2020-02-06T12:28:18Z

Summary

Due to the fact that the savedObject types can now be registered in New Platform, and because these registrations are now made using programmatic API instead of static descriptors, the way the es-archiver worked until now (by manually parsing the plugin manifest to gather the types and schema, and then creates a migrator to migrate the index) is no longer an option. (See #55860 for full context).

This PR provides a replacement by doing the following:

Expose a new _migrate endpoint as an internal savedObject API.
Modify the es-archiver to call this endpoint instead of manually performing the migration.

Prerequisite for #50309, #50311, #50313

Checklist

Documentation was added for features that require explanation or tutorials
Unit or functional tests were updated or added to match the most common scenarios

For maintainers

This was checked for breaking API changes and was labeled appropriately

elasticmachine · 2020-02-06T12:30:24Z

Pinging @elastic/kibana-platform (Team:Platform)

elasticmachine · 2020-02-07T15:25:10Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 41f56b3

History

💔 Build #25214 failed d72d774
💔 Build #25086 failed ec1a4ab
💔 Build #25067 failed 110187e
💔 Build #24990 failed 110187e
💔 Build #24909 failed b3d6019

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

pgayvallet

comment on the modified data.json.gz files:

The only behavioural difference between the previous and new mechanism is that the server migrator only got knowledge of the types from the enabled plugins, where the embed es-archiver migrator was registering every types to the migrator's schema. This normally has no impact, however some specific test suites included data with migrationVersion for spaces, when the test suite had the spaces plugin disabled. This caused the bug logged here: #56731. Only 3 data files, all from the reporting plugin, are impacted.

@tsullivan This is the cause of the changes in the x-pack/test/functional/es_archives/reporting datas. I only removed the default space document, that wasn't used anyway as all the tests loading the data are running without spaces.

pgayvallet · 2020-02-07T15:14:31Z

packages/kbn-pm/dist/index.js

+    /**
+     * Run the saved objects migration
+     */
+    async migrate() {


TIL we need to manually generate and commit this file, and best part is it got mixed LF and CRLF line feeds.

pgayvallet · 2020-02-07T15:18:40Z

src/core/server/saved_objects/migrations/kibana/kibana_migrator.ts

-  public runMigrations(): Promise<Array<{ status: string }>> {
-    if (this.migrationResult === undefined) {
+  public runMigrations({ rerun = false }: { rerun?: boolean } = {}): Promise<
+    Array<{ status: string }>
+  > {
+    if (this.migrationResult === undefined || rerun) {
      this.migrationResult = this.runMigrationsInternal();
    }


This change is very naive: if a migration is already running and we call runMigration again with return=true, multiple migrations can occur at the same time.

We should maybe wait for the pending migration to finish before triggering the next one. But that adds some complex logic to handle edge cases such as multiple rerun=true calls when a migration is already running. We should wait for the migration to finish but only re-execute ONCE and return the same promise for every calls to rerun that occurred during the initial migration.

As this API is internal I though current implementation was acceptable. Tell me if we should fireproof it more.

so, in theory, it could lead to a race condition if 2 migrations run over the same data?

In theory it could yes, even if I don't think it could leads to data corruption, as in worse case, both 'threads' will access the data before any actually migrated it, and the document would be migrated then updated 'twice' from the same original data, which would only impact performances.

I can try to improve that by 'pooling' all the calls to runMigration(rerun=true) while one migration is running and only execute the rerun once, as explained in previous comment, if we think current implementation is too fragile.

It's okay as long as this API is used only for tests internally. Could you add a comment about this known limitation?

pgayvallet · 2020-02-07T15:20:04Z

src/core/server/saved_objects/saved_objects_service.ts

+    registerRoutes({
+      http: setupDeps.http,
+      migratorPromise: this.migrator$.pipe(first()).toPromise(),
+    });


Registration is done during setup, the migrator is not yet created. Had to use a subject/promise as workaround.

src/es_archiver/lib/indices/kibana_index.ts

…iver-call-migrate-endpoint

joshdover · 2020-02-10T21:54:32Z

src/core/server/saved_objects/migrations/kibana/kibana_migrator.ts

@@ -95,8 +95,10 @@ export class KibanaMigrator {
   *    The promise resolves with an array of migration statuses, one for each
   *    elasticsearch index which was migrated.
   */
-  public runMigrations(): Promise<Array<{ status: string }>> {
-    if (this.migrationResult === undefined) {
+  public runMigrations({ rerun = false }: { rerun?: boolean } = {}): Promise<


Can we add a @param to the comment to explain what this option does / how / when it should be used?

packages/kbn-dev-utils/src/kbn_client/kbn_client_saved_objects.ts

src/core/server/saved_objects/routes/index.ts

src/es_archiver/lib/indices/kibana_index.ts

joshdover · 2020-02-10T22:02:18Z

src/core/server/saved_objects/routes/migrate.ts

+      validate: false,
+    },
+    async (context, req, res) => {
+      const migrator = await migratorPromise;


Could we log a warning if this endpoint is used in production env? (and maybe if the CI env var isn't set?)

Also there aren't any tests on this endpoint / logic. I know this is an internal API, but it still feels like we should test that we don't break this by accident. Though CI would probably fail if this broke, it would be much better if we had more insight into what broke to cause the failure.

Could we log a warning if this endpoint is used in production env? (and maybe if the CI env var isn't set?)

We are running the FTR suites against production cloud, so I'm not sure we should be logging a warning? At least we should not check the CI var env?

Also there aren't any tests on this endpoint / logic

Yea, I know. I can add basic unit test to check that 1/ the route is registered and 2/ calling the route properly calls the migrator. However I have no idea how to go further than that, as the API call is actually testing in the FTR. We can't really functional test a component used by the FTR framework itself, can we? Any insight on how to test that on a 'higher' level that unit is welcome.

I added integration tests on the endpoint to ensure it properly calls migrator.runMigrations with the correct parameters

I know this is an internal API

Is Platform going to promote any naming convention to clearly distinguish public (with BWC guarantees and documentation) and internal (no BWC guarantees, implicit use at your own risk warning and optional documentation) API endpoints? In Spaces/Security we started to define internal routes with /internal/ prefix instead of /api/ as an experiment that has been working well enough so far.

I'm a bit concerned that users will quickly discover this endpoint and may start using it in any imaginable context without realizing all the consequences leading to hard-to-handle/diagnose support cases....

Anyone against /internal/saved_objects/_migrate?

I don't want to introduce another convention without a broad discussion. as of me, internal could work, yes.

We are running the FTR suites against production cloud

By the way, does it mean that env.packageInfo (dist, buildNum etc.) in this case is completely indistinguishable from the one we have when Kibana is run in production outside of functional tests?

/internal/saved_objects/_migrate kinda breaks the SO route prefix. This is very minor, but forces to use a / based router when declaring routes instead of a /saved_objects one. Would /saved_objects/internal/_migrate be acceptable, or do we really want the route to starts with /internal ?

nevermind, did not understand the internal prefix was a replacement for the api one. Please forget my previous question

…iver-call-migrate-endpoint

joshdover · 2020-02-14T21:03:21Z

@elastic/kibana-security We could probably use some feedback on this from a security perspective. More context in the parent issue, but I just want to make sure there aren't security risks from this that we don't realize.

joshdover

Changes LGTM. Would like to wait on Security team before merging.

rudolf · 2020-02-17T09:58:39Z

@elasticmachine merge upstream

azasypkin

LGTM securitywise.

azasypkin · 2020-02-17T12:21:15Z

src/core/server/saved_objects/routes/integration_tests/migrate.test.ts

+    await root.setup();
+    await root.start();
+    migratorInstanceMock.runMigrations.mockClear();
+  }, 30000);


question: do we really need this explicit timeout? AFAIK we set large enough global timeout for all integration jest tests unless something has changed.

Probably it was added not to wait for 10 minutes to fail the test.

🤷‍♂️ if we think Kibana startup and CI are supposed to be faster now we can decrease global timeout and set it to something more reasonable once and for all (out of scope of this PR, just thinking aloud).

Probably it was added not to wait for 10 minutes to fail the test.

Yea, that was it.

if we think Kibana startup and CI are supposed to be faster now we can decrease global timeout and set it to something more reasonable once and for all

+1, but changing the global timeout is not always a pleasant surprise in term of results, this is way outside of the scope of this PR.

src/core/server/saved_objects/migrations/kibana/kibana_migrator.ts

azasypkin · 2020-02-17T12:38:07Z

src/core/server/saved_objects/routes/migrate.ts

+      validate: false,
+    },
+    async (context, req, res) => {
+      const migrator = await migratorPromise;


I know this is an internal API

Is Platform going to promote any naming convention to clearly distinguish public (with BWC guarantees and documentation) and internal (no BWC guarantees, implicit use at your own risk warning and optional documentation) API endpoints? In Spaces/Security we started to define internal routes with /internal/ prefix instead of /api/ as an experiment that has been working well enough so far.

I'm a bit concerned that users will quickly discover this endpoint and may start using it in any imaginable context without realizing all the consequences leading to hard-to-handle/diagnose support cases....

Co-Authored-By: Aleh Zasypkin <[email protected]>

azasypkin

I'm sorry to not asking these questions during my initial review, just drifted away while thinking about public vs internal APIs and lost the track 🙈

azasypkin · 2020-02-19T08:39:57Z

src/core/server/saved_objects/routes/migrate.ts

+) => {
+  router.post(
+    {
+      path: '/_migrate',


question: I'm wondering if it'd make sense to at least tie this to savedObjectsManagement:all privilege and use something like access:migrateSavedObjects tag here.

The only thing is that I can't recall a case when we did something like this for internal APIs. @kobelb what do you think? Am I missing something and there is a better way to manage access to internal APIs that perform actions on behalf of internal Kibana user (SO migration)?

If we're only using this endpoint internally for the functional test runner and don't intend for end-users in production to use this endpoint, I wouldn't tie this to the savedObjectsManagement:all privilege. End-users who get all access to the saved-object management feature would then be able to use this.

This route is rather different than our other "internal" routes. Our other internal routes are intended to be used from the browser, but end-users are authorized to access them. We don't really want this route to be used in production at all. We definitely ~~can~~ should use the access: tag on the route to restrict which users should have access to this route. However, we really only want it to be used by the functional test runner, which I don't believe has a real user to represent. If we used an access:migrateSavedObjects tag here, and then didn't add it to any privileges only a superuser would be able to invoke this API, as they're authorized to do everything.

The other stipulation is that relying on the security plugin's authorization won't apply when the user hasn't enabled security, which is the default. What are the repercussions of any user being able to start migrations in these scenarios?

I'm trying to come up with a bad scenario and coming up empty. Migrations only get run on objects from older versions, which the only way you could create objects from older versions is manually via direct Elasticsearch API calls, which only superusers and kibana_system should have access to do.

Adding access:migrateSavedObjects tag makes sense to me as an extra layer of precaution, but I don't think we need to do anything extra for the no-security scenario. Does that work for y'all?

Having just access:migrateSavedObjects sounds like a reasonable trade-off to me assuming it works with esArchiver and there is no reliable way no detect "FTR" scenario using env.packageInfo.dist/buildNum (to not register this endpoint at all otherwise or register it via custom test plugin or something along these lines).

@kobelb was that intentional decision for some reason or we just didn't have any /internal routes that would require authorization check at that time.

There were no /internal routes when this was written, and we decided to only run that route middleware when the url began with /api. We can likely remove that restriction entirely, as we're only performing the authorization logic on route which have the access: tag. I created #58351 to see the impact this has. @legrego do you recall any reason why we included the restriction that the route must start with /api originally?

@kobelb nope I don’t think we had a great reason, beyond what you already mentioned that we didn’t have any /internal routes at that time. I don’t see a problem removing this restriction

Just pushed the commit adding the access tag to the route, to see if that doesn't introduce another 'surprise' from the FTR call. Waiting for #58351 to change the endpoint from api to internal.

CI just passed on #58351, so I think we should be good, just waiting on a review at this point.

I see #58351 has been merged. Thanks a lot to the whole security team 😄 Will move the endpoint to the internal prefix tomorrow, and we should be good to go.

azasypkin · 2020-02-19T08:42:15Z

src/core/server/saved_objects/routes/migrate.ts

+    },
+    router.handleLegacyErrors(async (context, req, res) => {
+      const migrator = await migratorPromise;
+      const result = await migrator.runMigrations({ rerun: true });


question: is result going to be consumed by the client somehow? If not, maybe it'd be safer to not return it at all for now (ideal option, we can log though) or add type annotation here (so that TS complain if signature changes) or just pick specific fields in case migrator starts returning something more sensitive? I know it sounds a bit paranoid, but you never know.

Not atm. I can just return a {status: 'ok'} dummy response instead

azasypkin · 2020-02-19T08:55:32Z

src/core/server/saved_objects/routes/migrate.ts

+      path: '/_migrate',
+      validate: false,
+    },
+    router.handleLegacyErrors(async (context, req, res) => {


question: how unexpected migration errors look like (5xx)? May they include something that we don't want consumer to see? If so, we may need to try/catch errors in the handler, log them and return something more opaque.

This is mostly here to catch the elasticsearch service errors, that are currently thrown as boom errors. These errors are already thrown as-is from other SO endpoints, so I think there is no risk here, but I can try/catch and return a generic error instead if we think this is better.

These errors are already thrown as-is from other SO endpoints, so I think there is no risk here

If so, then yeah, it's fine as it's right now.

…iver-call-migrate-endpoint

…er-call-migrate-endpoint' into kbn-55860-es-archiver-call-migrate-endpoint

…iver-call-migrate-endpoint

pgayvallet · 2020-02-25T14:59:03Z

Last changes have been done. If anyone wants to take a final look, please do it today. I will merge the PR tomorrow

…iver-call-migrate-endpoint

kibanamachine · 2020-02-26T09:31:08Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 502c042

History

💚 Build #28790 succeeded 4fdb39d
💔 Build #28754 failed 318fa14
💚 Build #28515 succeeded 26a15e5
💔 Build #28476 failed 96b1315
💚 Build #28375 succeeded 10b8fac

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

…igrator (elastic#56971) * es-archiver call _migrate endpoint instead of creating a migrator * fix crlf.... * use promise instead of callback accessor * attempt with disabled authent * enable security again * set mapping to dynamic before calling migration endpoint * rename space data from non-spaced tests * add documentation on the `rerun` flag * create router with the `/api/saved_objects` prefix * add unit test about strict mapping * add integration test on migrate endpoint * wrap route handler with handleLegacyErrors * add remark about limitations of the rerun flag * Apply suggestions from code review Co-Authored-By: Aleh Zasypkin <[email protected]> * do not return detailed index result * add access tag to migrate route * use /internal prefix for migrate endpoint * fix integ tests Co-authored-by: Elastic Machine <[email protected]> Co-authored-by: Josh Dover <[email protected]> Co-authored-by: Aleh Zasypkin <[email protected]>

…igrator (#56971) (#58575) * es-archiver call _migrate endpoint instead of creating a migrator * fix crlf.... * use promise instead of callback accessor * attempt with disabled authent * enable security again * set mapping to dynamic before calling migration endpoint * rename space data from non-spaced tests * add documentation on the `rerun` flag * create router with the `/api/saved_objects` prefix * add unit test about strict mapping * add integration test on migrate endpoint * wrap route handler with handleLegacyErrors * add remark about limitations of the rerun flag * Apply suggestions from code review Co-Authored-By: Aleh Zasypkin <[email protected]> * do not return detailed index result * add access tag to migrate route * use /internal prefix for migrate endpoint * fix integ tests Co-authored-by: Elastic Machine <[email protected]> Co-authored-by: Josh Dover <[email protected]> Co-authored-by: Aleh Zasypkin <[email protected]> Co-authored-by: Elastic Machine <[email protected]> Co-authored-by: Josh Dover <[email protected]> Co-authored-by: Aleh Zasypkin <[email protected]>

pgayvallet added Feature:New Platform Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.7.0 v8.0.0 labels Feb 6, 2020

pgayvallet added the release_note:skip Skip the PR/issue when compiling release notes label Feb 6, 2020

es-archiver call _migrate endpoint instead of creating a migrator

1943e3a

pgayvallet force-pushed the kbn-55860-es-archiver-call-migrate-endpoint branch from 294813f to 1943e3a Compare February 6, 2020 13:11

pgayvallet added 3 commits February 6, 2020 14:54

fix crlf....

b3d6019

use promise instead of callback accessor

b8820a0

attempt with disabled authent

110187e

This comment has been minimized.

Sign in to view

elastic deleted a comment from kibanamachine Feb 6, 2020

enable security again

ec1a4ab

This comment has been minimized.

Sign in to view

set mapping to dynamic before calling migration endpoint

d72d774

This comment has been minimized.

Sign in to view

rename space data from non-spaced tests

41f56b3

pgayvallet commented Feb 7, 2020

View reviewed changes

pgayvallet requested a review from spalger February 7, 2020 15:32

pgayvallet marked this pull request as ready for review February 7, 2020 15:32

pgayvallet requested review from a team as code owners February 7, 2020 15:32

Merge remote-tracking branch 'upstream/master' into kbn-55860-es-arch…

250ff7c

…iver-call-migrate-endpoint

joshdover reviewed Feb 10, 2020

View reviewed changes

pgayvallet added 2 commits February 13, 2020 14:11

Merge remote-tracking branch 'upstream/master' into kbn-55860-es-arch…

f62f9ae

…iver-call-migrate-endpoint

Merge remote-tracking branch 'upstream/master' into kbn-55860-es-arch…

b467cfa

…iver-call-migrate-endpoint

joshdover approved these changes Feb 14, 2020

View reviewed changes

Merge branch 'master' into kbn-55860-es-archiver-call-migrate-endpoint

2217872

rudolf approved these changes Feb 17, 2020

View reviewed changes

azasypkin self-requested a review February 17, 2020 12:30

azasypkin approved these changes Feb 17, 2020

View reviewed changes

joshdover self-assigned this Feb 18, 2020

Apply suggestions from code review

89ab46e

Co-Authored-By: Aleh Zasypkin <[email protected]>

azasypkin reviewed Feb 19, 2020

View reviewed changes

pgayvallet added 4 commits February 24, 2020 09:40

Merge remote-tracking branch 'upstream/master' into kbn-55860-es-arch…

0ea85e0

…iver-call-migrate-endpoint

do not return detailed index result

801189a

Merge remote-tracking branch 'refs/remotes/origin/kbn-55860-es-archiv…

10b8fac

…er-call-migrate-endpoint' into kbn-55860-es-archiver-call-migrate-endpoint

add access tag to migrate route

96b1315

joshdover removed their assignment Feb 24, 2020

pgayvallet added 4 commits February 24, 2020 17:35

Merge remote-tracking branch 'upstream/master' into kbn-55860-es-arch…

26a15e5

…iver-call-migrate-endpoint

Merge remote-tracking branch 'upstream/master' into kbn-55860-es-arch…

a79485d

…iver-call-migrate-endpoint

use /internal prefix for migrate endpoint

318fa14

fix integ tests

4fdb39d

Merge remote-tracking branch 'upstream/master' into kbn-55860-es-arch…

502c042

…iver-call-migrate-endpoint

pgayvallet merged commit a4e82c4 into elastic:master Feb 26, 2020

pgayvallet mentioned this pull request Feb 26, 2020

[7.x] Adapt es-archiver to call _migrate endpoint instead of creating a migrator (#56971) #58575

Merged

Adapt es-archiver to call _migrate endpoint instead of creating a migrator #56971

Adapt es-archiver to call _migrate endpoint instead of creating a migrator #56971

Conversation

pgayvallet commented Feb 6, 2020 • edited Loading

Summary

Checklist

For maintainers

elasticmachine commented Feb 6, 2020

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

elasticmachine commented Feb 7, 2020

💚 Build Succeeded

History

pgayvallet left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pgayvallet Feb 7, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joshdover commented Feb 14, 2020

joshdover left a comment

Choose a reason for hiding this comment

rudolf commented Feb 17, 2020

azasypkin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

azasypkin Feb 17, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

azasypkin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kobelb Feb 19, 2020 • edited Loading

Choose a reason for hiding this comment

kobelb Feb 20, 2020 • edited Loading

Choose a reason for hiding this comment

joshdover Feb 20, 2020 • edited Loading

Choose a reason for hiding this comment

azasypkin Feb 21, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pgayvallet commented Feb 25, 2020 • edited Loading

kibanamachine commented Feb 26, 2020

💚 Build Succeeded

History

Adapt es-archiver to call `_migrate` endpoint instead of creating a migrator #56971

Adapt es-archiver to call `_migrate` endpoint instead of creating a migrator #56971

pgayvallet commented Feb 6, 2020 •

edited

Loading

pgayvallet left a comment •

edited

Loading

pgayvallet Feb 7, 2020 •

edited

Loading

azasypkin Feb 17, 2020 •

edited

Loading

kobelb Feb 19, 2020 •

edited

Loading

kobelb Feb 20, 2020 •

edited

Loading

joshdover Feb 20, 2020 •

edited

Loading

azasypkin Feb 21, 2020 •

edited

Loading

pgayvallet commented Feb 25, 2020 •

edited

Loading