Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2 migrations should exit process on corrupt saved object document #91465

Merged
merged 5 commits into from
Feb 22, 2021
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -722,7 +722,7 @@ describe('DocumentMigrator', () => {
});
});

it('logs the document and transform that failed', () => {
it('logs the original error and throws a transform error if a document transform fails', () => {
const log = mockLogger;
const migrator = new DocumentMigrator({
...testOpts(),
Expand All @@ -747,10 +747,13 @@ describe('DocumentMigrator', () => {
migrator.migrate(_.cloneDeep(failedDoc));
expect('Did not throw').toEqual('But it should have!');
} catch (error) {
expect(error.message).toMatch(/Dang diggity!/);
const warning = loggingSystemMock.collect(mockLoggerFactory).warn[0][0];
expect(warning).toContain(JSON.stringify(failedDoc));
expect(warning).toContain('dog:1.2.3');
expect(error.message).toMatchInlineSnapshot(`
"Failed to transform document smelly. Transform: dog:1.2.3
Doc: {\\"id\\":\\"smelly\\",\\"type\\":\\"dog\\",\\"attributes\\":{},\\"migrationVersion\\":{}}"
`);
expect(loggingSystemMock.collect(mockLoggerFactory).error[0][0]).toMatchInlineSnapshot(
`[Error: Dang diggity!]`
);
}
});

Expand Down Expand Up @@ -779,7 +782,7 @@ describe('DocumentMigrator', () => {
};
migrator.migrate(doc);
expect(loggingSystemMock.collect(mockLoggerFactory).info[0][0]).toEqual(logTestMsg);
expect(loggingSystemMock.collect(mockLoggerFactory).warn[1][0]).toEqual(logTestMsg);
expect(loggingSystemMock.collect(mockLoggerFactory).warn[0][0]).toEqual(logTestMsg);
});

test('extracts the latest migration version info', () => {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -678,10 +678,11 @@ function wrapWithTry(
} catch (error) {
const failedTransform = `${type.name}:${version}`;
const failedDoc = JSON.stringify(doc);
log.warn(
log.error(error);

throw new Error(
`Failed to transform document ${doc?.id}. Transform: ${failedTransform}\nDoc: ${failedDoc}`
);
throw error;
}
};
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -58,12 +58,12 @@ describe('migrateRawDocs', () => {
expect(transform).toHaveBeenNthCalledWith(2, obj2);
});

test('passes invalid docs through untouched and logs error', async () => {
test('throws when encountering a corrupt saved object document', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that v1 migrations will also start failing on corrupt saved objects. When v1 migrations encounter a corrupt saved object it will perform a migration on every restart which can cause data loss in a multi-instance Kibana setup if only one Kibana gets restarted at a time. So although it might block some users from upgrading, deleting the saved object will save them from worse problems.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When v1 migrations encounter a corrupt saved object it will perform a migration on every restart which can cause data loss in a multi-instance Kibana setup if only one Kibana gets restarted at a time.

Isn't this our recommended way to upgrade kibana? restarting one instance at a time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, all kibana's need to be shutdown before upgrading otherwise v1 migrations will cause data loss.

from https://www.elastic.co/guide/en/kibana/current/upgrade.html

Shut down all Kibana instances. Running more than one Kibana version against the same Elasticseach index is unsupported. Upgrading while older Kibana instances are running can cause data loss or upgrade failures.

const logger = createSavedObjectsMigrationLoggerMock();
const transform = jest.fn<any, any>((doc: any) => [
set(_.cloneDeep(doc), 'attributes.name', 'TADA'),
]);
const result = await migrateRawDocs(
const result = migrateRawDocs(
new SavedObjectsSerializer(new SavedObjectTypeRegistry()),
transform,
[
Expand All @@ -73,25 +73,11 @@ describe('migrateRawDocs', () => {
logger
);

expect(result).toEqual([
{ _id: 'foo:b', _source: { type: 'a', a: { name: 'AAA' } } },
{
_id: 'c:d',
_source: { type: 'c', c: { name: 'TADA' }, migrationVersion: {}, references: [] },
},
]);

const obj2 = {
id: 'd',
type: 'c',
attributes: { name: 'DDD' },
migrationVersion: {},
references: [],
};
expect(transform).toHaveBeenCalledTimes(1);
expect(transform).toHaveBeenCalledWith(obj2);
expect(result).rejects.toMatchInlineSnapshot(
`[Error: Unable to migrate the corrupt saved object document with _id: 'foo:b'.]`
);

expect(logger.error).toBeCalledTimes(1);
expect(transform).toHaveBeenCalledTimes(0);
});

test('handles when one document is transformed into multiple documents', async () => {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ export async function migrateRawDocs(
const migrateDocWithoutBlocking = transformNonBlocking(migrateDoc);
const processedDocs = [];
for (const raw of rawDocs) {
const options = { namespaceTreatment: 'lax' as 'lax' };
const options = { namespaceTreatment: 'lax' as const };
if (serializer.isRawSavedObject(raw, options)) {
const savedObject = serializer.rawToSavedObject(raw, options);
savedObject.migrationVersion = savedObject.migrationVersion || {};
Expand All @@ -48,11 +48,9 @@ export async function migrateRawDocs(
)
);
} else {
log.error(
`Error: Unable to migrate the corrupt Saved Object document ${raw._id}. To prevent Kibana from performing a migration on every restart, please delete or fix this document by ensuring that the namespace and type in the document's id matches the values in the namespace and type fields.`,
{ rawDocument: raw }
throw new Error(
`Unable to migrate the corrupt saved object document with _id: '${raw._id}'.`
);
processedDocs.push(raw);
}
}
return processedDocs;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -330,7 +330,7 @@ describe('KibanaMigrator', () => {
const migrator = new KibanaMigrator(options);
migrator.prepareMigrations();
await expect(migrator.runMigrations()).rejects.toMatchInlineSnapshot(`
[Error: Unable to complete saved object migrations for the [.my-index] index. Please check the health of your Elasticsearch cluster and try again. Error: Reindex failed with the following error:
[Error: Unable to complete saved object migrations for the [.my-index] index. Error: Reindex failed with the following error:
{"_tag":"Some","value":{"type":"elatsicsearch_exception","reason":"task failed with an error"}}]
`);
expect(loggingSystemMock.collect(options.logger).error[0][0]).toMatchInlineSnapshot(`
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -314,20 +314,25 @@ describe('migrationsStateActionMachine', () => {
next: () => {
throw new ResponseError(
elasticsearchClientMock.createApiResponse({
body: { error: { type: 'snapshot_in_progress_exception', reason: 'error reason' } },
body: {
error: {
type: 'snapshot_in_progress_exception',
reason: 'Cannot delete indices that are being snapshotted',
},
},
})
);
},
})
).rejects.toMatchInlineSnapshot(
`[Error: Unable to complete saved object migrations for the [.my-so-index] index. Please check the health of your Elasticsearch cluster and try again. ResponseError: snapshot_in_progress_exception]`
`[Error: Unable to complete saved object migrations for the [.my-so-index] index. Please check the health of your Elasticsearch cluster and try again. Error: [snapshot_in_progress_exception]: Cannot delete indices that are being snapshotted]`
);
expect(loggingSystemMock.collect(mockLogger)).toMatchInlineSnapshot(`
Object {
"debug": Array [],
"error": Array [
Array [
"[.my-so-index] [snapshot_in_progress_exception]: error reason",
"[.my-so-index] [snapshot_in_progress_exception]: Cannot delete indices that are being snapshotted",
],
Array [
"[.my-so-index] migration failed, dumping execution log:",
Expand All @@ -352,7 +357,7 @@ describe('migrationsStateActionMachine', () => {
},
})
).rejects.toMatchInlineSnapshot(
`[Error: Unable to complete saved object migrations for the [.my-so-index] index. Please check the health of your Elasticsearch cluster and try again. Error: this action throws]`
`[Error: Unable to complete saved object migrations for the [.my-so-index] index. Error: this action throws]`
);
expect(loggingSystemMock.collect(mockLogger)).toMatchInlineSnapshot(`
Object {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -153,12 +153,27 @@ export async function migrationStateActionMachine({
logger.error(
logMessagePrefix + `[${e.body?.error?.type}]: ${e.body?.error?.reason ?? e.message}`
);
dumpExecutionLog(logger, logMessagePrefix, executionLog);
throw new Error(
`Unable to complete saved object migrations for the [${
initialState.indexPrefix
}] index. Please check the health of your Elasticsearch cluster and try again. Error: [${
e.body?.error?.type
}]: ${e.body?.error?.reason ?? e.message}`
);
} else {
logger.error(e);

dumpExecutionLog(logger, logMessagePrefix, executionLog);
if (e.message.startsWith('Unable to migrate the corrupt saved object document')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: I would use error sub-classing instead of relying on an error message for the handling behavior change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I was being lazy :(

throw new Error(
`${e.message} To allow migrations to proceed, please delete this document from the [${initialState.indexPrefix}_${initialState.kibanaVersion}_001] index.`
);
}

throw new Error(
`Unable to complete saved object migrations for the [${initialState.indexPrefix}] index. ${e}`
);
}
dumpExecutionLog(logger, logMessagePrefix, executionLog);
throw new Error(
`Unable to complete saved object migrations for the [${initialState.indexPrefix}] index. Please check the health of your Elasticsearch cluster and try again. ${e}`
);
}
}