Bugfix/Backport to v15: Fix schema migrations requested_timestamp zero values #12263

shlomi-noach
Contributor

Fixes #12261

The fix in #12262 cannot be backported because of how internal schema management was redesigned in #11520.

The solution in this PR is to fix the actual values. They should not be zero; they should hold a real timestamp, which we borrow from the added_timestamp column.

An UPDATE query runs on PRIMARY tablet initialization, just after the table is created and before the rest of the DDLs kick in, so it runs once in the lifetime of the tablet. It is a full table scan query, but _vt.schema_migrations is never too large.
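
As a rough illustration of the fixup described above, the sketch below backfills zero requested_timestamp values from added_timestamp. It is not the query from this PR: it uses Python's sqlite3 with an in-memory table as a stand-in for MySQL and _vt.schema_migrations, and the reduced table layout and the helper names `backfill_requested_timestamp` and `make_demo_db` are assumptions made for the example.

```python
import sqlite3

# How a MySQL zero timestamp renders as text (stored as a plain string here).
ZERO_TS = "0000-00-00 00:00:00"


def backfill_requested_timestamp(conn: sqlite3.Connection) -> int:
    """Copy added_timestamp into rows whose requested_timestamp is zero.

    Hypothetical helper. The UPDATE scans the whole table, which is
    acceptable only because the migrations table is expected to stay small.
    Returns the number of rows that were rewritten.
    """
    cur = conn.execute(
        "UPDATE schema_migrations "
        "SET requested_timestamp = added_timestamp "
        "WHERE requested_timestamp = ?",
        (ZERO_TS,),
    )
    conn.commit()
    return cur.rowcount


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with one zero-valued row and one valid row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE schema_migrations ("
        " id INTEGER PRIMARY KEY,"
        " added_timestamp TEXT NOT NULL,"
        " requested_timestamp TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO schema_migrations (added_timestamp, requested_timestamp) "
        "VALUES (?, ?)",
        [
            ("2023-01-10 08:00:00", ZERO_TS),
            ("2023-01-11 09:30:00", "2023-01-11 09:29:55"),
        ],
    )
    conn.commit()
    return conn
```

Running the helper a second time matches no rows, so repeating the fixup is harmless in this sketch.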

Description

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

@shlomi-noach added the Type: Bug and Component: Online DDL (vitess/native/gh-ost/pt-osc) labels Feb 7, 2023
@vitess-bot added the NeedsDescriptionUpdate (the description is not clear or comprehensive enough, and needs work) and NeedsWebsiteDocsUpdate labels Feb 7, 2023
@vitess-bot
Contributor

vitess-bot bot commented Feb 7, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a test is added or modified, there should be documentation at the top of the test explaining the expected behavior and what the test does.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@mattlord
Contributor

mattlord commented Feb 7, 2023

I don't think we want this as:

  1. It's a temporary fix relatively late in the major release (15.0.3 is next) and there may not be a later release to undo it
    • We don't require incrementally stepped upgrades to each patch release within a major release (should be able to go from 15.0 to 15.4)
  2. We'd have to go all the way back to 13.0 with this

I'm not sure we need to do anything for the older releases, since I'm not even sure that OnlineDDL was GA/production-ready (in v14) back when the original bug that led to the 0 date values for the schema_migrations.requested_timestamp field existed. But if we do, I feel like we should instead temporarily set a permissive SQL_MODE in WithDDL.unify(), similar to what we do in VReplication here: https://github.com/vitessio/vitess/blob/release-15.0/go/vt/vttablet/tabletmanager/vreplication/vreplicator.go

I do agree that we should implement a more general solution in the new sidecardb package to prevent similar things in the future.
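
For readers unfamiliar with the permissive SQL_MODE idea mentioned above, here is a minimal sketch of the general save, relax, restore pattern. It does not reproduce WithDDL.unify() or the VReplication code. The `permissive_sql_mode` helper and the `RecordingCursor` stand-in are hypothetical, and the `%s` placeholder assumes a MySQL driver that uses the format parameter style.

```python
from contextlib import contextmanager


@contextmanager
def permissive_sql_mode(cursor):
    """Temporarily clear the session sql_mode, then restore the saved value.

    `cursor` is assumed to be a DB-API cursor on a MySQL connection.
    """
    cursor.execute("SELECT @@session.sql_mode")
    (saved_mode,) = cursor.fetchone()
    cursor.execute("SET @@session.sql_mode = ''")
    try:
        yield cursor
    finally:
        # Restore the original mode even if a statement in the block fails.
        cursor.execute("SET @@session.sql_mode = %s", (saved_mode,))


class RecordingCursor:
    """Stand-in cursor that records statements so the sketch runs without MySQL."""

    def __init__(self, sql_mode):
        self.sql_mode = sql_mode
        self.statements = []

    def execute(self, sql, params=()):
        self.statements.append((sql, tuple(params)))

    def fetchone(self):
        # Pretend the server answered the sql_mode query.
        return (self.sql_mode,)
```

Any DDL issued inside the `with` block would run under the relaxed mode, and the previous mode is put back afterwards, so zero dates in existing rows would not block the statement for the duration of the block.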

@shlomi-noach
Contributor Author

The upstream fix is #12262

Online DDL was GA (production-ready) in v14, per https://vitess.io/blog/2022-06-28-announcing-vitess-14/

I'm open to all options.

@deepthi
Member

deepthi commented Feb 16, 2023

@mattlord a compromise would be

  • accept this PR for 15.0.3
  • don't try to undo it for a future 15.0 patch release. That way people can upgrade from 14.0.3 to 15.0.4 and things will still work

Re (2), why would we have to go back all the way to v13? In any case, which releases we fix this issue in is orthogonal to how we fix it, whether with a fixup query like this PR or with a different fix using WithDDL.

This seems like a lower-impact fix for a patch release than using WithDDL.

@shlomi-noach
Contributor Author

We need a resolution on whether this should or should not go into v15 🙏

@frouioui frouioui mentioned this pull request Mar 22, 2023
29 tasks
@frouioui
Member

Hello @shlomi-noach, does this PR need to go into this week's v15.0.3 patch release?

Contributor

@rohit-nayak-ps rohit-nayak-ps left a comment

Echoing @deepthi and @shlomi-noach's comments, this looks like a practical, low-impact solution to potential blockers for existing 15.0 clusters.
