upgrades: failures during an auto upgrade do not leave observable artifacts #90148
Labels
A-cluster-upgrades
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Comments
fabiog1901 added the C-bug label Oct 18, 2022
craig bot pushed a commit that referenced this issue Oct 18, 2022
90007: execbuilder: enforce_home_region should only apply to DML r=rytaft a=msirek

Fixes #89875
Fixes #88789

This fixes a problem where the enforce_home_region session flag might cause non-DML statements to error out, such as SHOW CREATE, if those statements utilize scans or joins of multiregion tables.

This also fixes issues with proper erroring out of mutation DML like UPDATE and DELETE. For example, the following previously did not error:

```
CREATE TABLE messages_rbr (
    account_id INT NOT NULL,
    message_id UUID DEFAULT gen_random_uuid(),
    message STRING NOT NULL,
    PRIMARY KEY (account_id),
    INDEX msg_idx(message)
) LOCALITY REGIONAL BY ROW;

SET enforce_home_region = true;

DELETE FROM messages_rbr WHERE message = 'Hello World!'
ERROR: Query has no home region. Try adding a filter on messages_rbr.crdb_region and/or on key column (messages_rbr.account_id).
SQLSTATE: XCHR2
```

Release note (bug fix): This patch fixes an issue with the enforce_home_region session setting which may cause SHOW CREATE TABLE or other non-DML statements to error out if the optimizer plan for the statement involves accessing a multiregion table.

90106: kv: reacquire proscribed leases on drain, then transfer r=shralex a=nvanbenschoten

Fixes #83372.
Fixes #90022.
Fixes #89963.
Fixes #89962.

This commit instructs stores to reacquire proscribed leases when draining in order to subsequently transfer them away. This addresses a source of flakiness in `transfer-lease` roachtests where some lease would not be transferred away before the drain completed. This could result in range unavailability for up to 9 seconds while other replicas waited out the lease's expiration. This is because only the previous leaseholder knows that a proscribed lease is invalid. All other replicas still consider the lease to be valid.

This failure mode was always present if a lease transfer failed during a drain. However, it became more likely with 034611b. With that change, we began rejecting lease transfers that were deemed to be "unsafe" more frequently. 034611b introduced a best-effort, graceful version of this check and an airtight, ungraceful version of the check. The former performs the check before revoking the outgoing leaseholder's lease while the latter performs the check after revoking the outgoing leaseholder's lease. In rare cases, it was possible to hit the airtight, ungraceful check and cause the lease to be proscribed. See #83261 (comment) for more details on how this led to test flakiness in the `transfer-lease` roachtest suite.

Release notes: None.
Release justification: Avoids GA-blocking roachtest failures.

90107: execbuilder: fix enforce_home_region erroring of input table to LOJ r=rytaft a=msirek

Fixes #88788

This fixes erroring out of locality-optimized join when the input table's home region does not match the gateway region and session flag `enforce_home_region` is true.

Release note (bug fix): This patch fixes detection and erroring out of queries using locality-optimized join when session setting enforce_home_region is true and the input table to the join has no home region or its home region does not match the gateway region.

90165: sql,server: increase severity of upgraded-related logging r=ajwerner a=knz

Informs #90148.

This increases the severity from INFO in the following cases:

- in the case when `SET CLUSTER SETTING version` is issued from a SQL client (WARNING in case of failure).
- in the case when the server spontaneously decides to upgrade in the background (ERROR in case of failure).

Release note: None

90166: sql/rowenc: remove leftover log in test r=mgartner a=mgartner

Epic: None
Release note: None

Co-authored-by: Mark Sirek <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
knz changed the title from "logging for the cluster upgrades subsystem shows suspicious severity level" to "upgrades: failures during an auto upgrade does not leave observable artifacts" Oct 19, 2022
knz changed the title from "upgrades: failures during an auto upgrade does not leave observable artifacts" to "upgrades: failures during an auto upgrade do not leave observable artifacts" Oct 19, 2022
We have marked this issue as stale because it has been inactive for |
Describe the problem
I initiated the upgrade of the cluster from 21.2.14 to 22.1.8 and the finalization part failed.
Upon investigation, I found that the log entry describing the error had a severity level of INFO (fixed in #90165).
We would also like to see some notification that it failed, for example as a failed job or some other item in DB Console.
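Until such a notification exists, one way to check from a SQL client whether finalization went through is to look at the active cluster version. This is only a minimal sketch, assuming a SQL session on the upgraded cluster; the values returned depend on the cluster and are not shown here.

```sql
-- Reports the active cluster version. If finalization failed, this is
-- expected to still be below the target release.
SHOW CLUSTER SETTING version;

-- If this is set to the previous major version, auto-finalization is
-- being held back on purpose rather than failing.
SHOW CLUSTER SETTING cluster.preserve_downgrade_option;
```

These statements only show that finalization has not happened. They do not say why it failed, which is the gap this issue describes.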
@dt explains:
Details on Slack
Jira issue: CRDB-20615