Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

workload/schemachanger: re-enable the schema changer workload #88085

Merged
merged 5 commits into from
Sep 22, 2022

Conversation

fqazi
Copy link
Collaborator

@fqazi fqazi commented Sep 17, 2022

Previously the schema changer workload was disabled because of two intermittent issues:

  1. The workload itself had an issue due to a recent refactor that did not properly detect/handle potential execution errors. While the set of errors was generated it was never used properly in some cases after a recent refactor
  2. Some scenarios involving inserts could fail binding OID's inside the resolveOID code path, specifically this code path does not properly handle dropped descriptors. These changes enforces the old behaviour where the resolution should just fail.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@fqazi fqazi changed the title workload/schemachanger: fix handling of potential errors workload/schemachanger: fix detect of potential exec errors Sep 17, 2022
@fqazi fqazi force-pushed the insertFix branch 3 times, most recently from 666d255 to 525d162 Compare September 19, 2022 16:45
@fqazi fqazi changed the title workload/schemachanger: fix detect of potential exec errors workload/schemachanger: re-enable the schema changer workload Sep 19, 2022
Previously, when resolve OID ran into errors on dropped
descriptors we would surface internal inactive descriptor
errors to the caller. This was inadequate because these errors
would not be properly ignored leading to unexpected failures,
which did not exist on 22.1.
To address this, this patch adds logic to ignore inactive
descriptor errors like we would have in the past, returning
that no OID has been resolved.

Release note: None
Previously, when we refactored support for how
statement generation and errors, we did not correctly
update support for potential execution errors. As
a result support for inserts / add foreign key constraints
was regressed. This path, fixes the support for possible
execution errors in both cases.

Release note: None
Release Justification: Low changes to re-enable an existing test case
@fqazi fqazi requested a review from a team September 19, 2022 21:30
@fqazi fqazi marked this pull request as ready for review September 19, 2022 21:30
@fqazi fqazi requested a review from a team September 19, 2022 21:30
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still flakey due to at least two other issues I saw:

  1. Something about max command size for inverted index entries
  2. Integer overflows due to expressions

The second one popped up more regularly. Consider you have a table:

CREATE TABLE (i INT, INDEX ((i+i)); If i is large enough that 2*i overflows int64, there's a problem.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@fqazi
Copy link
Collaborator Author

fqazi commented Sep 19, 2022

@ajwerner For (2) We did have generated expression validation (see validateGeneratedExpressionsForInsert), but some of those errors were in the potential error sets. Let me re-run it a few more times to confirm to make sure it's fully squashed.

@fqazi
Copy link
Collaborator Author

fqazi commented Sep 19, 2022

I'll see if I can reproduce (1) and address that as well in a separate commit

@fqazi fqazi force-pushed the insertFix branch 2 times, most recently from fa1863e to be3557e Compare September 20, 2022 14:42
Previously, when a CTAS statement was executed we did
not correctly detect if the SELECT portion could fail
on generated columns. This could lead to test failures,
even though the observed errors are correct. To address,
this patch will execute the select independently to
validate any generated column errors.

Release note: None
@fqazi fqazi force-pushed the insertFix branch 4 times, most recently from af2c072 to 6ea5120 Compare September 21, 2022 16:41
Previously, the watch dog threads inside the schema
changer workload could dump the stacks way incorrectly,
since it didn't give connections time to terminate. This
change adapts the watch dog logic to have a delay.

Release note: None
Fixes: cockroachdb#87615

Previously, this test was no longer marked as a release
blocker because of how unstable it was. With the latest
set of fixes we expect this to be far more stable.

Release note: None
@fqazi fqazi requested a review from ajwerner September 22, 2022 02:02
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r1, 3 of 3 files at r2, 2 of 2 files at r3, 2 of 2 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @fqazi)

@fqazi
Copy link
Collaborator Author

fqazi commented Sep 22, 2022

@ajwerner TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Sep 22, 2022

Build succeeded:

@craig craig bot merged commit a33d71d into cockroachdb:release-22.2 Sep 22, 2022
@rafiss
Copy link
Collaborator

rafiss commented Sep 22, 2022

I see that this was merged directly into release-22.2 (kinda surprised bors worked). Is it also being added to master?

@fqazi
Copy link
Collaborator Author

fqazi commented Sep 22, 2022

@rafiss Yes, it will be merged into master separately.

craig bot pushed a commit that referenced this pull request Oct 28, 2022
88537: workload/schemachanger: re-enable the schema changer workload r=fqazi a=fqazi

Fixes: #82133

Previously the schema changer workload was disabled because of two intermittent issues:

The workload itself had an issue due to a recent refactor that did not properly detect/handle potential execution errors. While the set of errors was generated it was never used properly in some cases after a recent refactor
Some scenarios involving inserts could fail binding OID's inside the resolveOID code path, specifically this code path does not properly handle dropped descriptors. These changes enforces the old behaviour where the resolution should just fail.

Note: This is the same as the previously merged PR (#88085 on 22.2)

Co-authored-by: Faizan Qazi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants