spanconfigsqlwatcherccl: TestSQLWatcherMultiple gets confused watching over secondary tenants #106821

Closed
knz opened this issue Jul 14, 2023 · 2 comments · Fixed by #107760
Labels
A-multitenancy Related to multi-tenancy C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. db-cy-23 T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

Contributor

knz commented Jul 14, 2023

Informs #76378

Describe the problem

When TestSQLWatcherMultiple is run over a secondary tenant, it fails some of its conditions, e.g.:

    sqlwatcher_test.go:345:
                Error Trace:    pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/spanconfigsqlwatcherccl_test/pkg/ccl/spanconfigccl/spanconfigsqlwatcherccl/sqlwatcher_test.go:345
                                                        GOROOT/src/runtime/asm_amd64.s:1594
                Error:          Not equal:
                                expected: 1
                                actual  : 0
                Test:           TestSQLWatcherMultiple

How to reproduce

Replace TestDoesNotWorkWithSecondaryTenantsButWeDontKnowWhyYet with TestTenantAlwaysEnabled in the test and run it.

Expected solution

The test code should work and succeed whether or not the watcher is run over a secondary tenant.

Jira issue: CRDB-29730

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-multitenancy Related to multi-tenancy labels Jul 14, 2023
Contributor Author

knz commented Jul 14, 2023

Similarly, TestSQLWatcherOnEventError hangs in that case.

The other watcher tests are similarly affected.

knz added a commit to knz/cockroach that referenced this issue Jul 14, 2023
There's a mix of tests that control their tenants directly,
and tests that should really work with virtualization
enabled but don't.

Followup issues: cockroachdb#106821 and cockroachdb#106818.

Release note: None
@knz knz added A-kv Anything in KV that doesn't belong in a more specific category. T-kv KV Team labels Jul 14, 2023
craig bot pushed a commit that referenced this issue Jul 14, 2023
106738: logic: skip_on_retry works when errors are expected r=Xiang-Gu a=Xiang-Gu

Previously, the `skip_on_retry` directive for logic tests, when set, skipped the rest of the test if a statement failed with TransactionRetryError. However, it did not skip if the statement was expected to fail with a certain error message. This PR ensures that whenever we hit a TransactionRetryError and `skip_on_retry` is set, we always skip the rest of the test, even if the statement is expected to fail.
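
As a rough illustration of the behavior change described above, here is a minimal sketch of the decision before and after. The type and function names (`stmtOutcome`, `skipBefore`, `skipAfter`, `isRetryErr`) are hypothetical and do not come from the logic test runner; only the `skip_on_retry` directive and the TransactionRetryError condition are taken from the PR description.

```go
package main

import (
	"fmt"
	"strings"
)

// stmtOutcome is a hypothetical stand-in for what a test runner knows after
// executing one statement. It is not the real logic test data structure.
type stmtOutcome struct {
	errMsg      string // error text returned by the statement, "" if none
	expectedErr string // error text the test file expects, "" if none
	skipOnRetry bool   // whether the skip_on_retry directive is set
}

// isRetryErr is an assumed, simplified check for a retry error.
func isRetryErr(msg string) bool {
	return strings.Contains(msg, "TransactionRetryError")
}

// skipBefore models the old behavior as described: a retry error only
// triggers a skip when the statement was not expected to fail.
func skipBefore(o stmtOutcome) bool {
	return o.skipOnRetry && isRetryErr(o.errMsg) && o.expectedErr == ""
}

// skipAfter models the new behavior as described: a retry error always
// triggers a skip when skip_on_retry is set.
func skipAfter(o stmtOutcome) bool {
	return o.skipOnRetry && isRetryErr(o.errMsg)
}

func main() {
	o := stmtOutcome{
		errMsg:      "TransactionRetryError: retry txn",
		expectedErr: "duplicate key",
		skipOnRetry: true,
	}
	fmt.Println(skipBefore(o), skipAfter(o)) // false true
}
```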

fixes #104464

Release note: None

106759: streamingccl: unskip TestStreamDeleteRange r=msbutler a=stevendanna

This test had previously timed out. The timeout we saw was the result of a couple of issues.

When waiting for all delete ranges, our loop exit condition was very strict. We would only stop looking for rows if the number of delete ranges was exactly 3. If, however, we got 4 delete ranges, with 2 coming in a single batch, we would never hit this condition.

How would that happen though? One possibility is rangefeed duplicates. Another, and what appears to have been happening in this test, is that the representation of the range deletes observed by the rangefeed consumer is slightly different depending on whether the range delete is delivered as part of a catch-up scan or as part of the rangefeed's steady state. I believe this is because the range deletes overlap but are issued at different time points. When we get them as part of the steady state, we get a trimmed version of the original event. When we get them as part of the catch-up scan, we get them broken up at the point of overlap.
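
The strict exit condition can be sketched in isolation. This is a minimal illustration, not the test's actual code: `waitExact` and `waitAtLeast` are hypothetical helpers, each batch is reduced to a count of delete ranges, and running out of batches stands in for the timeout. The relaxed `>=` check is one possible fix; the description above does not say which condition the PR settled on.

```go
package main

import "fmt"

// waitExact consumes batches until the running total equals target exactly.
// It returns false if the batches run out first (standing in for a timeout).
func waitExact(batches []int, target int) bool {
	total := 0
	for _, n := range batches {
		total += n
		if total == target {
			return true
		}
	}
	return false
}

// waitAtLeast stops as soon as the running total reaches or passes target.
func waitAtLeast(batches []int, target int) bool {
	total := 0
	for _, n := range batches {
		total += n
		if total >= target {
			return true
		}
	}
	return false
}

func main() {
	// Four delete ranges arrive, the last two in a single batch,
	// so the running total goes 1, 2, 4 and never equals 3.
	batches := []int{1, 1, 2}
	fmt.Println(waitExact(batches, 3))   // false
	fmt.Println(waitAtLeast(batches, 3)) // true
}
```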

Fixes #93568

Epic: none

Release note: None

106814: testutils: add helper to target transactions for retries r=lidorcarmel a=stevendanna

This helper makes it a little quicker to write a test that tests whether a particular transaction is retry safe.

Informs #106417

Epic: none

Release note: none

106822: spanconfigccl: remove uses of `TODOTestTenantDisabled` r=stevendanna a=knz

Informs #76378.
Epic: CRDB-18499

There's a mix of tests that control their tenants directly, and tests that should really work with virtualization enabled but don't.

Followup issues: #106821 and #106818.

Release note: None

106832: server: bark loudly if the test tenant cannot be created r=herkolategan a=knz

Informs #76378 
Informs #103772. 
Epic: CRDB-18499

For context, the automatic test tenant machinery is currently dependent on a CCL enterprise license check.
(This is fundamentally not necessary - see #103772 - but sadly this is the way it is for now)

Prior to this patch, if the user or a test selected the creation of a test tenant, but the test code forgot to import the required CCL go package, the framework would announce that "a test tenant was created" but it was actually silently failing to do so.

This led to confusing investigations where a test tenant was expected, a test was appearing to succeed, but with a release build the same condition would fail.

This commit improves the situation by ensuring we have clear logging output when the test tenant cannot be created due to the missing CCL import.

Release note: None

Co-authored-by: Xiang Gu
Co-authored-by: Steven Danna
Co-authored-by: Raphael 'kena' Poss
msbutler pushed a commit to msbutler/cockroach that referenced this issue Jul 17, 2023
There's a mix of tests that control their tenants directly,
and tests that should really work with virtualization
enabled but don't.

Followup issues: cockroachdb#106821 and cockroachdb#106818.

Release note: None
@arulajmani
Collaborator

Given this is in the SQLWatcher, which is owned by SQL foundations, I'm going to move it to their triage queue and un-assign KV.

@blathers-crl blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Jul 20, 2023
@arulajmani arulajmani removed the A-kv Anything in KV that doesn't belong in a more specific category. label Jul 20, 2023
@rafiss rafiss removed the T-kv KV Team label Jul 25, 2023
@rafiss rafiss self-assigned this Jul 25, 2023
@craig craig bot closed this as completed in 167da65 Jul 28, 2023