sql: TestCCLLogic/fakedist-metadata/partitioning_enum is flaky #75227

Closed
mgartner opened this issue Jan 20, 2022 · 2 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. skipped-test T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@mgartner
Collaborator

This test has been flaking in CI and nightlies since #68395 (comment).

$ make stress PKG='./pkg/ccl/logictestccl' TESTS='TestCCLLogic/fakedist-metadata/partitioning_enum' TESTFLAGS='' TESTTIMEOUT=30s

0 runs so far, 0 failures, over 5s
22 runs so far, 0 failures, over 10s

I220120 17:17:30.587039 1 (gostd) rand.go:147  [-] 1  random seed: -3836311176851664413
=== RUN   TestCCLLogic
    test_log_scope.go:79: test logs captured to: /var/folders/3j/7c3w0jjd14g4n19lvvrxtm2r0000gp/T/logTestCCLLogic2338896894
    test_log_scope.go:80: use -show-logs to present logs inline
=== RUN   TestCCLLogic/fakedist-metadata
=== RUN   TestCCLLogic/fakedist-metadata/partitioning_enum
=== PAUSE TestCCLLogic/fakedist-metadata/partitioning_enum
=== CONT  TestCCLLogic/fakedist-metadata/partitioning_enum
=== RUN   TestCCLLogic/fakedist-metadata/partitioning_enum/drop_enum_partitioning_value
    logic.go:1985:

        /Users/marcus/go/src/github.com/cockroachdb/cockroach/pkg/ccl/logictestccl/testdata/logic_test/partitioning_enum:92:
        expected "could not remove enum value \"a\" as it is being used in the partitioning of index tbl@idx", but no error occurred
--- done: /Users/marcus/go/src/github.com/cockroachdb/cockroach/pkg/ccl/logictestccl/testdata/logic_test/partitioning_enum with config fakedist-metadata: 15 tests, 1 failures
--- total progress: 15 statements/queries
--- total: 15 tests, 1 failures
=== CONT  TestCCLLogic
    logic.go:3604: -- test log scope end --
test logs left over in: /var/folders/3j/7c3w0jjd14g4n19lvvrxtm2r0000gp/T/logTestCCLLogic2338896894
--- FAIL: TestCCLLogic (2.94s)
    --- FAIL: TestCCLLogic/fakedist-metadata (0.00s)
        --- FAIL: TestCCLLogic/fakedist-metadata/partitioning_enum (2.84s)
            --- FAIL: TestCCLLogic/fakedist-metadata/partitioning_enum/drop_enum_partitioning_value (1.07s)
FAIL

ERROR: exit status 1

40 runs completed, 1 failures, over 14s
context canceled
FAIL
FAIL    github.com/cockroachdb/cockroach/pkg/ccl/logictestccl   13.886s
FAIL
gmake: *** [Makefile:1087: stress] Error 1
@mgartner mgartner added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. skipped-test labels Jan 20, 2022
@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Jan 20, 2022
@mgartner mgartner changed the title sql: TestCCLLogic/fakedist-metadata/partitioning_enum is flakey sql: TestCCLLogic/fakedist-metadata/partitioning_enum is flaky Jan 20, 2022
mgartner added a commit to mgartner/cockroach that referenced this issue Jan 20, 2022
craig bot pushed a commit that referenced this issue Jan 21, 2022
74318: tracing: add /debug/tracez rendering the active spans  r=andreimatei a=andreimatei

`/debug/tracez` lets users take a snapshot of the active spans registry
and render the new snapshot, or one of the previously taken snapshots.
The Tracer can hold up to 10 snapshots in memory.

It looks like this:
![Screenshot from 2022-01-04 19-03-39](https://user-images.githubusercontent.com/377201/148140272-306658d5-5b9c-4f2a-b59c-28df9c5ed10c.png)


When visualizing a snapshot, the page lets you do a number of things:
1. List all the spans.
2. See the (current) stack trace for each span's goroutine (if the
   goroutine was still running at the time when the snapshot was
   captured). Stack traces can be toggled visible/hidden.
3. Sort the spans by name or start time.
4. Filter the span according to text search. The search works across
   the name and stack trace.
5. Go from a span to the full trace containing that span.   

For the JavaScript that provides table sorting and filtering, this patch
embeds the library from https://listjs.com/.

Limitations:
- for now, only the registry of the local node is snapshotted. In the
  future I'll collect info from all nodes.
- for now, the relationships between different spans are not represented
  in any way. I'll work on the ability to go from a span to the whole
  trace that the span is part of.
- for now, tags and structured and unstructured log messages that a span
  might have are not displayed in any way.

At the moment, span creation is not enabled in production by default
(i.e. the Tracer is put in `TracingModeOnDemand` by default, instead of
the required `TracingModeActiveSpansRegistry`). This patch does not change
that, so in order to benefit from /debug/tracez in all its glory, one
has to run with `COCKROACH_REAL_SPANS=1` for now. Not for long, though.

Release note: None

74867: sql: Support CREATE DATABASE WITH OWNER r=Fenil-P a=Fenil-P

fixes #67817

Release note (sql change): Allow users to specify the owner when creating a database.
Similar to PostgreSQL: CREATE DATABASE name [ [ WITH ] [ OWNER [=] user_name ] ]



74871: sql: add a tracing tag with the txn ID r=andreimatei a=andreimatei

This patch adds the txn's ID as a tag to the tracing span representing a
SQL txn. I'm creating a UI to explore the current spans, and this ID
will make it easy to navigate between a query/request blocking on a lock
held by some other txn, and the activity of that other txn.

Release note: None

75114: sql: directly specify columns in TableReader r=RaduBerinde a=RaduBerinde

~Note: the first commit is #74922.~

The internal columns of the TableReader (as well as the row fetcher)
are all the columns of the table, with only a subset of values
actually produced. This design decision has been carried over way past
the point where it makes sense (I admit, it's questionable whether it
ever made sense). For one, "all the columns" is ambiguous (does it
contain non-public columns? does it include system columns?) leading
to various flags and inherent fragility. Second, it relies on the
execution engine to figure out (based on the PostProcessSpec) which
columns are actually needed, which the optimizer already figures out
for us now.

This commit changes the TableReader spec and the interface of
row.Fetcher to always produce a given specific set of column IDs. The
diagram for table readers now specifies the columns by name.

The JoinReader, InvertedJoiner, ZigzagJoiner are not changed in this
commit (but they should be cleaned up as well).

Release note: None


75175: colfetcher: fix the bytes read statistic collection r=yuzefovich a=yuzefovich

During the 21.2 release we adjusted the `cFetcher` to be `Close`d eagerly
when it is returning the zero-length batch. This was done in order to
release some references in order for the memory to be GCed sooner;
additionally, the `cFetcher` started being used for the index join where
the fetcher is restarted from scratch for every batch of spans, so it
seemed reasonable to close it automatically.

However, that eager closure broke "bytes read" statistic collection
since the `row.KVFetcher` was responsible for providing it, and we were
zeroing it out. This commit fixes the problem by having the `cFetcher`
remember, in `Close`, the number of bytes it has read. Some care needs
to be taken to not double-count the bytes read in the index join, so
a couple of helper methods have been introduced.
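
The general idea can be sketched as follows. This is a simplified illustration and not the actual `cFetcher` code: `kvSource`, `fetcher`, `restart`, `read`, and `BytesRead` are hypothetical names, and the real helper methods may look quite different. The fetcher adds the source's counter to its own total exactly once, at `Close`, and then drops the reference so the memory can be reclaimed.

```go
package main

import "fmt"

// kvSource is a hypothetical stand-in for the lower-level KV fetcher
// that tracks how many bytes it has read.
type kvSource struct{ bytesRead int64 }

// fetcher is a hypothetical stand-in for a fetcher that is closed
// eagerly and may be restarted for every batch of spans.
type fetcher struct {
	src              *kvSource
	bytesReadAtClose int64 // bytes read by sources that are already closed
}

// restart points the fetcher at a fresh source, as an index join would
// for each new batch of spans.
func (f *fetcher) restart(src *kvSource) {
	f.Close()
	f.src = src
}

// read simulates reading n bytes from the current source.
func (f *fetcher) read(n int64) { f.src.bytesRead += n }

// Close remembers the bytes read so far and releases the source.
// Calling it again is a no-op, so nothing is counted twice.
func (f *fetcher) Close() {
	if f.src == nil {
		return
	}
	f.bytesReadAtClose += f.src.bytesRead
	f.src = nil
}

// BytesRead reports the total across closed sources plus the live one.
func (f *fetcher) BytesRead() int64 {
	if f.src != nil {
		return f.bytesReadAtClose + f.src.bytesRead
	}
	return f.bytesReadAtClose
}

func main() {
	f := &fetcher{}
	f.restart(&kvSource{})
	f.read(100)
	f.Close()
	f.Close() // second Close must not double-count
	f.restart(&kvSource{})
	f.read(50)
	f.Close()
	fmt.Println(f.BytesRead())
}
```

Folding the counter into the total only when the source reference is released is what keeps repeated `Close` calls and per-batch restarts from inflating the statistic in this sketch.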

Additionally, this commit applies the same eager-close optimization to
the `cFetcher` when the last batch is returned (which makes it so that
if we've just exhausted all KVs, we close the fetcher - previously, we
would set the zero length on the batch and might never get into
`stateFinished`).

Fixes: #75128.

Release note (bug fix): Previously, CockroachDB could incorrectly report
the `KV bytes read` statistic in `EXPLAIN ANALYZE` output. The bug is
present only in 21.2.x versions.

75215: cmd/github-post: fix Pebble metamorphic reproduction command r=jbowens a=jbowens

When posting a GitHub issue for a Pebble metamorphic test failure, include the
correct `-ops` flag.

Discovered because cockroachdb/pebble#1459 contained a
reproduction command that contained too few ops to reproduce the issue.

Release note: none

75228: logictestccl: skip flaky TestCCLLogic/fakedist-metadata/partitioning_enum r=mgartner a=mgartner

Informs #75227

Release note: None

75237: cli,rpc: don't check the active cluster version in the CLI r=andreimatei a=knz

This commit removes a code path that would tickle an assertion failure
if we were to later fix the context propagation in the RPC heartbeat
method (see PR #71243): there's no "active cluster version" in the CLI
and so we can't compare it in a client interceptor.

Release note: None

75254: scripts: add `dev generate --mirror` to `bump-pebble.sh` script r=jbowens a=nicktrav

CI now expects that dependencies are mirrored to cloud storage and will
fail if the TODO for mirroring the repo is left unaddressed in the
`DEPS.bzl` file.

Add a mirroring step to the `bump-pebble.sh` script.

Release note: none

Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Fenil Patel <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Nick Travers <[email protected]>
gtr pushed a commit to gtr/cockroach that referenced this issue Jan 24, 2022
@irfansharif
Contributor

Should this be closed after #75300? +cc @ajwerner.

@ajwerner
Contributor

Yes

@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023