sql: clean up mutable not-null columns hack #74922

Merged (1 commit) on Jan 19, 2022

@RaduBerinde RaduBerinde commented Jan 18, 2022

Mutation columns in some cases need to be scanned even if they haven't
been backfilled yet, which means that we may retrieve NULL values even
if they are marked as not-nullable.

We currently have a hack in the table descriptor which changes the
nullable flags in the column descriptors when `ReadableColumns()` is
used. It is very surprising that we can get different descriptors for
a given ColumnID depending on whether we look for it in `ReadableColumns()`
or in `AllColumns()` (e.g. via `FindColumnWithID`).

This commit cleans this up, changing the scanning code to check for
Public() instead.

Release note: None
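
To illustrate the idea in the description above, here is a minimal sketch of a scan-side check that consults the column's public state instead of relying on a rewritten nullable flag. The `column` struct and the `mayBeNull` and `checkValue` helpers are hypothetical stand-ins invented for this example; they are not the actual `catalog.Column` interface or the code changed in this PR.

```go
package main

import "fmt"

// column is a hypothetical stand-in for a catalog column. It is not the
// real catalog.Column interface.
type column struct {
	name     string
	nullable bool
	public   bool // false while the column is still a mutation (e.g. not yet backfilled)
}

// mayBeNull reports whether a scan has to tolerate a NULL value for c.
// A non-public column may not have been backfilled yet, so NULL is possible
// even when the descriptor marks the column as NOT NULL.
func mayBeNull(c column) bool {
	return c.nullable || !c.public
}

// checkValue is a hypothetical validation step applied to a scanned value.
func checkValue(c column, isNull bool) error {
	if isNull && !mayBeNull(c) {
		return fmt.Errorf("unexpected NULL in non-nullable column %q", c.name)
	}
	return nil
}

func main() {
	backfilling := column{name: "b", nullable: false, public: false}
	public := column{name: "a", nullable: false, public: true}
	fmt.Println(checkValue(backfilling, true))
	fmt.Println(checkValue(public, true))
}
```

With this shape, the descriptor for a given column stays the same no matter which accessor returned it, and only the scanning code decides whether a NULL is acceptable.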

@RaduBerinde RaduBerinde requested review from postamar, yuzefovich and a team January 18, 2022 02:16
@cockroach-teamcity

This change is Reviewable

@yuzefovich yuzefovich left a comment

LGTM, but I'll defer to @postamar for approval.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar and @RaduBerinde)


-- commits, line 6 at r1:
nit: s/market/marked/.


-- commits, line 12 at r1:
nit: missing closing parenthesis.


pkg/sql/catalog/tabledesc/column.go, line 318 at r1 (raw file):

		c.nonDrop = c.public
	} else {
		//readableDescs := make([]descpb.ColumnDescriptor, 0, numMutations)

Should this be now removed?


@RaduBerinde RaduBerinde left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar and @yuzefovich)


pkg/sql/catalog/tabledesc/column.go, line 318 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Should this be now removed?

Oops, done.


@fqazi fqazi left a comment

LGTM. Much cleaner than the original hacky code.

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @yuzefovich)


@fqazi fqazi left a comment

:lgtm:

Much cleaner than the original hacky code.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @yuzefovich)

@RaduBerinde

TFTR!

bors r+

@craig

craig bot commented Jan 19, 2022

Build succeeded:

@craig craig bot merged commit eb26eb3 into cockroachdb:master Jan 19, 2022
@RaduBerinde RaduBerinde deleted the remove-readable-hack branch January 19, 2022 20:48
@postamar

Thanks for doing this!

craig bot pushed a commit that referenced this pull request Jan 21, 2022
74318: tracing: add /debug/tracez rendering the active spans  r=andreimatei a=andreimatei

`/debug/tracez` lets users take a snapshot of the active spans registry
and render the new snapshot, or one of the previously taken snapshots.
The Tracer can hold up to 10 snapshots in memory.

It looks like this:
![Screenshot from 2022-01-04 19-03-39](https://user-images.githubusercontent.com/377201/148140272-306658d5-5b9c-4f2a-b59c-28df9c5ed10c.png)
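
The bounded snapshot storage mentioned above (up to 10 snapshots in memory) can be pictured with the following rough sketch. The `snapshotStore` type, its methods, and the choice to evict the oldest snapshot when the limit is reached are all assumptions made for illustration; the description does not say how the Tracer handles the limit, and this is not the Tracer's actual code.

```go
package main

import "fmt"

// maxSnapshots mirrors the limit of 10 stated in the description.
const maxSnapshots = 10

// snapshot is a hypothetical placeholder for a captured set of active spans.
type snapshot struct {
	id    int
	spans []string
}

// snapshotStore keeps at most maxSnapshots snapshots, oldest first.
// Evicting the oldest entry is an assumption of this sketch.
type snapshotStore struct {
	nextID int
	snaps  []snapshot
}

// add records a new snapshot and returns its ID, evicting the oldest
// snapshot first if the store is already full.
func (s *snapshotStore) add(spans []string) int {
	if len(s.snaps) == maxSnapshots {
		copy(s.snaps, s.snaps[1:])
		s.snaps = s.snaps[:maxSnapshots-1]
	}
	s.nextID++
	s.snaps = append(s.snaps, snapshot{id: s.nextID, spans: spans})
	return s.nextID
}

// get returns the snapshot with the given ID, if it is still in memory.
func (s *snapshotStore) get(id int) (snapshot, bool) {
	for _, sn := range s.snaps {
		if sn.id == id {
			return sn, true
		}
	}
	return snapshot{}, false
}

func main() {
	var store snapshotStore
	for i := 0; i < 12; i++ {
		store.add([]string{fmt.Sprintf("span-%d", i)})
	}
	_, ok := store.get(1)
	fmt.Println(len(store.snaps), ok)
}
```

Rendering "one of the previously taken snapshots" then amounts to a lookup by ID that may fail once the snapshot has been evicted.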


When visualizing a snapshot, the page lets you do a number of things:
1. List all the spans.
2. See the (current) stack trace for each span's goroutine (if the
   goroutine was still running at the time when the snapshot was
   captured). Stack traces can be toggled visible/hidden.
3. Sort the spans by name or start time.
4. Filter the span according to text search. The search works across
   the name and stack trace.
5. Go from a span to the full trace containing that span.   

For the JavaScript that provides table sorting and filtering, this patch
embeds the library from https://listjs.com/ .

Limitations:
- for now, only the registry of the local node is snapshotted. In the
  future I'll collect info from all nodes.
- for now, the relationships between different spans are not represented
  in any way. I'll work on the ability to go from a span to the whole
  trace that the span is part of.
- for now, tags and structured and unstructured log messages that a span
  might have are not displayed in any way.

At the moment, span creation is not enabled in production by default
(i.e. the Tracer is put in `TracingModeOnDemand` by default, instead of
the required `TracingModeActiveSpansRegistry`). This patch does not change
that, so in order to benefit from /debug/tracez in all its glory, one
has to run with `COCKROACH_REAL_SPANS=1` for now. Not for long, though.

Release note: None

74867: sql: Support CREATE DATABASE WITH OWNER r=Fenil-P a=Fenil-P

fixes #67817

Release note (sql change): Allow users to specify the owner when creating a database.
Similar to PostgreSQL: `CREATE DATABASE name [ [ WITH ] [ OWNER [=] user_name ] ]`



74871: sql: add a tracing tag with the txn ID r=andreimatei a=andreimatei

This patch adds the txn's ID as a tag to the tracing span representing a
SQL txn. I'm creating a UI to explore the current spans, and this ID
will make it easy to navigate between a query/request blocking on a lock
held by some other txn, and the activity of that other txn.

Release note: None

75114: sql: directly specify columns in TableReader r=RaduBerinde a=RaduBerinde

~Note: the first commit is #74922.~

The internal columns of the TableReader (as well as the row fetcher)
are all the columns of the table, with only a subset of values
actually produced. This design decision has been carried over way past
the point where it makes sense (I admit, it's questionable whether it
ever made sense). For one, "all the columns" is ambiguous (does it
contain non-public columns? does it include system columns?) leading
to various flags and inherent fragility. Second, it relies on the
execution engine to figure out (based on the PostProcessSpec) which
columns are actually needed, which the optimizer already figures out
for us now.

This commit changes the TableReader spec and the interface of
row.Fetcher to always produce a given specific set of column IDs. The
diagram for table readers now specifies the columns by name.

The JoinReader, InvertedJoiner, ZigzagJoiner are not changed in this
commit (but they should be cleaned up as well).

Release note: None


75175: colfetcher: fix the bytes read statistic collection r=yuzefovich a=yuzefovich

During the 21.2 release we changed the `cFetcher` to be `Close`d eagerly
when it returns the zero-length batch. This releases some references so
that the memory can be GCed sooner; additionally, the `cFetcher` started
being used for the index join, where the fetcher is restarted from
scratch for every batch of spans, so it seemed reasonable to close it
automatically.

However, that eager closure broke the "bytes read" statistic collection:
the `row.KVFetcher` was responsible for providing it, and we were
zeroing it out. This commit fixes the problem by having the `cFetcher`
remember, in `Close`, the number of bytes it has read. Some care needs
to be taken to not double-count the bytes read in the index join, so
a couple of helper methods have been introduced.
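
As a rough sketch of the fix described above, a fetcher can fold the underlying counter into its own field before dropping the reference in `Close`, and report the sum without counting a live fetcher twice. The `kvFetcher` and `fetcher` types and their methods below are hypothetical simplifications for illustration only; they are not the actual `cFetcher` or `row.KVFetcher` code, and the real helper methods may look different.

```go
package main

import "fmt"

// kvFetcher is a hypothetical stand-in for the underlying KV fetcher that
// tracks how many bytes it has read.
type kvFetcher struct {
	bytesRead int64
}

// fetcher is a hypothetical stand-in for a fetcher that is closed eagerly
// and may be restarted for every batch of spans (as in an index join).
type fetcher struct {
	kv        *kvFetcher // nil once closed
	bytesRead int64      // bytes remembered from already-closed kv fetchers
}

// start begins a new scan with a fresh underlying fetcher.
func (f *fetcher) start() { f.kv = &kvFetcher{} }

// read simulates reading n bytes from KV.
func (f *fetcher) read(n int64) { f.kv.bytesRead += n }

// Close remembers the bytes read so far and releases the underlying
// fetcher. Calling it again is a no-op, so nothing is counted twice.
func (f *fetcher) Close() {
	if f.kv != nil {
		f.bytesRead += f.kv.bytesRead
		f.kv = nil
	}
}

// getBytesRead returns the total across closed and live fetchers.
func (f *fetcher) getBytesRead() int64 {
	total := f.bytesRead
	if f.kv != nil {
		total += f.kv.bytesRead
	}
	return total
}

func main() {
	f := &fetcher{}
	f.start()
	f.read(100)
	f.Close() // eager close after the last batch
	f.start() // restarted for the next batch of spans
	f.read(50)
	fmt.Println(f.getBytesRead())
	f.Close()
	fmt.Println(f.getBytesRead())
}
```

The statistic survives the eager close because it is copied before the reference is released, and the nil check keeps a repeated `Close` from adding the same bytes again.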

Additionally this commit applies the same eager-close optimization to
the `cFetcher` when the last batch is returned (which makes it so that
if we've just exhausted all KVs, we close the fetcher - previously, we
would set the zero length on the batch and might never get into
`stateFinished`).

Fixes: #75128.

Release note (bug fix): Previously, CockroachDB could incorrectly report
the `KV bytes read` statistic in `EXPLAIN ANALYZE` output. The bug is
present only in 21.2.x versions.

75215: cmd/github-post: fix Pebble metamorphic reproduction command r=jbowens a=jbowens

When posting a github issue for a Pebble metamorphic test failure, include the
correct `-ops` flag.

Discovered because cockroachdb/pebble#1459 contained a
reproduction command that contained too few ops to reproduce the issue.

Release note: none

75228: logictestccl: skip flaky TestCCLLogic/fakedist-metadata/partitioning_enum r=mgartner a=mgartner

Informs #75227

Release note: None

75237: cli,rpc: don't check the active cluster version in the CLI r=andreimatei a=knz

This commit removes a code path that would tickle an assertion failure
if we were to later fix the context propagation in the RPC heartbeat
method (see PR #71243): there's no "active cluster version" in the CLI
and so we can't compare it in a client interceptor.

Release note: None

75254: scripts: add `dev generate --mirror` to `bump-pebble.sh` script r=jbowens a=nicktrav

CI now expects that dependencies are mirrored to cloud storage and will
fail if the TODO for mirroring the repo is left unaddressed in the
`DEPS.bzl` file.

Add a mirroring step to the `bump-pebble.sh` script.

Release note: none

Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Fenil Patel <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Nick Travers <[email protected]>