Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: expose always-on tracing inflight registry using vtable and builtins #55733

Closed
tbg opened this issue Oct 20, 2020 · 6 comments
Closed

sql: expose always-on tracing inflight registry using vtable and builtins #55733

tbg opened this issue Oct 20, 2020 · 6 comments
Assignees
Labels
A-kv-observability A-sql-observability Related to observability of the SQL layer A-tracing Relating to tracing in CockroachDB. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@tbg
Copy link
Member

tbg commented Oct 20, 2020

Is your feature request related to a problem? Please describe.
As part of always-on tracing, a node-local registry of inflight operations will be introduced (#55592). This registry would be most useful if it could be queried from SQL.

This would improve our troubleshooting story as we could filter on long-running KV trace spans in that table to easily detect hanging operations (unavailable range, heavy contention) in a self-service fashion.

Describe the solution you'd like
Introduce a virtual table crdb_internal.tracing_inflight (big question mark on the name) that pulls the node-local registry from all nodes in the cluster.

Add a view on that table that also translate (where possible) KV txn id -> SQL txn ID to allow filtering efficiently.

Describe alternatives you've considered

Additional context

@tbg tbg added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-observability A-sql-observability Related to observability of the SQL layer labels Oct 20, 2020
@tbg
Copy link
Member Author

tbg commented Jan 7, 2021

For starters, we should probably align closely with the schema of crdb_internal.session_tracing:

var crdbInternalSessionTraceTable = virtualSchemaTable{
comment: `session trace accumulated so far (RAM)`,
schema: `
CREATE TABLE crdb_internal.session_trace (
span_idx INT NOT NULL, -- The span's index.
message_idx INT NOT NULL, -- The message's index within its span.
timestamp TIMESTAMPTZ NOT NULL,-- The message's timestamp.
duration INTERVAL, -- The span's duration. Set only on the first
-- (dummy) message on a span.
-- NULL if the span was not finished at the time
-- the trace has been collected.
operation STRING NULL, -- The span's operation.
loc STRING NOT NULL, -- The file name / line number prefix, if any.
tag STRING NOT NULL, -- The logging tag, if any.
message STRING NOT NULL, -- The logged message.
age INTERVAL NOT NULL -- The age of this message relative to the beginning of the trace.
)`,
populate: func(ctx context.Context, p *planner, _ *dbdesc.Immutable, addRow func(...tree.Datum) error) error {
rows, err := p.ExtendedEvalContext().Tracing.getSessionTrace()
if err != nil {
return err
}
for _, r := range rows {
if err := addRow(r[:]...); err != nil {
return err
}
}
return nil
},
}

However, since all spans are unfinished, we will want the duration column to be populated with time.Since(spanStart). This might mean that we need to change (Span).GetRecording(); I believe at the moment duration is returned as -1 for unfinished Spans.
We want to do this since a common expected usage pattern is to check for long-running spans via a filter on duration > 1h or something like that.

@tbg
Copy link
Member Author

tbg commented Jan 20, 2021

Discussed in pod: we'll go with a vtable that lists only the spans + some of their metadata (not their recordings) that in particular allows identifying a span as long-running, and then add a builtin that renders a session tracing like table for a particular traceID.

tbg added a commit to tbg/cockroach that referenced this issue Jan 22, 2021
Once we have cockroachdb#55733 and cockroachdb#59200, it will be very useful to be able to
join together, on each node, the long-running trace *stacks*.

To this end, a `crdb_internal.node_stacks` table is added.

Release note: None
@tbg
Copy link
Member Author

tbg commented Jan 25, 2021

#59370 shows that there is value in adding a check on this vtable to roachtest. We'd fail the test if the vtable shows any long-running local root spans that have children attached to them.

craig bot pushed a commit that referenced this issue Feb 2, 2021
59492: sql,util: add vtable for node-local inflight span registry r=knz a=angelapwen

This change adds a virtual table, `crdb_internal.node_inflight_trace_spans`
that exposes data on currently inflight spans in the current node.
The table includes information on the span IDs, parent span IDs, trace IDs,
start time, duration (measured by time of collection - start time of the
span), and the span's operation.

Part 1 of addressing #55733. Will follow up with a built-in that will
allow querying of a particular span ID to surface more detailed
recording information.


Release note (sql change): Adds a virtual table,
`crdb_internal.node_inflight_trace_spans`, which exposes span ID, parent span
ID, trace ID, start time, duration, and operation of node-local inflight spans.

Co-authored-by: angelapwen <[email protected]>
@irfansharif
Copy link
Contributor

For posterity, what's remaining here are built-ins that:

  1. get the traceid/spanid of the currently open txn (if any)
  2. renders the payloads (just payloads) given a traceid/spanid

@irfansharif irfansharif changed the title sql: expose always-on tracing inflight registry as vtable sql: expose always-on tracing inflight registry using vtable and builtins Feb 3, 2021
tbg added a commit to tbg/cockroach that referenced this issue Feb 15, 2021
1. Require admin role for `crdb_internal.node_inflight_trace_spans`
2. Add num_structured column to that table (this can be seen as
   temporary for testing; we can remove it once cockroachdb#55733 is fully
   addressed and payloads can be introspected separately).
3. Add workaround for failure to finish sql txn spans (cockroachdb#59315)
   for use in logic tests, which together with 2.) and 4.) enables
   a better `contention_event` test.
4. Add `crdb_internal.trace_id` builtin.

Release note: None
tbg added a commit to tbg/cockroach that referenced this issue Feb 17, 2021
1. Require admin role for `crdb_internal.node_inflight_trace_spans`
2. Add num_structured column to that table (this can be seen as
   temporary for testing; we can remove it once cockroachdb#55733 is fully
   addressed and payloads can be introspected separately).
3. Add workaround for failure to finish sql txn spans (cockroachdb#59315)
   for use in logic tests, which together with 2.) and 4.) enables
   a better `contention_event` test.
4. Add `crdb_internal.trace_id` builtin.

Release note: None
angelapwen pushed a commit to angelapwen/cockroach that referenced this issue Feb 17, 2021
1. Require admin role for `crdb_internal.node_inflight_trace_spans`
2. Add num_structured column to that table (this can be seen as
   temporary for testing; we can remove it once cockroachdb#55733 is fully
   addressed and payloads can be introspected separately).
3. Add workaround for failure to finish sql txn spans (cockroachdb#59315)
   for use in logic tests, which together with 2.) and 4.) enables
   a better `contention_event` test.
4. Add `crdb_internal.trace_id` builtin.

Release note: None
tbg added a commit to tbg/cockroach that referenced this issue Feb 17, 2021
1. Require admin role for `crdb_internal.node_inflight_trace_spans`
2. Add num_structured column to that table (this can be seen as
   temporary for testing; we can remove it once cockroachdb#55733 is fully
   addressed and payloads can be introspected separately).
3. Add workaround for failure to finish sql txn spans (cockroachdb#59315)
   for use in logic tests, which together with 2.) and 4.) enables
   a better `contention_event` test.
4. Add `crdb_internal.trace_id` builtin.

Release note: None
craig bot pushed a commit that referenced this issue Feb 18, 2021
59490: add licensing info to README r=bdarnell a=gemma-shay

Added a section on licensing info.
Links to the new Licensing doc (#9338) will not work yet, so will wait to merge until they are up.
Closes cockroachdb/docs#7441

60513: sql: ensure REGIONAL BY ROW statements roundtrip r=ajstorm a=otan

Resolves #59362

See individual commits for details.

60552: sql: add descriptor validation on write r=postamar a=postamar

Previously, we didn't systematically validate descriptors when they were
written. Furthermore, there existed no common method to validate
descriptors across all descriptor subtypes

This commit adds three methods to the catalog.Descriptor interface:
  1. ValidateSelf ( context.Context ) error
  2. Validate ( context.Context, catalog.DescGetter ) error
  3. ValidateTxnCommit ( context.Context, catalog.DescGetter) error

Each performs a subset the checks performed by the next. ValidateSelf
contains all checks which can be performed in isolation, Validate also
performs all those involving DescGetters (i.e. cross-reference checks)
and ValidateTxnCommit also performs checks which should only be done at
commit-time. An example of the latter is checking that a table has
a primary key: dropping the PK is allowed within a transaction as long
as a new PK is subsequently provided before committing.

This commit adds new validation calls when writing descriptors:
  1. ValidateSelf is called prior to Collection adding a descriptor Put
     to a kv.Batch. At this point, we want descriptors to be at least
     internally-consistent, both to catch validation errors early and
     because it's not possible to do cross-reference checking at this
     point (changes on FKs for instance involve multiple descriptors).
  2. ValidateTxnCommit is called on the descs.Collection's uncommitted
     descriptors when the corresponding txn is about to commit, just
     prior to the two-version-invariant check.

These validations may be disabled using a new cluster setting:
    sql.catalog.descs.validate_on_write.enabled
Setting this to false makes it possible to corrupt the descriptor state
using the crdb_internal.unsafe_* functions.

Release note: None

60616: builtins: add builtin to retrieve the payload(s) for a span. r=knz,irfansharif a=angelapwen

Part of addressing #55733 

The `payloads_for_span` builtin retrieves all payloads for
a given span ID, given that the span is part of an active trace.
The payloads are returned in JSONB format. If the span is not
found, or if the span does not have any payloads, the builtin
returns an empty JSON object.

With the appropriate usage of this builtin and the
`crdb_internal.trace_id` builtin as shown in the `contention_event`
logic test, all payloads for the current trace may be surfaced.

Release note (sql change): add `payloads_for_span` builtin that
takes in a span ID and returns its paylods in JSONB format. If
the span is not found, or if the span does not have any payloads,
the builtin returns an empty JSON object.

60692: sql: add tests for inverted indexes on virtual columns r=mgartner a=mgartner

No code changes were needed to support inverted indexes on virtual
columns.

Release note: None

Co-authored-by: Gemma Shay <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
Co-authored-by: Marius Posta <[email protected]>
Co-authored-by: angelapwen <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
@angelapwen
Copy link

angelapwen commented Feb 22, 2021

Status update here:

@tbg added a builtin that retrieves the current trace ID as part of #59992.
We are holding off on adding a builtin that retrieves the current span ID until we see a use case for it.

I added a builtin that, given a span ID, displays a table representing all its payloads with columns for payload type and entire payload as JSONB #60784

What's left is adding functionality (via a SQL query delegate mechanism) to display a table representing all payloads for a given trace ID as that will be easier to use. Also, we want to improve the logic test for contention events

craig bot pushed a commit that referenced this issue Feb 22, 2021
60784: builtins: add generator builtin for span payloads r=irfansharif,knz a=angelapwen

Following up from #60616. Part of addressing #55733.

Previously we introduced a `crdb_internal.payloads_for_span`
builtin that returned a JSONB array representing all payloads for
a particular span. To improve the user experience of the builtin
as well as prevent OOMs resulting from builtin usage, we replace it
with a generator builtin that returns a table representing all payloads
for a span, where each row represents a single payload.

The new builtin, also called `crdb_internal.payloads_for_span`, has
columns for the `payload_type` so that the user has the ability to
easily filter on the type, and `payload_jsonb` so the user can use
jsonb builtins to filter further.

Release note (sql change): Update `crdb_internal.payloads_for_span`
builtin to return a table instead of a JSONB array. Each row of the
table represents one payload for the given span. It has columns for
`payload_type` and `payload_jsonb`.

Co-authored-by: angelapwen <[email protected]>
craig bot pushed a commit that referenced this issue Feb 25, 2021
57183: importccl,sql: support importing non-public schemas from pgdump r=adityamaru a=adityamaru

Previously, a PGDUMP import did not support any non-public schema
statements. Now that CRDB has user defined schemas, bundle format
IMPORTs need to be taught how to parse, create and cleanup schema
related PGDUMP operations.

Note this PR only adds support for `CREATE SCHEMA` and usage of the
schema in `CREATE TABLE/SEQUENCE` PGDUMP statements. `ALTER SCHEMA`
statements are ignored and support might be added in a follow up.

Release justification: low risk, high reward. Import PGDUMP does not support user
defined schemas.

Release note (sql change): IMPORT PGDUMP can now import dump files with
non-public schemas.

60922: sql,cli: add payloads_for_trace builtin r=knz a=angelapwen

Possibly the final step to #55733. Resolves #58608 😄 

Previously it was quite cumbersome to view all payloads for a given
trace: we needed to join on the `node_inflight_trace_spans` vtable
to filter for span IDs that match a trace ID, then apply the
`payloads_for_span()` builtin to each span ID. This patch adds
syntactic sugar to the above query.

Instead of

```
WITH spans AS (
  SELECT span_id
  FROM crdb_internal.node_inflight_trace_spans
  WHERE trace_id = $TRACE_ID)
) SELECT *
  FROM spans, LATERAL crdb_internal.payloads_for_span(spans.span_id);
```

we can now simply use:
```
crdb_internal.payloads_for_trace($TRACE_ID);
```

and achieve the same result. The patch also adds all payloads for all
long-running spans to the `crdb_internal.node_inflight_trace_spans`
table of the debug.zip file.

Release note (sql change): Add `payloads_for_trace()` builtin so that
all payloads attached to all spans for a given trace ID will be
displayed, utilizing the `crdb_internal.payloads_for_span()`
builtin under the hood. All payloads for long-running spans are also
added to debug.zip in the `crdb_internal.node_inflight_trace_spans`
table dump.

Co-authored-by: Tobias Grieger <[email protected]>

Release justification: This patch is safe for release because it
adds syntactic sugar to an internal observability feature.

61094: kvserver: re-add spuriously removed nil check in `relocateOne` r=aayushshah15 a=aayushshah15

bce8317 introduced a bug by spuriously
removing a nil check over the result of a call to
`allocateTargetFromList`. This commit re-adds the check.

The bug could cause a panic when `AdminRelocateRange` was called by the
`StoreRebalancer` or the `mergeQueue` if one (or more) of the stores
that are supposed to receive a replica for a range become unfit for
receiving the replica (due to balancing reasons / or shifting
constraints) _between when rebalancing decision is made and when it's
executed_.

Resolves #60812

Release justification: fixes bug that causes a panic
Release note: None

61097: opt: use computed columns to improve FDs and remove uniqueness checks r=rytaft a=rytaft

**opt: use computed columns to build functional dependencies**

This commit updates `MakeTableFuncDep` so that it adds equivalencies
or synthesized columns to the table FDs for each of the computed
columns available in the metadata. This will be necessary to support
removing uniqueness checks in some cases in a future commit.

Release justification: This commit is a low risk, high benefit change
to existing functionality.

Release note (performance improvement): The optimizer now infers
additional functional dependencies based on computed columns in tables.
This may enable additional optimizations and lead to better query plans.

**opt: remove uniqueness checks when uniqueness inferred through FDs**

This commit removes uniqueness checks for columns that can be
inferred to be unique through functional dependencies. This is
relevant in particular for `REGIONAL BY ROW` tables with a computed
region column that depends on the primary key. In this case,
uniqueness checks are never needed on the primary key, since
uniqueness is already guaranteed by the primary index.

Fixes #57720

Release justification: This commit is a low-risk, high benefit
update to new functionality.

Release note (performance improvement): Removed uniqueness checks
on the primary key for REGIONAL BY ROW tables with a computed
region column that is a function of the primary key columns.
Uniqueness checks are not necessary in this case since uniqueness
can be suitably guaranteed by the primary index. Removing these
checks improves performance of INSERT, UPDATE, and UPSERT
statements.

61100: diagnostics: lock while populating hardware information r=andy-kimball a=andy-kimball

The shirou/gopsutil/host library that we use to gather hardware information
during diagnostics reporting is not multi-thread safe. As one example, it
lazily initializes a global map the first time the Virtualization function
is called, but takes no lock while doing so. Work around this limitation by
taking our own lock.

This code never triggered race conditions before, but is doing so after recent
changes I made to the diagnostics reporting code. Previously, we were using a
single background goroutine to do both diagnostics reporting and checking for
updates. Now we are doing each of those on different goroutines, which triggers
race detection.

Fixes #61091

Release justification: fixes for high-priority or high-severity bugs in existing
functionality
Release note: None

61105: builtins: implement ST_MakePoint and ST_MakePointM r=otan,rafiss a=andyyang890

This patch implements the geometry builtins `ST_MakePoint`
and `ST_MakePointM`.

Resolves #60857.
Resolves #60858.
Resolves #60859.

Release justification: low-risk update to new functionality
Release note (sql change): The geometry builtins `ST_MakePoint`
and `ST_MakePointM` have been implemented and provide a mechanism
for easily creating new points.

Co-authored-by: Aditya Maru <[email protected]>
Co-authored-by: angelapwen <[email protected]>
Co-authored-by: Aayush Shah <[email protected]>
Co-authored-by: Rebecca Taft <[email protected]>
Co-authored-by: Andrew Kimball <[email protected]>
Co-authored-by: Andy Yang <[email protected]>
@angelapwen
Copy link

The builtin displaying all payloads for a given trace ID has been added (#60992) and logic test improved as part of the same PR. Closing out this issue for now 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-observability A-sql-observability Related to observability of the SQL layer A-tracing Relating to tracing in CockroachDB. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

No branches or pull requests

3 participants