Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracing: introduce a minimal tracing mode and enable in tests #73407

Merged
merged 3 commits into from
Dec 9, 2021

Conversation

andreimatei
Copy link
Contributor

@andreimatei andreimatei commented Dec 2, 2021

This patch makes the Tracer's behavior more configurable. Before, the
Tracer would generally not create spans when there was no explicit
reason to do so, unless a testing knob option was passed to the Tracer
creation (TracerTestingKnobs{ForceRealSpans: true}). This patch elevates
this configuration to a proper tracer option. The Tracer can be
configured to always create spans (useful for having all the in-flight
operations represented in the active spans registry), to not do that
(the old behavior), or to read the configuration from an env var. This
env var is useful while I test various things with the goal of
eventually enabling this tracing mode everywhere. For now the env var
defaults to tracing being off, except in CrdbTestBuild's (i.e. unit
tests), where it's on.

Some tests are changed to explicitly request tracing to be off, so that
they retain the old behavior, as they were relying on it. Among them, a
logic tests needs it, so a new cluster configuration option is added to
logic tests and test clusters.

After this patch, I think we're pretty close to enabling this new
tracing mode in production. Before that though, there's a couple more
performance optimizations to make, and more testing against
span-use-after-Finish.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@andreimatei andreimatei force-pushed the tracing.add-tracing-mode branch 2 times, most recently from 5e44f7b to 7a5262d Compare December 3, 2021 21:57
@andreimatei andreimatei marked this pull request as ready for review December 3, 2021 22:04
@andreimatei andreimatei requested review from a team as code owners December 3, 2021 22:04
@andreimatei andreimatei requested review from miretskiy and erikgrinaker and removed request for a team December 3, 2021 22:04
@andreimatei
Copy link
Contributor Author

Based on #73453 and #72993. Only the last 3 commits are to be reviewed here.
For commit jobs: remove picky test, I got a thumbs up from Aditya, so it can be ignored.

Copy link
Contributor

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels significant, in a good way, and I'm excited that we're heading in a direction where we think a tracing mode like this is possible from a performance perspective. I just had one question regarding a comment, but otherwise :lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)


pkg/util/tracing/tracer.go, line 160 at r12 (raw file):

	// ... If no
	// net.Trace/OpenTelemetry tracer is attached and a span is not explicitly
	// asked to record, spans do not record events.

This comment is slightly confusing to me. When we say "record events" here are we talking about log events (e.g. Eventf class of functions)?

@andreimatei andreimatei force-pushed the tracing.add-tracing-mode branch from 7a5262d to ffd489f Compare December 7, 2021 23:02
Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm excited that we're heading in a direction where we think a tracing mode like this is possible from a performance perspective

You and me both. I think we're almost there.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @abarganier)


pkg/util/tracing/tracer.go, line 160 at r12 (raw file):

Previously, abarganier (Alex Barganier) wrote…
	// ... If no
	// net.Trace/OpenTelemetry tracer is attached and a span is not explicitly
	// asked to record, spans do not record events.

This comment is slightly confusing to me. When we say "record events" here are we talking about log events (e.g. Eventf class of functions)?

Right. "recording" refers to spans accumulating log messages (log.Info/Eventf() calls which turn into span.Record()), as well as accumulating a history of their finished children.
I've improved the comment some; please see now.

Copy link
Contributor

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @abarganier, and @andreimatei)


pkg/sql/logictest/logic.go, line 170 at r25 (raw file):

//   enabled. This is equivalent to setting COCKROACH_EXPERIMENTAL_SPAN_CONFIGS.
// - tracing-off: If specified, tracing defaults to being turned off. This is
//   used to override the environment, which may ask for ttracing to be on by

ttracing

Copy link
Contributor

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @abarganier, and @andreimatei)


pkg/util/tracing/tracer.go, line 160 at r12 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Right. "recording" refers to spans accumulating log messages (log.Info/Eventf() calls which turn into span.Record()), as well as accumulating a history of their finished children.
I've improved the comment some; please see now.

Thanks for clarifying, lgtm!

This test checks that a job record is not updated with a trace id if
tracing the job is not enabled. This test fails if tracing is generally
enabled across the cluster. The code under test is this:
https://github.com/cockroachdb/cockroach/blob/96a125ade349979f7ff7a42e23f39aa544c5529f/pkg/jobs/adopt.go#L401-L408

And, I'm trying to make it such that tracing is always enabled across
the cluster. So, I'm deleting this test, and also morally accepting that
there will usually be an extra update to the job record. Hopefully,
always getting that trace id in the job is a good thing.

Release note: None
This patch makes the Tracer's behavior more configurable. Before, the
Tracer would generally not create spans when there was no explicit
reason to do so, unless a testing knob option was passed to the Tracer
creation (TracerTestingKnobs{ForceRealSpans: true}). This patch elevates
this configuration to a proper tracer option. The Tracer can be
configured to always create spans (useful for having all the in-flight
operations represented in the active spans registry), to not do that
(the old behavior), or to read the configuration from an env var. This
env var is useful while I test various things with the goal of
eventually enabling this tracing mode everywhere. For now the env var
defaults to tracing being off, except in CrdbTestBuild's (i.e. unit
tests), where it's on.

Some tests are changed to explicitly request tracing to be off, so that
they retain the old behavior, as they were relying on it. Among them, a
logic tests needs it, so a new cluster configuration option is added to
logic tests and test clusters.

After this patch, I think we're pretty close to enabling this new
tracing mode in production. Before that though, there's a couple more
performance optimizations to make, and more testing against
span-use-after-Finish.

Release note: None
Tests are using this option when creating spans when they care about
tracing. Now that the Tracer can easily be configured to always create
spans, that's generally a cleaner options for tests then repeating this
option with every span creation.

Release note: None
@andreimatei andreimatei force-pushed the tracing.add-tracing-mode branch from ffd489f to c1c4adf Compare December 9, 2021 16:13
Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @aayushshah15, @abarganier, and @andreimatei)


pkg/sql/logictest/logic.go, line 170 at r25 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

ttracing

Done.

Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @aayushshah15, @abarganier, and @andreimatei)

@craig
Copy link
Contributor

craig bot commented Dec 9, 2021

Build succeeded:

@craig craig bot merged commit d1e0373 into cockroachdb:master Dec 9, 2021
@andreimatei andreimatei deleted the tracing.add-tracing-mode branch January 21, 2022 17:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants