Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-20.2: util/log: conditionally include the tenant+server IDs on every line #58708

Merged

Conversation

knz
Copy link
Contributor

@knz knz commented Jan 11, 2021

Fixes #58705.

For CC security logging we want the ability to route the logging
events from the files where they are written into a centralized
logging collector.

However this routing is done line-by-line. To enable log aggregation
across multiple clusters, or multiple nodes, we need to disambiguate
which log entries come from which cluster and which node.

This patch accommodates this requirement by adding the cluster ID and,
for tenant servers, the tenant and SQL instance ID, on every output
line when the env var COCKROACH_ALWAYS_LOG_SERVER_IDS is set to a
true-ish value.

Note: this feature is unneeded in v21.1 because in that version JSON
logging is available and that already includes the server identity
bits.

Release note: None

@knz knz requested review from tbg and andy-kimball January 11, 2021 16:01
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz
Copy link
Contributor Author

knz commented Jan 11, 2021

NB: The PR is draft still because this needs testing with SQL tenant servers, not just KV pods.

Also hoping to get input from @tbg about where to best position the log.Set calls. Is the change in testserver.go adequate?

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @knz)


pkg/server/testserver.go, line 718 at r1 (raw file):

	}

	// Inform the logging package of the cluster identifiers.

Yeah, this seems fine. There's a chance of a few log lines going out before this line, which I'm sure you're aware of. But the clusterID isn't known at startup, so this is hard to avoid.

@knz knz force-pushed the 20210111-clusterid-in-logs branch from 00aee5f to dfb01b6 Compare January 13, 2021 13:42
@knz knz marked this pull request as ready for review January 13, 2021 13:43
@knz knz requested a review from a team January 13, 2021 13:43
@knz knz requested a review from a team as a code owner January 13, 2021 13:43
@knz knz requested review from adityamaru and removed request for a team January 13, 2021 13:43
@knz knz force-pushed the 20210111-clusterid-in-logs branch 3 times, most recently from 60e315b to 301152a Compare January 13, 2021 15:30
Copy link
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @andy-kimball, @knz, and @tbg)


pkg/server/testserver.go, line 718 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Yeah, this seems fine. There's a chance of a few log lines going out before this line, which I'm sure you're aware of. But the clusterID isn't known at startup, so this is hard to avoid.

I don't think sqlServerOptionalKVArgs is supposed to be available for SQL-only nodes, is it?

// sqlServerOptionalKVArgs are the arguments supplied to newSQLServer which are
// only available if the SQL server runs as part of a KV node.

I think you want to use s.SQLInstanceID() to get it instead.


pkg/util/log/clog.go, line 86 at r2 (raw file):

	// The following identifiers are reported when enabled.
	tenantID      syncutil.AtomicString
	sqlInstanceID int32

I'd beef up the comment to specify that sqlInstanceID must always be accessed atomically.

Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @andy-kimball, and @tbg)


pkg/server/testserver.go, line 718 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

I don't think sqlServerOptionalKVArgs is supposed to be available for SQL-only nodes, is it?

// sqlServerOptionalKVArgs are the arguments supplied to newSQLServer which are
// only available if the SQL server runs as part of a KV node.

I think you want to use s.SQLInstanceID() to get it instead.

Done.


pkg/util/log/clog.go, line 86 at r2 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

I'd beef up the comment to specify that sqlInstanceID must always be accessed atomically.

Done.

@knz knz force-pushed the 20210111-clusterid-in-logs branch 2 times, most recently from d9a1bca to 9837545 Compare January 13, 2021 16:30
For CC security logging we want the ability to route the logging
events from the files where they are written into a centralized
logging collector.

However this routing is done line-by-line. To enable log aggregation
across multiple clusters, or multiple nodes, we need to disambiguate
which log entries come from which cluster and which node.

This patch accommodates this requirement by adding the cluster ID and,
for tenant servers, the tenant and SQL instance ID, on every output
line when the env var `COCKROACH_ALWAYS_LOG_SERVER_IDS` is set to a
true-ish value.

Note: this feature is unneeded in v21.1 because in that version JSON
logging is available and that already includes the server identity
bits.

Release note: None
@knz knz force-pushed the 20210111-clusterid-in-logs branch from 9837545 to ebbc531 Compare January 13, 2021 18:24
@knz
Copy link
Contributor Author

knz commented Jan 13, 2021

@logston I've been baby sitting this PR to finish its CI and it's taking me beyond working hours. I think it will all right in the end but the teamcity agents get restarted (probably because of some transient issues). Feel free to merge this when it's green. I'm signing off for the day.

@logston
Copy link
Contributor

logston commented Jan 13, 2021

Will do. Sleep well @knz!

Copy link
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @andy-kimball, and @tbg)

@andy-kimball andy-kimball merged commit 3e61d3d into cockroachdb:release-20.2 Jan 13, 2021
@andy-kimball
Copy link
Contributor

I just merged it. Thanks @knz for getting it over finish line.

craig bot pushed a commit that referenced this pull request Jan 15, 2021
58275: roachprod: Support AWS GP3 drives. r=petermattis a=miretskiy

Add support for spinning up VMs with GP3 drives.
Make it possible to specify multiple, possibly different EBS
volumes for each aws instance.

Release Notes: None

58862: server,cli: ensure the server identifiers are properly logged in SQL pds r=itsbilal a=knz

This is a forward-port of a change (#58708) we've discovered was necessary in v20.2.

Release note: None

Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants