Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/util/log: don't falsify tenant ID tag in logs if none in ctx #95789

Merged
merged 1 commit into from
Mar 14, 2023

Conversation

abarganier
Copy link
Contributor

Previously, I made the decision to always tag a log entry with a tenant ID, even if no tenant ID was found in the context associated with the log entry. In this case, the system tenant ID was used in the tag, instead of omitting a tenant ID tag altogether.

I received some feedback that this is confusing. For example, imagine testing a feature, expecting log entries to come from a secondary tenant, and the context being used in that feature is not annotated with a tenant ID. With the previous behavior, the log entry would default to being tagged with the system tenant ID instead of having empty tags (or at least, no tenant ID tag). In this scenario, how do I tell the actual state of the log entry? Did the log entry indeed come from a goroutine belonging to the system tenant? Or was the context just missing the tenant ID annotation, but otherwise came from the correct tenant?

This ambiguity is not helpful. By falsifying a tenant ID tag we confuse the log reader about the actual state of the system. Furthermore, our eventual goal should be that almost no context objects in the system exist without a tenant ID (except for perhaps at startup before tenant initialization). Tagging with the system tenant ID in the case of a missing tenant ID annotation in the context makes it difficult to track down offending context objects.

This patch removes this default behavior from the logging package. Now, if no tenant ID is found in the context, we do not tag the entry with a tenant ID. Note however that on the decode side, we will maintain this default tenant ID tagging behavior. If a log entry does not have a tenant ID tag, then we must assume that only the system tenant has privilege to view said log entry, since the owner is ambiguous.

Release note: none

Epic CRDB-14486

@abarganier abarganier requested review from andreimatei and a team January 24, 2023 20:31
@abarganier abarganier requested review from a team as code owners January 24, 2023 20:31
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@abarganier
Copy link
Contributor Author

@knz I was looking into making the changes you recommended in this comment as a part of this PR. Can you elaborate a bit on how you think the conversion should work?

The byte slice we're dealing with when parsing out the tenant ID is a piece of an encoded tags string (e.g. [T123,nsql?,othertag=foobar]).

I am having difficulty finding a good way to convert this 123 directly from []byte to int without an intermediate conversion to string. I found posts like this one that recommend using the ByteOrder package, but I'm running into various issues with this approach. It seems like the package requires the encoded bytes to be the binary representation of a number (understandable), instead of something like the ASCII/utf-8 encoding. I'm struggling to find other options that don't involve an intermediate string conversion. Am I missing something? Perhaps there's a package I'm unaware of.

I want to do my best to optimize the parsing of the tenant ID tags here, but if there's an intermediate string conversion, it seems to defeat the purpose. Let me know your thoughts 🙏

@abarganier abarganier force-pushed the ua-logging-qol branch 2 times, most recently from 2b679cc to bf1a845 Compare January 24, 2023 20:53
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit on how you think the conversion should work?

So I had read some minutes of the go team review meetings a while ago where there was a suggestion to add such a function to strconv, and I had assumed that had happened already. Sadly it has not. It's pretty straightforward though -- we can utilize the fact that we know the number will fit in a uint64 and that there's a non-digit after the last digit to make this trivial:

func parsenum(b []byte) (result uint64, rest) {
   for {
       if b[0] < '0' || b[0] > '9' {
          return result, b
       }
       result  = result * 10 + uint64(b[0]-'0')
       b = b[1:]
  }
}

(if you're not sure if there's a character after the last digit you can change that for to for ; len(b)>0 ; .)

Reviewed 11 of 11 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)


-- commits line 35 at r1:
This is a very good commit message, I found it supremely enlightening. 💯

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant rest []byte in the 2nd result

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @abarganier)


pkg/util/log/format_json.go line 297 at r1 (raw file):

		buf.WriteString(entry.NodeID)
	}
	// We must always tag with the Tenant ID if present.

I'm not sure what to make of this comment. "must always" doesn't seem to go with "if present". This stanza now looks like all the others, which don't have such a comment. Consider removing.

Previously, we made the decision to always tag a log entry with a
tenant ID, even if no tenant ID was found in the context associated
with the log entry. In this case, the system tenant ID was used
in the tag, instead of omitting a tenant ID tag altogether.

I received some feedback that this is confusing. For example,
imagine testing a feature, expecting log entries to come from
a secondary tenant, and the context being used in that feature
is not annotated with a tenant ID. With the previous behavior,
the log entry would default to being tagged with the system tenant
ID instead of having empty tags (or at least, no tenant ID tag).
In this scenario, how do I tell the actual state of the log entry?
Did the log entry indeed come from a goroutine belonging to the
system tenant? Or was the context just missing the tenant ID
annotation, but otherwise came from the correct tenant?

This ambiguity is not helpful. By falsifying a tenant ID tag we
confuse the log reader about the actual state of the system.
Furthermore, our eventual goal should be that almost no context
objects in the system exist without a tenant ID (except for perhaps
at startup before tenant initialization). Tagging with the
system tenant ID in the case of a missing tenant ID annotation
in the context makes it difficult to track down offending context
objects.

This patch removes this default behavior from the logging package.
Now, if no tenant ID is found in the context, we do not tag the
entry with a tenant ID. Note however that on the *decode* side,
we will maintain this default tenant ID tagging behavior. If a log
entry does not have a tenant ID tag, then we must assume that only
the system tenant has privilege to view said log entry, since the
owner is ambiguous.

Release note: none
Copy link
Contributor Author

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay, this one got lost in the fray.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @knz)


pkg/util/log/format_json.go line 297 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I'm not sure what to make of this comment. "must always" doesn't seem to go with "if present". This stanza now looks like all the others, which don't have such a comment. Consider removing.

Yeah, good point. Removed.

@abarganier
Copy link
Contributor Author

TFTR!

bors r=andreimatei

@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

This PR was included in a batch that was canceled, it will be automatically retried

@abarganier
Copy link
Contributor Author

bors retry

@craig craig bot merged commit a36d88b into cockroachdb:master Mar 14, 2023
@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants