-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rpc,*: sanitize the configuration of rpc.Context #107868
Conversation
357ce25
to
8b4747d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r1, 7 of 7 files at r5, 58 of 58 files at r6, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @herkolategan, @knz, @nkodali, @smg260, and @stevendanna)
pkg/base/config.go
line 350 at r6 (raw file):
// It is set after the server instance has been created, when the // network listeners are being set up. AdvertiseAddrH
For my own edification: what does H stand for? "Host"?
pkg/cli/rpc_client.go
line 82 at r4 (raw file):
} func addrWithDefaultHost(addr string) (string, error) {
nit: the removal of addrWithDefaultHost
should be in the first commit, not the second.
pkg/cli/rpc_client.go
line 28 at r5 (raw file):
) // getClientGRPCConn returns a ClientConn, a Clock and a method that blocks
nit: the comment on getClientGRPCConn
needs an update in the first commit.
Update: oh, it's removed in the third commit, up to you.
pkg/cli/democluster/demo_cluster.go
line 1077 at r6 (raw file):
// Find a server to use to send the Decommission request. for _, s := range c.servers { if s.adminClient != nil && s.nodeID != roachpb.NodeID(nodeID) {
It seems like previously we would get an admin client to the first node which might be the same as nodeID
, right? So in a sense we're fixing a bug here - should it be extracted in a separate commit then?
pkg/cli/democluster/demo_cluster.go
line 1183 at r6 (raw file):
conn, err := serv.RPCClientConnE(username.RootUserName()) if err != nil { serv.Stopper().Stop(context.Background())
nit: we do have ctx
in scope.
pkg/rpc/client.go
line 159 at r6 (raw file):
addr, err := addr.AddrWithDefaultLocalhost(cfg.ServerAddr) if err != nil { stopper.Stop(ctx)
nit: perhaps extract Stop
into the unconditional deferred function that Stop
s the stopper in all error cases.
pkg/rpc/context.go
line 319 at r6 (raw file):
// The following fields come from base.Config and only needed by // SecurityContextOptions. They are not really used by the 'rpc' // code. They really belong elsehere - see comments in
nit: s/elsehere/elsewhere/g
.
pkg/server/testserver.go
line 2345 at r6 (raw file):
) *rpc.Context { ctx = logtags.AddTag(ctx, "testclient", nil) ctx = logtags.AddTag(ctx, "user", user)
The original code also added nodeID as the tag - should we add SQL instance ID here?
pkg/server/testserver.go
line 2362 at r6 (raw file):
ccfg := rpc.MakeClientConnConfigFromBaseConfig(*cfg, user, nil, /* tracer */
Why do we not pass the tracer from s
? This probably deserves a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @herkolategan, @nkodali, @smg260, @stevendanna, and @yuzefovich)
pkg/base/config.go
line 350 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
For my own edification: what does H stand for? "Host"?
Handle.
I could also call it "container". Unsure what's best.
pkg/cli/rpc_client.go
line 82 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: the removal of
addrWithDefaultHost
should be in the first commit, not the second.
Done.
pkg/cli/rpc_client.go
line 28 at r5 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: the comment on
getClientGRPCConn
needs an update in the first commit.Update: oh, it's removed in the third commit, up to you.
Done.
pkg/cli/democluster/demo_cluster.go
line 1077 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It seems like previously we would get an admin client to the first node which might be the same as
nodeID
, right? So in a sense we're fixing a bug here - should it be extracted in a separate commit then?
The firstServer is still special-cased elsewhere so i think the bug remains hidden. But point well taken. I've extracted it in a separate commit.
pkg/cli/democluster/demo_cluster.go
line 1183 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: we do have
ctx
in scope.
Fixed.
pkg/rpc/client.go
line 159 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: perhaps extract
Stop
into the unconditional deferred function thatStop
s the stopper in all error cases.
Good idea. Done.
pkg/rpc/context.go
line 319 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit:
s/elsehere/elsewhere/g
.
Fixed.
pkg/server/testserver.go
line 2345 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
The original code also added nodeID as the tag - should we add SQL instance ID here?
Done.
pkg/server/testserver.go
line 2362 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Why do we not pass the tracer from
s
? This probably deserves a comment.
Done.
5d2d5c9
to
f04f92f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I like this cleanup. Don't have much to add over Yahor's existing review.
pkg/base/config.go
Outdated
// It is set after the server instance has been created, when the | ||
// network listeners are being set up. | ||
AdvertiseAddrH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is a non-pointer, I'm assuming from the code that this is the value of the handle that everyone else takes a pointer of. I suppose that makes any function that takes base.Config rather than *base.Config a bit suspicious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a mixed bag really. Most places take the pointer indeed, but the few that don't seem to be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @herkolategan, @knz, @nkodali, @smg260, and @stevendanna)
pkg/base/config.go
line 350 at r6 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
Handle.
I could also call it "container". Unsure what's best.
No opinion from me.
pkg/cli/democluster/demo_cluster.go
line 1028 at r9 (raw file):
} // findOtherServer returns a an admin RPC client to another
nit: "a an admin".
pkg/rpc/client.go
line 163 at r10 (raw file):
// even if the context used to create the client conn is // canceled. stopper.Stop(context.Background())
Switching to context.Background()
seems like a non-trivial change to me (granted, I'm not too familiar with this code). I did some archeology to see where usage of ctx
and the comment about "tying the lifetime" was introduced and landed on these two (1, 2) commits. Can you provide some more context for why we're making ctx -> context.Background()
switch? Perhaps someone from replication team (perhaps @erikgrinaker) could take a quick look at it too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, @herkolategan, @nkodali, @smg260, @stevendanna, and @yuzefovich)
pkg/cli/democluster/demo_cluster.go
line 1028 at r9 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: "a an admin".
Fixed.
pkg/rpc/client.go
line 163 at r10 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Switching to
context.Background()
seems like a non-trivial change to me (granted, I'm not too familiar with this code). I did some archeology to see where usage ofctx
and the comment about "tying the lifetime" was introduced and landed on these two (1, 2) commits. Can you provide some more context for why we're makingctx -> context.Background()
switch? Perhaps someone from replication team (perhaps @erikgrinaker) could take a quick look at it too?
This latter commit da5dca5 from Tobias introduced a bug. Consider:
@@ -939,31 +941,31 @@ func getClientGRPCConn(ctx context.Context) (*grpc.ClientConn, *hlc.Clock, error
stopper.AddCloser(stop.CloserFn(func() {
_ = conn.Close()
}))
// Tie the lifetime of the stopper to that of the context.
- go func() {
- <-ctx.Done()
- // Don't use the `ctx` as it's already cancelled.
- stopper.Stop(context.Background())
- }()
- return conn, clock, nil
+ closer := func() {
+ stopper.Stop(ctx)
+ }
+ return conn, clock, closer, nil
}
Observe: the original code (introduced in 7a3cdaa) was doing two separate things:
- waiting for the context to be cancelled before calling .Stop"
- using a separate context for the call to
.Stop
to ensure the already-cancelled context does not get passed to the closers.
Point (1) was considered to be a problem - instead we want the caller to call finish()
explicitly.
While that is true (and achieved by Tobi's patch), it introduced a new problem: the ctx
is passed directly to .Stop
now. This defeats the protection of point (2).
My PR here is restoring that protection.
f04f92f
to
f083f4f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, @herkolategan, @nkodali, @smg260, and @stevendanna)
pkg/rpc/client.go
line 163 at r10 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
This latter commit da5dca5 from Tobias introduced a bug. Consider:
@@ -939,31 +941,31 @@ func getClientGRPCConn(ctx context.Context) (*grpc.ClientConn, *hlc.Clock, error stopper.AddCloser(stop.CloserFn(func() { _ = conn.Close() })) // Tie the lifetime of the stopper to that of the context. - go func() { - <-ctx.Done() - // Don't use the `ctx` as it's already cancelled. - stopper.Stop(context.Background()) - }() - return conn, clock, nil + closer := func() { + stopper.Stop(ctx) + } + return conn, clock, closer, nil }Observe: the original code (introduced in 7a3cdaa) was doing two separate things:
- waiting for the context to be cancelled before calling .Stop"
- using a separate context for the call to
.Stop
to ensure the already-cancelled context does not get passed to the closers.Point (1) was considered to be a problem - instead we want the caller to call
finish()
explicitly.
While that is true (and achieved by Tobi's patch), it introduced a new problem: thectx
is passed directly to.Stop
now. This defeats the protection of point (2).My PR here is restoring that protection.
Thanks, makes sense, I can sign off on this now. Do you think it's worth extracting this into a separate commit that is also backported? Perhaps we'll fix some extremely rare flakes? Perhaps it's worth extracting it into a separate commit regardless of the backport in order to provide the explanation from your message? Anyway, up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, @herkolategan, @nkodali, @smg260, @stevendanna, and @yuzefovich)
pkg/rpc/client.go
line 163 at r10 (raw file):
it's worth extracting this into a separate commit
ok, sure. Done.
that is also backported?
I'm not so sure about that. Until 23.2 (not released yet) all the contexts used in cli code were not cancellable, so this didn't hit.
f083f4f
to
810bf7a
Compare
810bf7a
to
607c697
Compare
TFYRs! bors r=yuzefovich,stevendanna |
bors r- |
Canceled. |
bors single on |
bors r- |
Canceled. |
This comment was marked as outdated.
This comment was marked as outdated.
1 similar comment
bors r=yuzefovich,stevendanna |
107868: rpc,*: sanitize the configuration of rpc.Context r=yuzefovich,stevendanna a=knz The main goal of this change is to offer a `.RPCClientConn()` method on test servers. To achieve this, it was necessary to lift the code previously in `cli.getClientGRPCConn` into a more reusable version of it, now hosted in `rpc.NewClientContext()`. I also took the opportunity to remove the dependency of `rpc.Context` on `base.Config`, by spelling out precisely which fields are necessary to RPC connections. Numerous tests could be simplified as a result. Needed for #107866. Epic: CRDB-18499 Co-authored-by: Raphael 'kena' Poss <[email protected]>
Build succeeded: |
961ec98
to
607c697
Compare
107866: testserver: remove direct casts to `*server.TestServer` r=yuzefovich,stevendanna a=knz First 3 commits from #107868. In a later stage we would like the result of `StartServer` to return different concrete types depending on parameters. Before we can do that however, we need to remove direct casts of that return value to `*server.TestServer`. Incidentally, this patch improves the APIs as follows: - the health probe for secondary tenant servers now properly includes the draining state of the RPC interface (previously, only the SQL draining state was included). - `ApplicationLayerInterface` now includes `Readiness()` and `DefaultZoneConfig()`. - `StorageLayerInterface` now includes `ScratchRangeWithExpirationLease()`, `GetRangeLease()`, `TsDB()`, `Locality()` and `DefaultSystemZoneConfig()`. Epic: CRDB-18499 Co-authored-by: Raphael 'kena' Poss <[email protected]>
The main goal of this change is to offer a
.RPCClientConn()
methodon test servers.
To achieve this, it was necessary to lift the code previously in
cli.getClientGRPCConn
into a more reusable version of it,now hosted in
rpc.NewClientContext()
.I also took the opportunity to remove the dependency of
rpc.Context
on
base.Config
, by spelling out precisely which fields are necessaryto RPC connections.
Numerous tests could be simplified as a result.
Needed for #107866.
Epic: CRDB-18499