-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved graceful shutdown - Agent #2031
Improved graceful shutdown - Agent #2031
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long-overdue refactoring!
all-in-one/main should be much smaller indeed, and just reuse the components start sequence. It also makes the start method testable.
Status: this is taking quite some time, as it requires moving other things around as well, to avoid cyclic dependencies between packages. I think I'll split this into two tasks: the graceful shutdown for the collector, and the refactoring for the main.go files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
general top-level comment: please think of the ownership of closables
Codecov Report
@@ Coverage Diff @@
## master #2031 +/- ##
==========================================
+ Coverage 96.35% 96.37% +0.01%
==========================================
Files 214 214
Lines 10532 10532
==========================================
+ Hits 10148 10150 +2
+ Misses 326 325 -1
+ Partials 58 57 -1
Continue to review full report at Codecov.
|
@@ -131,6 +137,10 @@ func (h *zipkinSpanHandler) SubmitZipkinBatch(spans []*zipkincore.Span, options | |||
return responses, nil | |||
} | |||
|
|||
func (h *zipkinSpanHandler) Close() error { | |||
return h.modelProcessor.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like we might be closing this twice? It's the ownership issue again - handlers don't create processors, so they should not be closing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right. This wasn't a problem because the processor's #Close()
was checking whether it had been closed before. I kept this safety check, but extracted the building of the span processor into its own function. The collector now builds it explicitly and passes to the handler builder.
While I don't quite like that the collector calls builder.BuildProcessor()
and then builder.BuildHandlers(processor)
, I think it's the best option that makes sense without building another abstraction with its own #Close()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if Collector builds the processor explicitly, shouldn't it be in charge of closing it as well, rather than having the handlers do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...
What if I told you that, after all that I wrote before, I say that I actually forgot to remove it from the handler's #Close()
methods...
Hopefully, it's now fixed in this last commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we split agent and collector changes into different PRs? E.g. git checkout master cmd/collector
to get the agent-only changes. There's too much going on here.
cmd/agent/app/agent.go
Outdated
} | ||
|
||
// NewAgent creates the new Agent. | ||
func NewAgent( | ||
proxy CollectorProxy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need Agent to be responsible for closing CollectorProxy if it didn't create it? There may be other services using that connection beyond the main agent's services.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. It makes the all-in-one a bit more complicated for now, but it is indeed the right thing to do.
The question is whether we'd want the Agent to start the proxy. Right now, it does not, but I believe it's the only consumer of this proxy.
cmd/agent/app/builder.go
Outdated
@@ -69,6 +69,7 @@ var ( | |||
type CollectorProxy interface { | |||
GetReporter() reporter.Reporter | |||
GetManager() configmanager.ClientConfigManager | |||
Close() error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pending decision on whether agent should be in charge of closing the proxy. If not, then this Close() method may be unnecessary since whoever created the proxy would have access to Close()
on the concrete type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the cost of leaving this here? Having it as part of the interface forces new implementations to at least think about the #Close()
operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having Close() in an interface is pretty often an anti-pattern that indicates a problem elsewhere. But in some cases it is appropriate when the object that holds reference to this type is indeed requires an interface, because it either deals with different implementations or a decorator pattern is used that requires the decorator to accept an interface. In this concrete case I don't have a strong opinion, aside from my general preference of not introducing things until they are actually necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example of the "cost" is cmd/agent/app/builder_test.go
which now needs to add no-op Close().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I'll remove it, we can add it back if there's a need in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm adding this back, because both implementations actually create a connection on their own and need to properly close it. The only implementation that does not need a #Close
is the test.
@@ -23,6 +23,7 @@ import ( | |||
grpcManager "github.com/jaegertracing/jaeger/cmd/agent/app/configmanager/grpc" | |||
"github.com/jaegertracing/jaeger/cmd/agent/app/reporter" | |||
aReporter "github.com/jaegertracing/jaeger/cmd/agent/app/reporter" | |||
"github.com/jaegertracing/jaeger/pkg/multierror" | |||
) | |||
|
|||
// ProxyBuilder holds objects communicating with collector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's rename this to CollectorProxy (but ok to do in another PR)
cmd/collector/app/collector.go
Outdated
@@ -169,7 +177,18 @@ func (c *Collector) Close() error { | |||
defer cancel() | |||
} | |||
|
|||
// by now, we shouldn't have any in-flight requests anymore | |||
// by now, we shouldn't have any in-flight requests anymore, close the processor and handlers | |||
for _, closer := range []io.Closer{c.spanProcessor, c.zipkinSpansHandler, c.jaegerBatchesHandler, c.grpcHandler} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all three span handlers the Close is no-op. What's the point of adding it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the case is strong for removing it, as we have three implementations and none needs to close anything, this seems like the type of thing that would require a closer in some implementations.
I can remove it, but the message that it sends is stronger than the fact that it's currently no-op.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message is sends is that these handlers might be stateful in some way and require Close to clean-up, which is kind of the opposite of what they are now: stateless proxies that simply forward to collector (while not managing the life cycle of collector proxy). I am in favor of simplicity, rather than adding code that might possibly be needed in the future.
It also complicates our internal build since now we also have to call close on these handlers, to respect their API.
Could you improve the description by explaining what was wrong with the current state? |
Done! |
TIL! This PR has been updated to include only the agent changes. I also squashed the commits, as the original commit set doesn't make much sense anymore. |
Signed-off-by: Juraci Paixão Kröhling <[email protected]>
Signed-off-by: Juraci Paixão Kröhling <[email protected]>
Signed-off-by: Juraci Paixão Kröhling <[email protected]>
Signed-off-by: Juraci Paixão Kröhling <[email protected]>
Signed-off-by: Juraci Paixão Kröhling <[email protected]>
@yurishkuro is this ready to be merged, once 1.17 is out? |
Which problem is this PR solving?
SIGTERM
:Short description of the changes