-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring HTTP downloader progress reporter to accept multiple observers #3542
Refactoring HTTP downloader progress reporter to accept multiple observers #3542
Conversation
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
🌐 Coverage report
|
While working on this PR, I found a bug with the progress reporter. It was reporting progress just once (after a delay) instead of reporting it periodically. As such, this PR here is now blocked on the bugfix PR: #3548. |
f20ba32
to
3335534
Compare
This pull request is now in conflicts. Could you fix it? 🙏
|
3335534
to
c7618c9
Compare
22ddf67
to
9c04408
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a very tiny comments, it looks good otherwise
@@ -208,8 +206,9 @@ func (e *Downloader) downloadFile(ctx context.Context, artifactName, filename, f | |||
} | |||
} | |||
|
|||
lpObs := newLoggingProgressObserver(e.log, e.config.HTTPTransportSettings.Timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wild name, i had to go up here to figure it what it means
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in e9fc097.
length := dp.length | ||
interval := dp.interval | ||
|
||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very unnecessary optimization from my side: no need to spin up goroutine or initiate tickers if progressObservers is []. as we're having at least log now, i consider this just a small optimization from component side of view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, will add, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer t.Stop() | ||
for { | ||
select { | ||
case <-ctx.Done(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're counting on external elements to cancel this. we could introduce Done chan internally which would be closed on ReportComplete or ReportFailed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the case where, for some reason, the consumer wants to cancel while the download is in progress (so before it's completed or failed)? I guess we could make it required as part of the interface's contract that either ReportComplete
or ReportFailed
MUST be called? Then we could safely remove the ctx
from Report
and handle cancellation as you suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented your suggestion in 67f80f6. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about creating a subcontext that can be closed when the request is complete or cancelled (ofc we need to hold a ref to its cancellation function) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about creating a subcontext that can be closed when the request is complete or cancelled (ofc we need to hold a ref to its cancellation function) ?
I like this pattern better as it doesn't require the consumer to remember to necessarily call either ReportComplete
or ReportFailed
. Basically, it gives the consumer two options on how to cancel: either cancel the ctx
passed to Report
or call ReportComplete
/ReportFailed
.
As far as implementation goes, we could do a subcontext or we use an internal done
channel as implemented in 67f80f6. Is one implementation necessarily better than the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context cancellation is more "abandon processing" and the done channel is the simple solution to unblock a go routine.
The context has the advantage of propagation through the subcontexts if the parent context is cancelled/closed/expires, the done channel has to be managed/closed explicitly.
Since you have already implemented the done
channel you can keep that, I am just quite surprised to see that we only send an empty struct instead of closing the channel and removing the reference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, except I'm not seeing the benefit of using a subcontext over a done
channel in this case, because (if I understood your suggestion correctly) we would do something like this with a subcontext:
- create the subcontext with cancellation in the
Report
method , - add a case for it to be done in the
select
in theReport
method, replacing the current case for thectx
(parent context), - hold a reference to the cancellation function on the
progressReporter
struct, - and call this cancellation function from the
ReportComplete
andReportFailed
methods.
Whereas, using a done
channel pattern, we:
- create the
done
channel in the constructor - add a case for it to be done in the
select
in theReport
method, - hold a reference to it on the
progressReporter
struct, - send an empty struct to the
done
channel orclose
it from theReportComplete
andReportFailed
methods.
So I'm not seeing much of a benefit in this case of using a subcontext instead of a done
channel pattern. In particular, I don't see us taking advantage of the context propagation. Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pchila @michalpristas I updated the implementation such that there are now two ways to return from the goroutine inside Report
:
Report
is passed a context. When the context is done, the goroutine will return.- When the consumer calls
ReportComplete
orReportFailed
, we close an internaldone
channel, which will also cause the goroutine inReport
to return.
return n, nil | ||
} | ||
|
||
func (dp *downloadProgressReporter) Report(ctx context.Context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make a comment that caller is responsible for cancelling context, otherwise we're leaking goroutine and ticker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 89ca86b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few nits, nothing major.
Left a couple of questions...
@@ -46,13 +44,13 @@ const ( | |||
|
|||
// Downloader is a downloader able to fetch artifacts from elastic.co web page. | |||
type Downloader struct { | |||
log progressLogger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I always prefer the logger as interfaces for mocking and loose coupling, instead of relying on a more concrete type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I agree but I'm also a fan of YAGNI. We can easily change this back to an interface if/when we need it, i.e. have more than one implementation.
@@ -253,21 +237,21 @@ func assertLogs(t *testing.T, logs []logMessage, minExpectedProgressLogs int, ex | |||
// Verify that the first minExpectedProgressLogs messages are about the download progress (for the first file). | |||
i := 0 | |||
for ; i < minExpectedProgressLogs; i++ { | |||
assert.Equal(t, logs[i].record, expectedProgressMsg) | |||
assert.Regexp(t, expectedProgressRegexp, logs[i].Message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are other logs mixed in with the progress logs, this assert will fail the test. Is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. As things stand, the only other log that could get mixed up is this one:
elastic-agent/internal/pkg/agent/application/upgrade/artifact/download/http/downloader.go
Lines 102 to 104 in 359be4f
if err := os.Remove(path); err != nil { | |
e.log.Warnf("failed to cleanup %s: %v", path, err) | |
} |
This log should never get triggered but I suppose in some strange circumstances the os.Remove
could fail.
Any suggestions on how to make this more robust to only consider progress logs? Maybe the logs could first be filtered to only keep ones that match expectedProgressRegexp
or expectedCompletedRegexp
before running any assertions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filtering and then asserting length is surely one option. The exact semantics depend on what we are trying to assert here: from the code you are asserting that at least the first minExpectedProgressLogs
log entries must be the ones you expect: this will break as soon as an extra log entry is in the mix (it may even be added in the future).
Did you mean to assert that among the collected log entries you have at least minExpectedProgressLogs
expected logs ? In this case filter/count and then assert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me put up a commit showing what I meant by filtering and see what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal/pkg/agent/application/upgrade/artifact/download/http/progress_reporter.go
Show resolved
Hide resolved
defer t.Stop() | ||
for { | ||
select { | ||
case <-ctx.Done(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about creating a subcontext that can be closed when the request is complete or cancelled (ofc we need to hold a ref to its cancellation function) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I tried to find something to complain about, I was unsuccessful. ;-)
a464077
to
ecc4201
Compare
SonarQube Quality Gate |
What does this PR do?
Prior to this PR, the HTTP downloader's
progressReporter
would report download progress in Agent logs. This functionality was implemented entirely within theprogressReporter
code.This PR refactors the
progressReporter
code to accept multiple observers and introduces a new observer,loggingProgressObserver
, that logs the observed progress in Agent logs, preserving the behavior prior to this PR.Why is it important?
This PR is purely a refactoring PR; it does not change any functionality. Functionally, the HTTP downloader's progress will continue to be reported in Agent logs.
However, in #3527, we will need the
progressReporter
to also report progress to another observer. This PR makes it easy to accomplish that in a clean manner. More context here: #3527 (comment)Checklist
I have made corresponding changes to the documentationI have made corresponding change to the default configuration filesI have added an entry in./changelog/fragments
using the changelog toolI have added an integration test or an E2E testRelated issues