3437 improve handling of slow archives #4090
edb4496 to 0987437
Off to a good start! I think it may be useful to step up a level and do the metrics and timings in `GetHistoryArchiveStateWork`. What you could do is still set the number of retries for `GetHistoryArchiveStateWork`. In the `else if (state == State::WORK_FAILURE && archive)` branch, after you see a failure, you can then update the metrics with the time and count. This would also avoid noisy metrics from `GetAndUnzipRemoteFileWork`, as I think we're only interested in `GetHistoryArchiveStateWork` (I'm not sure on this point though and need to double check).
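To make the suggestion above concrete, here is a minimal sketch of recording a failure count and elapsed time only in the failure branch. It is not stellar-core code: `FailureMetrics`, `recordAttempt`, and the simplified `State` enum are hypothetical stand-ins, and the real work classes and metric types will differ.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the work state; not stellar-core's enum.
enum class State
{
    WORK_RUNNING,
    WORK_SUCCESS,
    WORK_FAILURE
};

// Hypothetical metrics holder: a failure counter plus accumulated time
// spent on attempts that ended in failure.
struct FailureMetrics
{
    uint64_t failureCount = 0;
    std::chrono::milliseconds totalFailedTime{0};
};

// Called once per finished attempt. Mirrors the shape of the
// `state == State::WORK_FAILURE && archive` check: only failures against
// a known archive are recorded, so successes add no noise.
inline void
recordAttempt(FailureMetrics& m, State state, bool hasArchive,
              std::chrono::milliseconds elapsed)
{
    if (state == State::WORK_FAILURE && hasArchive)
    {
        ++m.failureCount;
        m.totalFailedTime += elapsed;
    }
}
```

Keeping the update in a single branch means the wrapped download and unzip steps do not need to know about the metrics at all.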
0987437 to 39cc61c
I’m not sure if the semantics of … I think what makes the most sense is to keep the metrics in this PR (which we definitely want), but move the config issues to a follow-up issue where we can discuss a little more what we want. I agree with Marta’s …
39cc61c to bc1c430
Looking pretty close! A few minor nits, and I think another class needs to inherit from `DownloadWorkWithMetrics`.
@@ -0,0 +1,37 @@
// Copyright 2015 Stellar Development Foundation and contributors. Licensed
Nit: Copyright should be 2024
Happy new year 🎉
@@ -0,0 +1,28 @@
// Copyright 2015 Stellar Development Foundation and contributors. Licensed
Nit: update copyright
@@ -4,7 +4,7 @@

#pragma once

#include "historywork/RunCommandWork.h"
This include should not have changed.
@@ -13,7 +14,7 @@ namespace stellar
class HistoryArchive;
class GetRemoteFileWork;

class GetAndUnzipRemoteFileWork : public Work
I think `GetHistoryArchiveStateWork` should also inherit from `DownloadWorkWithMetrics`.
0e39ed1 to 8bd904e
namespace stellar
{

class HistoryArchive;
class GetRemoteFileWork;

-class GetHistoryArchiveStateWork : public Work
+class GetHistoryArchiveStateWork : public DownloadWorkWithMetrics
Sorry, I didn't catch this initially, but it looks like there's an inconsistency here with the `mBytesPerSecond` metric. The issue is that the meter assumes the download finishes whenever the work that inherits from `DownloadWorkWithMetrics` finishes. For `GetHistoryArchiveStateWork`, this is fine because the entire class is effectively a wrapper around `GetRemoteFileWork` and does no additional work. However, `GetAndUnzipRemoteFileWork` first downloads the file, then unzips it, then finishes. This means that even if the speed of the download is identical, files downloaded via `GetAndUnzipRemoteFileWork` will always appear to be slower because the unzip time is counted as well.
I think the easiest way to fix this is probably just to do your metering in `GetRemoteFileWork`. No non-history-related work uses it, so we can probably just do this metering directly in the class. A wrapper class like `DownloadWorkWithMetrics` would be useful if `GetRemoteFileWork` were used outside of history downloads, but I don't see any such use. You can still mark the `bytesPerSecond` value in the `onSuccess` function. Since we never retry the work directly, you can mark the retry/failure meter in `onFailureRaise`.
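A small worked example of the skew described above, as a sketch only: `bytesPerSecond` here is a hypothetical free function, not the metric code in this PR, and the sizes and durations are made-up numbers chosen for easy arithmetic.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: throughput computed from a byte count and the
// elapsed time that the caller chooses to attribute to the download.
inline double
bytesPerSecond(uint64_t bytes, std::chrono::milliseconds elapsed)
{
    if (elapsed.count() <= 0)
    {
        return 0.0; // avoid dividing by zero on degenerate timings
    }
    return static_cast<double>(bytes) * 1000.0 /
           static_cast<double>(elapsed.count());
}

// Same 1,000,000-byte file and same 2 s download in both cases.
// Timing only the download gives the true rate; timing the wrapper that
// also spends 2 s unzipping halves the reported rate.
inline double
downloadOnlyRate()
{
    return bytesPerSecond(1000000, std::chrono::milliseconds(2000));
}

inline double
downloadPlusUnzipRate()
{
    return bytesPerSecond(1000000, std::chrono::milliseconds(2000 + 2000));
}
```

Measuring at the level of the class that only downloads keeps the denominator equal to the transfer time, which is the point of moving the meter.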
Ah right, good catch on the unzip time being included in the denominator. Regarding moving the metrics to `GetRemoteFileWork`, it doesn't make sense to call the metric `history.get.retry` (as it's not exactly a retry when incremented from a failing `GetRemoteFileWork`), perhaps `history.get.failure`?
Good idea!
8bd904e to c81c740
Looks good to me! A nice clean change after all the back and forth.
r+ c81c740
Description
Resolves #3437
Checklist
clang-format v8.0.0 (via `make format` or the Visual Studio extension)