
download_sysext: HTTP client timeout and retry #42

Merged
2 commits merged into trunk from dongsu/client-timeout-retry on Dec 21, 2023

Conversation

@dongsupark (Member) commented Dec 14, 2023

Set the default timeout to 20 seconds so the HTTP client can abort after that timeout.

If the HTTP client fails to download, retry the download every 1000 msec, up to 20 times.

Fixes #15
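
For illustration, a minimal sketch of the described behavior, assuming a blocking reqwest client; the helper name, constants, and structure are illustrative and not the PR's actual code:

use std::{thread, time::Duration};

use anyhow::{Context, Result};

// Illustrative constants matching the description above.
const HTTP_TIMEOUT: Duration = Duration::from_secs(20);
const RETRY_DELAY: Duration = Duration::from_millis(1000);
const MAX_TRIES: u32 = 20;

// Hypothetical helper: build a client with a 20-second timeout and retry a
// failed download every 1000 msec, up to 20 times.
fn fetch_with_retry(url: &str) -> Result<Vec<u8>> {
    let client = reqwest::blocking::Client::builder()
        .timeout(HTTP_TIMEOUT)
        .build()
        .context("failed to build HTTP client")?;

    let mut last_err = None;
    for _ in 0..MAX_TRIES {
        let result = client
            .get(url)
            .send()
            .and_then(|resp| resp.error_for_status())
            .and_then(|resp| resp.bytes());
        match result {
            Ok(body) => return Ok(body.to_vec()),
            Err(err) => {
                last_err = Some(err);
                thread::sleep(RETRY_DELAY);
            }
        }
    }
    Err(last_err.unwrap()).context("download failed after all retries")
}

The PR itself wires the equivalent behavior into the existing download code in src/download.rs rather than into a standalone helper like this one; the shape is the same, though: a client-level timeout plus a bounded retry loop with a fixed delay.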

@pothos (Member) left a comment

It doesn't retry in this case: during the download I deactivated my wifi and the connection got stuck because we don't have a read timeout (see the issue).
When I activated my wifi again, the download seemed to continue but then failed immediately without a retry:

read 13926158/28371306 ( 49%)Error: unable to download data(url Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("update.release.flatcar-linux.net")), port: None, path: "/amd64-usr/3815.0.0/oem-azure.gz", query: None, fragment: None })

Caused by:
    0: failed to get response chunk
    1: request or response body error: error reading a body from connection: end of file before message length reached
    2: error reading a body from connection: end of file before message length reached
    3: end of file before message length reached

@pothos (Member) commented Dec 20, 2023

I think I see the problem: We only retry getting a response (the header) but we don't retry the full download. We need to wrap the whole function and can't split it into two parts.

@dongsupark force-pushed the dongsu/client-timeout-retry branch 2 times, most recently from e50ddb8 to 3a9de50 on December 21, 2023 at 14:55
@dongsupark (Member, Author) commented

As discussed, this PR has been re-implemented without the tryhard crate.

src/download.rs (outdated snippet):
bytes_read += chunk.len();

hasher.update(&chunk);
data.write_all(&chunk).context("failed to write_all chunk")?;
A reviewer (Member) commented:

This continuously appends, we need to truncate the file first to a length of 0 bytes before writing again.
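
For context, a sketch of the kind of reset being asked for here, assuming data is a std::fs::File opened for writing; the helper name is made up for illustration:

use std::fs::File;
use std::io::{Seek, SeekFrom};

use anyhow::{Context, Result};

// Hypothetical helper: before retrying, drop the partially written bytes so
// the next attempt does not append to them.
fn reset_output(data: &mut File) -> Result<()> {
    data.set_len(0).context("failed to truncate output file")?;
    data.seek(SeekFrom::Start(0))
        .context("failed to rewind output file")?;
    Ok(())
}

Presumably the hasher would also need to start from a clean state on each attempt, e.g. by constructing a fresh one per retry.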

@pothos (Member) left a comment

This does not retry on "Connection refused".

@pothos (Member) commented Dec 21, 2023

If we want to keep ? for error handling, we should retry the whole function, e.g. with a helper function download_and_hash_retry that loops around it.
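
A sketch of one way such a wrapper could look, written here as a generic closure-based loop; the real download_and_hash signature and the wrapper's arguments are assumptions, not the code that was merged:

use std::{thread, time::Duration};

use anyhow::Result;

const RETRY_DELAY: Duration = Duration::from_millis(1000);
const MAX_TRIES: u32 = 20;

// The closure stands in for a single call to download_and_hash, which can
// keep using `?` internally; the wrapper retries the whole attempt, not just
// fetching the response headers.
fn download_and_hash_retry<T>(mut attempt: impl FnMut() -> Result<T>) -> Result<T> {
    let mut last_err = None;
    for _ in 0..MAX_TRIES {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(err) => {
                last_err = Some(err);
                thread::sleep(RETRY_DELAY);
            }
        }
    }
    Err(last_err.expect("MAX_TRIES is non-zero"))
}

The point is that everything that can fail mid-download, including reading the response body chunks, happens inside attempt(), so a connection that dies halfway through gets a full fresh retry.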

@pothos (Member) commented Dec 21, 2023

With #44 this will be easier

@dongsupark (Member, Author) commented

Brought this back to a simple local wrapper version without tryhard, without passing File around, and without tokio, based on #44.

@pothos (Member) left a comment

Thanks, works well now

@dongsupark merged commit 79c38ef into trunk on Dec 21, 2023
1 check passed
@dongsupark deleted the dongsu/client-timeout-retry branch on December 21, 2023 at 17:09