Workspace download retries with read timeouts #8974
Comments
I've submitted two fixes for this issue: #9008 and #9015.
I also think so. Would be happy to work on this if we achieve consensus 😊
We've wanted to do this for a while; we just never got around to doing it. Feel free to send patches!
As part of investigating #8974, I found that the recent change 982e0b8 broke retries of `SocketTimeoutException` in `HttpConnector`. The intention of that change was good, but the lack of tests resulted in broken logic. IntelliJ highlights the problem: the branch with `instanceof SocketTimeoutException` would never have been executed:

![Screen Shot 2019-07-29 at 3 26 11 PM](https://user-images.githubusercontent.com/967132/62089179-d7bfa400-b21c-11e9-882a-1c1fe1fcb683.png)

---

This PR adds the missing tests and fixes the logic so that `SocketTimeoutException` is still presented as an `IOException` to upstream consumers while being handled properly internally in `HttpConnector`.

Closes #9008.

PiperOrigin-RevId: 261675244
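To make the dead-branch issue concrete, here is a minimal, hypothetical Java sketch (class and method names are invented; this is not the actual Bazel source) of the pattern the screenshot shows: once a `SocketTimeoutException` has been rewrapped as a plain `IOException`, a later `instanceof` check can never match.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

class DeadBranchSketch {
  void connect() throws SocketTimeoutException {
    // Stand-in for the real connection attempt.
  }

  void download() throws IOException {
    try {
      connect();
    } catch (SocketTimeoutException e) {
      // The rewrap discards the concrete exception type.
      throw new IOException("connection timed out", e);
    }
  }

  void downloadWithRetries() throws IOException {
    try {
      download();
    } catch (IOException e) {
      // Dead branch: download() only ever throws a plain IOException,
      // so this check is always false and the retry never runs.
      if (e instanceof SocketTimeoutException) {
        return; // retry logic would go here
      }
      throw e;
    }
  }
}
```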
Fixes #8974. `HttpDownloader` never retried an `IOException` that occurred during `ByteStreams.copy(payload, out)`: `HttpConnector` would retry connection-phase errors, but not payload-read errors. This changeset adds a fallback to the next URL(s), if present, for both payload read errors and connection errors, and adds the (completely) missing tests for `HttpDownloader`.

Note that this PR technically disables the questionable `HttpConnectorMultiplexer` optimization that attempts to connect to multiple URLs at the same time (by starting threads that race with each other) and picks whichever connected faster. There is a way to keep that optimization while still falling back to the next URLs, but it would require every exception to carry the URL that caused it so that `HttpDownloader` could retry the download on the other URLs. I think making `HttpDownloader` more reliable, and actually using the provided `urls` as fallbacks, is much more important than that optimization.

Closes #9015.

PiperOrigin-RevId: 261702678
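A minimal sketch of the fallback behavior described above, assuming invented names (`UrlFallbackSketch`, `download`) rather than the actual Bazel implementation: each URL is tried in order, a failure while reading the payload counts the same as a failure while connecting, and the download only fails once every URL has been exhausted.

```java
import com.google.common.io.ByteStreams;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.util.List;

class UrlFallbackSketch {
  /** Tries each URL in order and throws only after all of them have failed. */
  static void download(List<URL> urls, OutputStream out) throws IOException {
    IOException lastError = new IOException("no URLs given");
    for (URL url : urls) {
      try (InputStream payload = url.openStream()) {
        // An error thrown here, mid-payload, now also falls through to
        // the next URL, not just a failure while establishing the
        // connection.
        ByteStreams.copy(payload, out);
        return;
      } catch (IOException e) {
        lastError = e;
      }
    }
    throw lastError;
  }
}
```

A real implementation would also need to reset or truncate `out` between attempts, since a failed copy may already have written a partial payload.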
It seems like currently, if Bazel fails to download something from the WORKSPACE that has multiple URLs defined, it doesn't always fall back to the other URL. For example, I hit this error:
When my WORKSPACE contained:
I would have expected this to fall back to the GitHub URL before failing.
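The original WORKSPACE snippet isn't preserved above, but a typical multi-URL entry looks like the following sketch, where the repository name, URLs, and hash are all hypothetical placeholders (only `http_archive` and its `load` path are the real Bazel API). The report's expectation is that when the first URL fails, Bazel tries the next entry in `urls`:

```python
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "some_dep",  # hypothetical repository name
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000",
    urls = [
        # If the mirror times out or drops the connection, Bazel is
        # expected to fall back to the GitHub URL below.
        "https://mirror.example.com/some_dep-1.0.tar.gz",
        "https://github.com/example/some_dep/archive/v1.0.tar.gz",
    ],
)
```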
What's the output of `bazel info release`?

release 0.28.0