-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CredentialHelper change broke unix domain socket support #16171
Comments
@bazel-io flag |
Sorry about that. This corner of Bazel is poorly tested; I guess something like this was bound to happen.
Could you please share the full stack trace and the exact values of |
Here's the command line and stack trace. The line numbers might be off a bit -- I recompiled bazel from the head and added a couple printf statements.
|
Thanks, this is helpful. It appears that That being said, I totally agree that we shouldn't crash here! I think the right fix is to replace the precondition check in |
Regardless of the number of slashes, it used to work with versions of bazel older than 5.3. I've used it on 5.2.0, 5.1.0, 5.0.0, and 4.2.2 to name a few. But I'll look into seeing how it behaves when we have only 1 slash and let you know what I find. |
Yep, regardless of the number of slashes, it fails at the same precondition check. (I tried 1, 2, and 3.) Also, for unix domain sockets, there is no host -- the entire remainder of the string is the pathname of the socket file, so getHost() probably should return null when the scheme is "unix:". |
@tjgq do you think this warrants a patch release? |
URIs such as `unix:///...` (Unix domain socket) may legitimately be missing a host component. We should gracefully handle them when looking for a credential helper (no such helper can exist for them since the lookup is host-based, but this could change in the future). Fixes bazelbuild#16171 which is a regression in 5.3. We had some tests for this in remote_execution_test.sh, but they were disabled due to flakiness in 8e3c0df.
URIs such as `unix:///...` (Unix domain socket) may legitimately be missing a host component. We should gracefully handle them when looking for a credential helper (no such helper can exist for them since the lookup is host-based, but this could change in the future). Fixes bazelbuild#16171 which is a regression in 5.3. We had some tests for this in remote_execution_test.sh, but they were disabled due to flakiness in 8e3c0df.
URIs such as `unix://...` (Unix domain socket) may legitimately be missing a host component. We should gracefully handle them when looking for a credential helper (no such helper can exist for them since the lookup is host-based, but this could change in the future). Fixes bazelbuild#16171 which is a regression in 5.3. We had some tests for this in remote_execution_test.sh, but they were disabled due to flakiness in 8e3c0df. I've filed bazelbuild#16185 to reenable them.
@Wyverald It's a regression that affects existing users and there doesn't seem to be a workaround for it, so probably yes? (I don't know what's our patch release policy.) @srago I'd like to go with #16184 as the fix. Could you please confirm that it works for you? cc @Yannic - until #16185 is fixed, we should manually verify that a UDS proxy works when making changes to credential helpers. |
@bazel-io fork 5.3.1 |
Confirmed #16184 fixes the problem with Unix domain sockets. Thanks for the quick response time! |
URIs such as `unix://...` (Unix domain socket) may legitimately be missing a host component. We should gracefully handle them when looking for a credential helper (no such helper can exist for them since the lookup is host-based, but this could change in the future). Fixes bazelbuild#16171 which is a regression in 5.3. We had some tests for this in remote_execution_test.sh, but they were disabled due to flakiness in 8e3c0df. I've filed bazelbuild#16185 to reenable them.
These tests were disabled due to flakiness in 8e3c0df. As a consequence, we introduced an undetected regression in 5.3 (bazelbuild#16171). I've rerun the tests multiple times, both locally and remotely, and I've seen no evidence of flakiness. Therefore, I'm declaring that this issue has mysteriously fixed itself. Notes: * Use `python3` instead of `python` to run the UDS proxy, as the latter is no longer available in some of the environments where the tests run. * The correct URI syntax for a Unix domain socket URI is unix:///path/to/socket rather than unix:/path/to/socket, as the URI technically has no host component. The latter form still works due to lenient parsing in the gRPC gRPC library, but this may change in the future, so the tests should exercise the correct form. Fixes bazelbuild#16185.
URIs such as `unix://...` (Unix domain socket) may legitimately be missing a host component. We should gracefully handle them when looking for a credential helper (no such helper can exist for them since the lookup is host-based, but this could change in the future). Fixes #16171 which is a regression in 5.3. We had some tests for this in remote_execution_test.sh, but they were disabled due to flakiness in 8e3c0df. I've filed #16185 to reenable them. Closes #16184. PiperOrigin-RevId: 470724658 Change-Id: Ibd7203546f00fe2335c14e7a5fa6d5bf64868bac Co-authored-by: Tiago Quelhas <[email protected]>
URIs such as `unix://...` (Unix domain socket) may legitimately be missing a host component. We should gracefully handle them when looking for a credential helper (no such helper can exist for them since the lookup is host-based, but this could change in the future). Fixes bazelbuild#16171 which is a regression in 5.3. We had some tests for this in remote_execution_test.sh, but they were disabled due to flakiness in 8e3c0df. I've filed bazelbuild#16185 to reenable them. Closes bazelbuild#16184. PiperOrigin-RevId: 470724658 Change-Id: Ibd7203546f00fe2335c14e7a5fa6d5bf64868bac
We use a unix domain socket for our remote_executor and remote_cache endpoints, because we use a side-car proxy to handle SPIFFE mTLS negotiation. This stopped working with the release of bazel 5.3.0. With 5.3.0, bazel fails before issuing its first GetCapabilitiesRequest. The error message was "ERROR: Failed to query remote execution capabilities: UNAUTHENTICATED: Failed computing credential metadata." When enabling verbose failures, the stack trace pointed to a failed precondition because the host portion of the URI was null. I played around with the code and got it working again, but the fix isn't production-quality -- it doesn't handle the case where the remote cache endpoint differs from the remote executor endpoint, and might need different credentials. Because we're using unix domain sockets, this shouldn't affect us at all. Anyway, consider the following patch a place to start the conversation about what the correct fix might look like.
The text was updated successfully, but these errors were encountered: