-
Notifications
You must be signed in to change notification settings - Fork 386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 2097830: macOS: certificate is untrusted error #1207
Conversation
@deejross: This pull request references Bugzilla bug 2097830, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla refresh |
@deejross: This pull request references Bugzilla bug 2097830, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla refresh |
@deejross: This pull request references Bugzilla bug 2097830, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@@ -136,6 +137,10 @@ func (o *LoginOptions) getClientConfig() (*restclient.Config, error) { | |||
// try to TCP connect to the server to make sure it's reachable, and discover | |||
// about the need of certificates or insecure TLS | |||
if err := dialToServer(*clientConfig); err != nil { | |||
if strings.Contains(err.Error(), "certificate is not trusted") { | |||
err = x509.CertificateInvalidError{Reason: x509.Expired, Detail: err.Error()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am new to the code so my comment in the PR may be off. The BZ reports "Login is not successful and error message is seen". When you wrap the error with x509.CertificateInvalidError
can I assume the error is handled properly as before? So the BZ issue is really about properly handling "certificate is not trusted" so the oc code can choose a different way of going around the untrusted certificate? What's the connection between forcing x509.Expired
and the actual "certificate is not trusted" error message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating the x509.CertificateInvalidError
causes the switch case
to catch this like any other invalid certificate issue. The idea was to use the best descriptive error
type, and unfortunately, x509.CertificateInvalidError.Error()
method only outputs the Detail
string when a certain Reason
is set. In this case, x509.Expired
was the closest reason I could find.
I haven't found anything in kubectl that handles this yet, but oc and kubectl have very different authentication mechanisms. I agree that reusing x509.CertificateInvalidError
in such a way is not ideal. I was trying to keep the number of changes as small as possible, but I may have to create a custom error for this condition to better handle it and reduce the user confusion of getting both an "expired" and "not trusted" message in the same error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it is hacky, we might move forward with that fix by adding some todo comment with the real bug in upstream. Whenever upstream bug is fixed, we can remove that patch.
Are we sure that fixes the problem exactly?, and once user successfully logs in, there won't be any other cert issues for other commands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this on a Mac, and it does resolve the issue for logins, I'm not 100% sure about other commands though. It should be possible to replicate this fix globally if it is an issue outside of login.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on https://go-review.googlesource.com/c/go/+/418835 the Go release cycle entered the code freeze phase (Nov-Jan). So I don't expect the fix getting reviews anytime soon. Though, it's still worth reaching the upstream Go community to get their attention so they are aware of the code change request. It seems noone is aware of the PR. @deejross worth checking the thread in https://groups.google.com/g/golang-dev/c/K7oGURi0wTM/. Comment from Ian in https://groups.google.com/g/golang-dev/c/K7oGURi0wTM/m/HTyfcmY0BwAJ:
In general it should not be necessary to ask for an additional Googler
review. If your CL shows up on https://go.dev/s/needs-review, then
some Googler will get to it soon.
If your CL does not show up on that list, then most likely it has an
unresolved comment, or the trybots have not been run.
Please do ping if your CL is not on that list and you don't understand
why, or if it has been on the list for several days. But in general
pinging should not be required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it comes to resolving this issue. @deejross if you are confident this change fixes the login issue without any negative side effects, let's proceed. @kasturinarra would you please perform pre-merge testing on all OSes for the oc login
? Just to be sure we are not breaking anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, we can put a TODO explaining why we did this and when we can remove this hacky solution with the links.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ingvagabund thanks, i will ask @zhouying7780 to take a look at it, thanks !!
Found the upstream bug: golang/go#52010 Hopefully this gets fixed and we can close this PR. |
I have submitted a PR upstream to resolve this issue, so hopefully this PR can get closed: golang/go#53986 |
@@ -136,6 +137,10 @@ func (o *LoginOptions) getClientConfig() (*restclient.Config, error) { | |||
// try to TCP connect to the server to make sure it's reachable, and discover | |||
// about the need of certificates or insecure TLS | |||
if err := dialToServer(*clientConfig); err != nil { | |||
if strings.Contains(err.Error(), "certificate is not trusted") { | |||
err = x509.CertificateInvalidError{Reason: x509.Expired, Detail: err.Error()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, we can put a TODO explaining why we did this and when we can remove this hacky solution with the links.
@@ -136,6 +137,10 @@ func (o *LoginOptions) getClientConfig() (*restclient.Config, error) { | |||
// try to TCP connect to the server to make sure it's reachable, and discover | |||
// about the need of certificates or insecure TLS | |||
if err := dialToServer(*clientConfig); err != nil { | |||
if strings.Contains(err.Error(), "certificate is not trusted") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we do this workaround only if the err is not type x509.CertificateInvalidError
;
if strings.Contains(err.Error(), "certificate is not trusted") {
if _, ok := err.(x509.CertificateInvalidError); !ok {
}
}
Otherwise, in the future when problem is fixed. We'll still be wrapping the correct error message to the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more the fix gets executed only for Mac the better. The smaller the error space affected the better.
0e24036
to
0aae5f4
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: deejross The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@deejross: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Actually I think I'm hitting this problem in a different codepath. oc/pkg/helpers/tokencmd/request_token.go Lines 186 to 197 in 142cb44
and transportWithSystemRoots has this code:oc/pkg/helpers/tokencmd/request_token.go Lines 442 to 480 in 142cb44
which returns a certificate is not trusted error which is not currently caught.
|
The upstream changes on Mac that lead to this were quite extensive, so I'm not surprised there would be issues with some use cases. The real question becomes, do we push upstream to resolve them, or do we try and fix oc-specific use cases ourselves? |
This all depends on how fast a fix would land upstream, and be backported to 1.18/1.19 releases. My feeling is that this would take time (say more than a few weeks)? An alternative could be to get the a patch in the fedora/RHEL packages, but that might be a hard sell. In short, I would pursue both. The oc changes would be a short-term fix and it would avoid annoying user-visible failures on macOS. The real fix belongs in go, but it may take some time to be fixed, so this is going to be a longer term fix. Once the issue is fixed in the go compiler(s) we use to build oc releases, the oc changes can be removed. |
In my opinion, we need to fix that problem in oc. Because even we fix it in Go, that will not mean that some versions of oc(4.12) will include fix. |
This is also impacting odo, and users of odo redhat-developer/vscode-openshift-tools#2693 |
I rebuilt odo with the patch, it helps a bit, but is not enough.
|
I did the same change as in this PR in a few more places in https://github.com/cfergeau/oc/commits/macos-cert-not-trusted , see commit cfergeau@d6ee395 This fixes both crc and odo when I patch them to use this branch. |
/hold |
Can you explain what the follow-up actions will be? As it is currently held, what does that mean for a possible solution? |
Newer go versions should no longer have this problem, see golang/go#56891 |
golang/go#57427 mentions 1.19.5 |
1.18.10, 1.19.5 and 1.20 should all have this fixed. |
Fixed in golang 1.19.5 |
@ingvagabund: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@deejross: This pull request references Bugzilla bug 2097830. The bug has been updated to no longer refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
On Mac, attempting to connect to an API server with a SHA1 signature (even if it also has a SHA256 signature) will cause
oc login
to respond with:certificate is not trusted
This error is of type
*errors.errorString
, so there's not a great way to detect it, other than to check the error message string. Worse yet, the error must be created (not wrapped) asx509.CertificateInvalidError
with reason asx509.Expired
in order to include the original error message. While the user will likely not see this error, it may appear in rare conditions.I'm open to suggestions on how to improve this.
/assign @soltysh