WIP: support aws new regions without sdk upgrades #1154

Open
wants to merge 3 commits into base: master

Conversation

flavianmissi
Member

No description provided.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 6, 2024
@openshift-ci openshift-ci bot requested a review from adambkaplan November 6, 2024 14:39
Contributor

openshift-ci bot commented Nov 6, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flavianmissi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 6, 2024
@flavianmissi flavianmissi removed the request for review from adambkaplan November 6, 2024 14:40
@deepsm007
Contributor

infra issue managed-clonerefs: manifest unknown
retest should be fine

@flavianmissi flavianmissi force-pushed the support-aws-new-regions branch from c8c8ed2 to b40eeb3 Compare November 6, 2024 15:49
@wewang58

wewang58 commented Dec 5, 2024

/retest

@wewang58

wewang58 commented Dec 5, 2024

@flavianmissi Is the failure of the e2e-hypershift job related to this PR? If not, I will add the qe-approved label.

@flavianmissi
Member Author

The hypershift failures do seem related to the image registry. Here's what I see in the logs of one of the runs:

2024-12-05T08:28:35.209665387Z time="2024-12-05T08:28:35.209596755Z" level=error msg="s3aws: WebIdentityErr: failed to retrieve credentials\ncaused by: SerializationError: failed to unmarshal error message\n\tstatus code: 405, request id: \ncaused by: UnmarshalError: failed to unmarshal error message\n\t00000000  3c 3f 78 6d 6c 20 76 65  72 73 69 6f 6e 3d 22 31  |<?xml version=\"1|\n00000010  2e 30 22 20 65 6e 63 6f  64 69 6e 67 3d 22 55 54  |.0\" encoding=\"UT|\n00000020  46 2d 38 22 3f 3e 0a 3c  45 72 72 6f 72 3e 3c 43  |F-8\"?>.<Error><C|\n00000030  6f 64 65 3e 4d 65 74 68  6f 64 4e 6f 74 41 6c 6c  |ode>MethodNotAll|\n00000040  6f 77 65 64 3c 2f 43 6f  64 65 3e 3c 4d 65 73 73  |owed</Code><Mess|\n00000050  61 67 65 3e 54 68 65 20  73 70 65 63 69 66 69 65  |age>The specifie|\n00000060  64 20 6d 65 74 68 6f 64  20 69 73 20 6e 6f 74 20  |d method is not |\n00000070  61 6c 6c 6f 77 65 64 20  61 67 61 69 6e 73 74 20  |allowed against |\n00000080  74 68 69 73 20 72 65 73  6f 75 72 63 65 2e 3c 2f  |this resource.</|\n00000090  4d 65 73 73 61 67 65 3e  3c 4d 65 74 68 6f 64 3e  |Message><Method>|\n000000a0  50 4f 53 54 3c 2f 4d 65  74 68 6f 64 3e 3c 52 65  |POST</Method><Re|\n000000b0  73 6f 75 72 63 65 54 79  70 65 3e 53 45 52 56 49  |sourceType>SERVI|\n000000c0  43 45 3c 2f 52 65 73 6f  75 72 63 65 54 79 70 65  |CE</ResourceType|\n000000d0  3e 3c 52 65 71 75 65 73  74 49 64 3e 53 37 44 5a  |><RequestId>S7DZ|\n000000e0  37 4d 41 46 42 51 44 32  33 58 35 4b 3c 2f 52 65  |7MAFBQD23X5K</Re|\n000000f0  71 75 65 73 74 49 64 3e  3c 48 6f 73 74 49 64 3e  |questId><HostId>|\n00000100  6e 6c 4a 2f 73 61 4a 6a  36 5a 59 4f 53 71 59 44  |nlJ/saJj6ZYOSqYD|\n00000110  34 77 78 71 6c 53 78 64  70 4b 65 73 77 6f 59 63  |4wxqlSxdpKeswoYc|\n00000120  38 56 77 65 36 62 72 39  4c 54 51 4f 6e 31 42 30  |8Vwe6br9LTQOn1B0|\n00000130  45 4d 59 6d 4d 75 47 76  45 37 79 33 67 64 74 45  |EMYmMuGvE7y3gdtE|\n00000140  73 4d 54 43 35 71 6c 34  31 46 63 3d 3c 2f 48 6f  |sMTC5ql41Fc=</Ho|\n00000150  
73 74 49 64 3e 3c 2f 45  72 72 6f 72 3e           |stId></Error>|\n\ncaused by: unknown error response tag, {{ Error} []}" go.version="go1.22.7 (Red Hat 1.22.7-1.el9_5) X:strictfipsruntime"

So it seems that setting REGISTRY_STORAGE_S3_REGIONENDPOINT to https://s3.amazonaws.com in hypershift is causing the 405. I must find the time to look into this.
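For reference, the payload in that hex dump is an S3-style XML error document whose root element is `<Error>`. As far as I can tell, the SDK's credential (STS) error parser expects a different root element, which would explain the trailing "unknown error response tag" message; treat that reading as an assumption rather than a confirmed diagnosis. Below is a minimal Go sketch that decodes such a body using only the standard library. The `s3Error` type and `parseS3Error` helper are hypothetical names for illustration and are not part of the registry or the AWS SDK; the field list is taken from the elements visible in the log.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// s3Error mirrors the elements visible in the logged response body.
// Hypothetical type for illustration; not an SDK or registry type.
type s3Error struct {
	XMLName      xml.Name `xml:"Error"`
	Code         string   `xml:"Code"`
	Message      string   `xml:"Message"`
	Method       string   `xml:"Method"`
	ResourceType string   `xml:"ResourceType"`
	RequestID    string   `xml:"RequestId"`
	HostID       string   `xml:"HostId"`
}

// parseS3Error decodes an S3-style XML error body. It returns an error
// when the root element is not <Error>.
func parseS3Error(body []byte) (s3Error, error) {
	var e s3Error
	err := xml.Unmarshal(body, &e)
	return e, err
}

func main() {
	// Body copied from the log above (hex dump decoded).
	body := []byte(`<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>MethodNotAllowed</Code><Message>The specified method is not allowed against this resource.</Message><Method>POST</Method><ResourceType>SERVICE</ResourceType><RequestId>S7DZ7MAFBQD23X5K</RequestId><HostId>nlJ/saJj6ZYOSqYD4wxqlSxdpKeswoYc8Vwe6br9LTQOn1B0EMYmMuGvE7y3gdtEsMTC5ql41Fc=</HostId></Error>`)

	e, err := parseS3Error(body)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("%s: %s against %s\n", e.Code, e.Method, e.ResourceType)
}
```

Decoded this way, the response says a POST was rejected at the service level, which fits the 405 status in the log.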

@wewang58

wewang58 commented Dec 9, 2024

@flavianmissi Have you found any clues about the issue? Or is there anything I can do to help?

@flavianmissi
Member Author

I haven't made progress here yet @wewang58. I'm planning to dig a little more today, thanks for the reminder!
I'll post an update here later today.

@flavianmissi
Member Author

Here's the cleaned-up output of the operator logs from my previous comment:

s3aws: WebIdentityErr: failed to retrieve credentials
caused by: SerializationError: failed to unmarshal error message
	status code: 405, request id:
caused by: UnmarshalError: failed to unmarshal error message
	<?xml version="1.0" encoding="UTF-8"?>
	<Error><Code>MethodNotAllowed</Code><Message>The specified method is not allowed against this resource.</Message><Method>POST</Method><ResourceType>SERVICE</ResourceType><RequestId>S7DZ7MAFBQD23X5K</RequestId><HostId>nlJ/saJj6ZYOSqYD4wxqlSxdpKeswoYc8Vwe6br9LTQOn1B0EMYmMuGvE7y3gdtEsMTC5ql41Fc=</HostId></Error>
caused by: unknown error response tag, {{ Error} []}" go.version="go1.22.7 (Red Hat 1.22.7-1.el9_5) X:strictfipsruntime"

This didn't help much. I added a commit to this PR enabling debug logs in the s3 sdk in the hopes that we can get more details about the failure on the next hypershift job run. 🤞🏼
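In case it's useful for the next run, here is a rough sketch of how the hex dump in these log lines can be turned back into the raw response body instead of cleaning it by hand. It assumes the dump uses the layout seen above (an offset column, hex bytes, then an ASCII gutter between `|` characters) and that the escaped `\n` sequences in the log line have already been replaced with real newlines. `decodeHexDump` is a hypothetical helper name, not something from the registry or the SDK.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// decodeHexDump rebuilds the original bytes from hexdump-style text.
// Each line is expected to look like:
//   00000000  3c 3f 78 ...  |<?x...|
func decodeHexDump(dump string) ([]byte, error) {
	var out []byte
	for _, line := range strings.Split(dump, "\n") {
		// Drop the ASCII gutter; the hex columns never contain '|'.
		if i := strings.Index(line, "|"); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // blank line or offset with no data
		}
		// fields[0] is the offset; the rest are single hex bytes.
		for _, f := range fields[1:] {
			b, err := hex.DecodeString(f)
			if err != nil {
				return nil, fmt.Errorf("bad hex byte %q: %w", f, err)
			}
			out = append(out, b...)
		}
	}
	return out, nil
}

func main() {
	// First two dump lines from the log above.
	dump := "00000000  3c 3f 78 6d 6c 20 76 65  72 73 69 6f 6e 3d 22 31  |<?xml version=\"1|\n" +
		"00000010  2e 30 22 20 65 6e 63 6f  64 69 6e 67 3d 22 55 54  |.0\" encoding=\"UT|\n"

	body, err := decodeHexDump(dump)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s\n", body) // prints: <?xml version="1.0" encoding="UT
}
```

Decoding from the hex columns rather than the ASCII gutter keeps bytes such as newlines intact, which the gutter renders as `.`.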

@flavianmissi flavianmissi force-pushed the support-aws-new-regions branch from 2b23b87 to 65c0fab Compare December 9, 2024 13:56
Contributor

openshift-ci bot commented Dec 9, 2024

@flavianmissi: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-hypershift-conformance | 65c0fab | link | true | /test e2e-hypershift-conformance |
| ci/prow/e2e-aws-ovn-upgrade | 65c0fab | link | true | /test e2e-aws-ovn-upgrade |
| ci/prow/e2e-hypershift | 65c0fab | link | true | /test e2e-hypershift |
| ci/prow/hypershift-e2e-aks | 65c0fab | link | false | /test hypershift-e2e-aks |
| ci/prow/okd-scos-e2e-aws-ovn | 65c0fab | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/unit | 65c0fab | link | true | /test unit |
| ci/prow/e2e-aws-operator | 65c0fab | link | true | /test e2e-aws-operator |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
