Fix broken integration test TestRegularWorkspaceTasks #9575
In the current code I encounter the certificate issue; waiting for PR #9553 to be fixed. Related Slack Thread.
The supervisor opens port 22999 on the IPv6 localhost address:
$ netstat -ntlp | grep 22999
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::22999 :::* LISTEN -
Port-forward to address 0.0.0.0 with local port 32999 to remote port 22999:
$ kubectl port-forward --address=0.0.0.0 pod/ws-87285dbc-7400-4afb-85e8-a6de243cfe5c 32999:22999
Forwarding from 127.0.0.1:32999 -> 22999
Access the supervisor endpoint:
$ curl -XGET 'http://127.0.0.1:32999/_supervisor/v1/status/tasks'
curl: (52) Empty reply from server
$ k port-forward --address=0.0.0.0 pod/ws-87285dbc-7400-4afb-85e8-a6de243cfe5c 32999:22999
Forwarding from 0.0.0.0:32999 -> 22999
Handling connection for 32999
E0503 09:36:56.344879 45463 portforward.go:406] an error occurred forwarding 32999 -> 22999: error forwarding port 22999 to pod 88503d150548aaf02c1754d273767d2d9c0ffd1ad661a8819218ac91ad236753, uid : failed to execute portforward in network namespace "/var/run/netns/cni-1d91c723-efc9-a8d1-ea28-513326b839ed": failed to connect to localhost:22999 inside namespace "88503d150548aaf02c1754d273767d2d9c0ffd1ad661a8819218ac91ad236753", IPv4: dial tcp4 127.0.0.1:22999: connect: connection refused IPv6 dial tcp6: address localhost: no suitable address found
E0503 09:36:56.345241 45463 portforward.go:234] lost connection to pod
I tried changing the port-forward with:
$ kubectl port-forward --address=localhost pod/ws-87285dbc-7400-4afb-85e8-a6de243cfe5c 32999:22999
Forwarding from 127.0.0.1:32999 -> 22999
Forwarding from [::1]:32999 -> 22999
The supervisor opens port 22999 on the IPv6 localhost address, and port 32999 is now open on both IPv4 and IPv6.
$ netstat -tnlp | grep 32999
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 10.0.5.2:32999 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:32999 0.0.0.0:* LISTEN 45942/kubectl
tcp6 0 0 ::1:32999 :::* LISTEN 45942/kubectl
Access the supervisor endpoint:
$ curl -XGET 'http://127.0.0.1:32999/_supervisor/v1/status/tasks'
curl: (52) Empty reply from server
$ curl -XGET 'http://[::1]:32999/_supervisor/v1/status/tasks'
curl: (52) Empty reply from server
$ kubectl port-forward --address=localhost pod/ws-87285dbc-7400-4afb-85e8-a6de243cfe5c 32999:22999
Forwarding from 127.0.0.1:32999 -> 22999
Forwarding from [::1]:32999 -> 22999
Handling connection for 32999
E0503 09:42:26.602006 46159 portforward.go:406] an error occurred forwarding 32999 -> 22999: error forwarding port 22999 to pod 88503d150548aaf02c1754d273767d2d9c0ffd1ad661a8819218ac91ad236753, uid : failed to execute portforward in network namespace "/var/run/netns/cni-1d91c723-efc9-a8d1-ea28-513326b839ed": failed to connect to localhost:22999 inside namespace "88503d150548aaf02c1754d273767d2d9c0ffd1ad661a8819218ac91ad236753", IPv4: dial tcp4 127.0.0.1:22999: connect: connection refused IPv6 dial tcp6: address localhost: no suitable address found
E0503 09:42:26.602432 46159 portforward.go:234] lost connection to pod
@gitpod-io/engineering-workspace Any thoughts or comments?
Hi, I am looking into this error. Will update this issue once I find something.
Workspace logs: the workspace created through this test has the following logs. Notice the last line:
...
...
Web UI available at http://localhost:23000/
[15:06:26] Extension host agent started.
{"ide":"IDE","level":"info","message":"IDE readiness took 5.307 seconds","serviceContext":{"service":"supervisor","version":"commit-56fe9e79016d84f0e5acff880b6b1fc18e3e9707"},"severity":"INFO","time":"2022-05-04T15:06:26Z"}
{"ide":"IDE","level":"info","message":"IDE is ready","serviceContext":{"service":"supervisor","version":"commit-56fe9e79016d84f0e5acff880b6b1fc18e3e9707"},"severity":"INFO","time":"2022-05-04T15:06:26Z"}
{"@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent","error":"not connected to Gitpod server","level":"error","message":"error tracking supervisor_readiness","serviceContext":{"service":"supervisor","version":"commit-56fe9e79016d84f0e5acff880b6b1fc18e3e9707"},"severity":"ERROR","time":"2022-05-04T15:06:26Z"} |
The port-forward will always fail for the supervisor port. When kubectl tries to port-forward, it does so in the context of a network namespace, namely the one that was originally created for the pod. That is not the namespace in which supervisor ends up listening. Therefore, kubectl does not have access to the namespace where supervisor is running, hence the error. Proposed solution:
Correct me if I am wrong @gitpod-io/engineering-workspace
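If it helps to verify the namespace mismatch, here is a minimal sketch (run on the node, with placeholder PIDs that would have to be looked up first) comparing the /proc/<pid>/ns/net links of the pod's original process and of supervisor; kubectl's port-forward only ever dials inside the former:

```go
package main

import (
	"fmt"
	"os"
)

// netnsOf returns the network-namespace link of a process, e.g. "net:[4026532823]".
// Two processes share a network namespace iff the link targets are identical.
func netnsOf(pid int) (string, error) {
	return os.Readlink(fmt.Sprintf("/proc/%d/ns/net", pid))
}

func main() {
	podInitPID := 12345    // placeholder: the pod's original process
	supervisorPID := 23456 // placeholder: the supervisor process

	a, err := netnsOf(podInitPID)
	if err != nil {
		panic(err)
	}
	b, err := netnsOf(supervisorPID)
	if err != nil {
		panic(err)
	}
	fmt.Printf("pod init netns:   %s\nsupervisor netns: %s\nsame namespace:   %v\n", a, b, a == b)
}
```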
Thanks for Prince's analysis.
One way to solve this could be to run an exec command through kubectl in order to expose the port. I still need to figure out how to do this and whether it is feasible. The other way is to make use of the integration test agent to enter the relevant namespace and then run a curl against the supervisor endpoint (see the sketch below).
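A rough sketch of the second idea, assuming a privileged process on the node (or an injected agent) that can use nsenter, and a placeholder supervisor PID:

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	supervisorPID := 23456 // placeholder: would need to be discovered first

	out, err := exec.Command(
		"nsenter",
		"-t", fmt.Sprint(supervisorPID), // target process whose namespaces we join
		"-n",                            // join only its network namespace
		"curl", "-s", "http://localhost:22999/_supervisor/v1/status/tasks",
	).CombinedOutput()
	if err != nil {
		fmt.Println("request failed:", err)
	}
	fmt.Println(string(out))
}
```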
Could we not get the owner token of the workspace and talk to the supervisor API using the IDE URL?
Thanks @csweichel. That sounds like the easier way to check supervisor status. I will use this approach. I disabled the supervisor bit and proceeded to check the agent instrumentation part, where it checks whether the files from the init tasks are created. I saw
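For reference, a hedged sketch of the suggested approach: calling supervisor through the workspace's public URL with the owner token. The workspace URL is a placeholder, and the header used to pass the owner token is an assumption, not a confirmed API; the real mechanism (header vs. cookie) would need checking.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	workspaceURL := "https://ws-87285dbc.ws.example.gitpod-dev.com" // placeholder workspace URL
	ownerToken := "<owner-token-from-server-API>"                   // placeholder

	req, err := http.NewRequest(http.MethodGet, workspaceURL+"/_supervisor/v1/status/tasks", nil)
	if err != nil {
		panic(err)
	}
	// Assumed header name for passing the owner token.
	req.Header.Set("x-gitpod-owner-token", ownerToken)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```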
Hi, this is a workspace network architecture. It may help you to investigate.
I encountered this as well. Debugging...
If we switch back to the core-dev env and run TestRegularWorkspaceTasks against it, there is no more unauthorized error.
Thanks for your remark @jenting. I tried looking at the API server logs of the k3s installation in the preview env but could not see any errors pertaining to the calls that we make in our test. @gitpod-io/platform, would you know why this would happen? I can work together with you to triage the problem. About using the IDE URL: this does not work out of the box and needs more work.
The unauthorized error is indeed affecting other tests as well, e.g. gitpod/test/tests/components/ws-manager/content_test.go, lines 23 to 24 at 301190d.
@princerachit this was marked as done; moving it back to in-progress. Not sure if you meant to mark it as done.
I think it's because I closed this issue by accident 😅 (clicked the wrong button) and forgot to move it back to the in-progress status. Sorry about that.
@princerachit If I comment out this line, the integration test passes on the preview environment without the unauthorized error. So your assumption is correct: there is a difference between the k3s and GKE clusters.
I suspect this is probably because we have a different method of authorization; with core-dev we use
So here is the issue: when the method is set to access-token, we can override the TLSConfig with the connection info. However, when the method is client cert and key data, overwriting the TLSConfig removes the existing creds. Thus, we see the unauthorized error.
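To make the failure mode concrete, here is a minimal sketch using client-go's rest.Config; hosts and credential values are placeholders, the point is what overwriting TLSClientConfig does to each auth method:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/rest"
)

func main() {
	// core-dev style: auth via access token; the TLS config carries no credentials.
	tokenCfg := &rest.Config{
		Host:        "https://core-dev-apiserver:443",
		BearerToken: "<access-token>",
	}

	// preview env (k3s) style: auth via client certificate embedded in the TLS config.
	certCfg := &rest.Config{
		Host: "https://preview-apiserver:6443",
		TLSClientConfig: rest.TLSClientConfig{
			CertData: []byte("<client-cert-pem>"),
			KeyData:  []byte("<client-key-pem>"),
		},
	}

	// What the test helper effectively does today: replace the whole TLS config.
	tokenCfg.TLSClientConfig = rest.TLSClientConfig{Insecure: true}
	certCfg.TLSClientConfig = rest.TLSClientConfig{Insecure: true}

	// BearerToken survives, so core-dev keeps working.
	fmt.Println("token auth still present:", tokenCfg.BearerToken != "")
	// Client cert/key are wiped, so the preview env gets 401 Unauthorized.
	fmt.Println("client cert still present:", len(certCfg.TLSClientConfig.CertData) > 0)
}
```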
We could delete the line, because we are able to use a secure TLS config to interact with the kube-apiserver on the core-dev env as well as on the preview env.
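If deleting the line outright turns out to be too aggressive, a guarded override would also avoid the problem. The helper and package names below are hypothetical; it only sketches the direction:

```go
package integration

import "k8s.io/client-go/rest"

// hardenConfig keeps whatever TLS settings the kubeconfig already provides and
// only replaces them when the credentials do not live in the TLS config.
func hardenConfig(cfg *rest.Config, usingAccessToken bool) {
	if !usingAccessToken {
		// Client-cert kubeconfigs (k3s preview env): leave TLSClientConfig alone,
		// otherwise the cert/key data is wiped and the API server returns 401.
		return
	}
	// Token-based access (core-dev): safe to replace the TLS settings because
	// the credentials are carried in cfg.BearerToken.
	cfg.TLSClientConfig = rest.TLSClientConfig{Insecure: true}
}
```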
Bug description
The integration test TestRegularWorkspaceTasks is broken with the errors
The root cause could be the one analyzed in #8800 (comment).
Steps to reproduce
Tasks
Workspace affected
No response
Expected behavior
No response
Example repository
No response
Anything else?
No response