Issues starting workspaces when DevWorkspace is enabled #20785

Closed
4 tasks done
l0rd opened this issue Nov 16, 2021 · 12 comments
Labels
area/dashboard area/editor/theia Issues related to the che-theia IDE of Che area/plugin-registry area/plugins engine/devworkspace Issues related to Che configured to use the devworkspace controller as workspace engine. kind/bug Outline of a bug - must adhere to the bug report template. severity/P1 Has a major impact to usage or development of the system.

Comments

@l0rd
Contributor

l0rd commented Nov 16, 2021

Describe the bug

I wanted to test the OpenShift connector with DevWorkspace enabled, using the project https://github.com/l0rd/spring-petclinic and a v2 devfile. I have logged here the problems I found.

Problem 1. Factory URL devfilePath parameter is ignored

I am trying with <che-host>/f?url=https://github.com/l0rd/spring-petclinic&devfilePath=.devfilev2.yaml
The issues:

  1. The specified devfile is not loaded
  2. A warning should be shown if the devfilePath is not found
  3. There is no documentation about this parameter

fixed by: the URL should be <che-host>#https://github.com/l0rd/spring-petclinic?devfilePath=.devfilev2.yaml
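
A minimal sketch of the working URL shape, assuming the factory URL is just the Che host, a `#`, the repository URL, and an optional `devfilePath` query parameter appended to the repository URL. The helper name `build_factory_url` and the host `che.example.com` are hypothetical and not part of Che.

```python
def build_factory_url(che_host, repo_url, devfile_path=None):
    """Compose <che-host>#<repo-url>[?devfilePath=<path>] as quoted in this issue."""
    # A trailing slash on the host is dropped so the result matches the quoted form.
    url = f"{che_host.rstrip('/')}#{repo_url}"
    if devfile_path:
        # devfilePath goes after the '#', attached to the repository URL.
        url += f"?devfilePath={devfile_path}"
    return url


if __name__ == "__main__":
    print(build_factory_url(
        "https://che.example.com",  # hypothetical host
        "https://github.com/l0rd/spring-petclinic",
        ".devfilev2.yaml",
    ))
```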

Problem 2. Che-Theia hangs for 10s before showing source code

image

Also, the first project should be expanded and the "open editors" section collapsed.

Problem 3. Misspelled VSX in extensions.json silently fails to load

I spent some time figuring out that I had misspelled a VSX in .vscode/extensions.json.

The issues:

  1. We should inform the user whether .vscode/extensions.json has been found or not
  2. We should inform the user when an extension in .vscode/extensions.json has been downloaded
  3. We should warn the user if an extension in .vscode/extensions.json could not be downloaded
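
The three messages above could be produced along these lines. This is only a sketch, not how Che-Theia loads extensions: it assumes the file is plain JSON (no comments) with a `recommendations` array, and the `fetch` callable that reports whether a download succeeded is a hypothetical stand-in for the real download step.

```python
import json
from pathlib import Path


def report_extensions(project_root, fetch):
    """Return info/warning messages about .vscode/extensions.json.

    `fetch` is a hypothetical callable: fetch(extension_id) -> bool.
    """
    path = Path(project_root) / ".vscode" / "extensions.json"
    if not path.is_file():
        return [f"info: {path} not found"]

    messages = [f"info: {path} found"]
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as err:
        return messages + [f"warning: could not parse {path}: {err}"]

    # Tolerate a file that is valid JSON but not an object.
    recommendations = data.get("recommendations", []) if isinstance(data, dict) else []
    for ext_id in recommendations:
        if fetch(ext_id):
            messages.append(f"info: downloaded {ext_id}")
        else:
            messages.append(f"warning: could not download {ext_id}")
    return messages
```

With a check like this, a misspelled ID would surface as a warning instead of failing silently.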

Problem 4. VSX in extensions.json are not shown as installed Che-Theia plugin

"redhat.java" is in .vscode/extensions.json and is loaded successfully but the list of Che-Theia plugins is empty

Problem 5. There is no link to Che dashboard in Che-Theia

Now that the small yellow arrow is not there anymore, users do not have any link to the dashboard. That's annoying especially if the workspace had been started from a factory link and the user never opened the dashboard.

Problem 6. Che-Theia plugins that include a volume don’t work out of the box

For example redhat.vscode-openshift-connector fails to start because a volume named kube doesn’t exist. The workaround is that the user adds the volume in the devfile.yaml. We should either remove the volume from the plugin or include the volume component definition in the plugin definition.

Problem 7. When 2 workspaces have the same project, the second fails to clone it

To reproduce: start a workspace using a factory link. Stop and delete the workspace. Start a new workspace using the same factory link.

There are no files in the project in Che-Theia.

From the project clone init container:

klo workspace0386015ab9cf4e16-57c9997bcb-5ww7z -c project-clone
2021/11/16 18:03:18 Read DevWorkspace at /devworkspace-metadata/flattened.devworkspace.yaml
2021/11/16 18:03:18 Processing project spring-petclinic
2021/11/16 18:03:18 Project 'spring-petclinic' is already cloned and has all remotes configured

Problem 8. The built-in resource monitor plugin status bar addition is not there

Steps to reproduce

  1. Provision an OpenShift cluster with cluster bot
  2. chectl update next && chectl server:deploy --platform=openshift --workspace-engine=dev-workspace
  3. Login with the che user created by the Che operator
  4. Use https://github.com/l0rd/spring-petclinic
@l0rd l0rd added the kind/bug Outline of a bug - must adhere to the bug report template. label Nov 16, 2021
@benoitf
Contributor

benoitf commented Nov 16, 2021

Url should be

 <che-host>#https://github.com/l0rd/spring-petclinic?devfilePath=.devfilev2.yaml

@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Nov 16, 2021
@RomanNikitenko RomanNikitenko added area/editor/theia Issues related to the che-theia IDE of Che engine/devworkspace Issues related to Che configured to use the devworkspace controller as workspace engine. severity/P1 Has a major impact to usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Nov 17, 2021
@l0rd l0rd added severity/P1 Has a major impact to usage or development of the system. and removed severity/P1 Has a major impact to usage or development of the system. labels Nov 17, 2021
@azatsarynnyy
Member

I think it's rather an epic that touches several areas than a bug.
As I see there're problems related to the dashboard, che-theia, plugins, plugin-registry areas.

@amisevsk
Contributor

Regarding problem 7, I wasn't able to reproduce it in pure DWO, but I also don't know how any other component could have an impact here.

@svor
Contributor

svor commented Nov 21, 2021

Issue about the problem 8 (Resources Monitor plugin): #20800

@benoitf
Contributor

benoitf commented Nov 23, 2021

Problem 7:

I started a workspace using https://<che-host>#https://github.com/che-samples/java-spring-petclinic/tree/devfilev2
stopped and deleted the workspace and then started again using the same factory link
but the project has been cloned as expected

@l0rd can you still reproduce the problem, or share the factory link you used?

@l0rd
Contributor Author

l0rd commented Nov 29, 2021

@benoitf I can reproduce problem n7 using factory link $CHE_HOSTNAME/#https://github.com/l0rd/spring-petclinic on a cluster bot default cluster. I could not reproduce it on minikube.

Project clone init container logs (note that it was the first time I started that workspace):

$ kubectl logs workspaced648d9fa8ab249eb-69569d5768-bwb8w -c project-clone
2021/11/29 18:01:21 Read DevWorkspace at /devworkspace-metadata/flattened.devworkspace.yaml
2021/11/29 18:01:21 Processing project spring-petclinic
2021/11/29 18:01:21 Project 'spring-petclinic' is already cloned and has all remotes configured

@amisevsk
Contributor

@l0rd This could be the result of the project-clone container running twice -- e.g. the flow could be

  1. Project-clone container runs, does some part of the clone process, ends up exiting with nonzero exit code
  2. Project-clone init container is restarted, sees existing folder, and exits 0, allowing workspace to start successfully

The only way I can think of to detect this is potentially either watching logs in the OpenShift UI or repeatedly running kubectl logs <pod-name> -c project-clone as the workspace is starting to try to see the first run's output.
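
The repeated `kubectl logs` approach could be scripted roughly as follows. This is a sketch under assumptions: the function names, attempt count, and delay are arbitrary, and it polls a single pod name, so it only helps while that pod still exists.

```python
import subprocess
import time


def kubectl_logs(pod, container):
    """Run `kubectl logs <pod> -c <container>`; return stdout, or '' on failure."""
    result = subprocess.run(
        ["kubectl", "logs", pod, "-c", container],
        capture_output=True, text=True, check=False,
    )
    return result.stdout if result.returncode == 0 else ""


def capture_log_snapshots(pod, container, fetch=kubectl_logs, attempts=30, delay=1.0):
    """Poll the container logs and keep every distinct non-empty output seen."""
    snapshots = []
    for _ in range(attempts):
        output = fetch(pod, container)
        # Record the output only when it differs from the last snapshot, so the
        # first run's log is kept even if a later run replaces it.
        if output and (not snapshots or snapshots[-1] != output):
            snapshots.append(output)
        time.sleep(delay)
    return snapshots


if __name__ == "__main__":
    # The pod name is a placeholder; substitute the workspace pod being started.
    for snapshot in capture_log_snapshots("<pod-name>", "project-clone"):
        print(snapshot)
        print("-----")
```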

@benoitf
Contributor

benoitf commented Nov 30, 2021

I tried on cluster-bot and was able to reproduce the error.

When the second workspace starts, a pod is created and project-clone is launched, but that pod is then stopped, with this failure in the controller:

"level":"info","ts":1638280304.112614,"logger":"controllers.DevWorkspace","msg":"Error updating workspace status: Operation cannot be fulfilled on devworkspaces.workspace.devfile.io \"spring-petclinic-ys0h\": the object has been modified; please apply your changes to the latest version and try again","Request.Namespace":"che-kube-admin-che-sbgf1v","Request.Name":"spring-petclinic-ys0h","devworkspace_id":"workspace9f736d46dfb94c32"}
{"level":"error","ts":1638280304.11267,"logger":"controller","msg":"Reconciler error","reconcilerGroup":"workspace.devfile.io","reconcilerKind":"DevWorkspace","controller":"devworkspace","name":"spring-petclinic-ys0h","namespace":"che-kube-admin-che-sbgf1v","error":"Operation cannot be fulfilled on devworkspaces.workspace.devfile.io \"spring-petclinic-ys0h\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:218\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:197\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90"}

Then another deployment/pod is created, and it is in this second pod that we see the 'already cloned' message.

@amisevsk
Contributor

amisevsk commented Dec 1, 2021

The listed errors shouldn't impact things, as far as I know (they're just attempting to update out-of-date resources -- it's a no-op from the perspective of the cluster).

If the pod is being created, then terminated and recreated, it sounds like the deployment is changing for some reason. If you can get me credentials to a cluster that reproduces, I can look into it more.

@azatsarynnyy
Member

The issue for investigating problem 2 - #20861

@amisevsk
Contributor

amisevsk commented Dec 6, 2021

For Problem 7, the issue is happening because OpenShift is taking its sweet time provisioning pull secrets for the workspace's service account. The flow is:

  1. Create the service account
  2. Check service account for image pull secrets (there are none)
  3. Create deployment without image pull secrets
  4. Project clone begins to run
  5. OpenShift adds image pull secrets to the service account
  6. DWO detects image pull secrets, adds them to deployment
  7. Deployment rolls out a new pod, killing the old one while it's cloning.

This issue should be fixed by devfile/devworkspace-operator#700
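
The race in the flow above can be illustrated with a toy model. It is a simplification and not the project-clone or DWO code: the shared volume is a plain dict, the "folder exists, so skip" check follows the behaviour described earlier in this thread, and every name except the quoted log message is hypothetical.

```python
ALREADY_CLONED = "Project '{name}' is already cloned and has all remotes configured"


def project_clone(volume, name, killed_midway=False):
    """Toy init container: skip the clone when the project folder already exists."""
    if name in volume:
        return ALREADY_CLONED.format(name=name)
    volume[name] = []  # the folder appears before any files do
    if killed_midway:
        return "pod terminated while cloning"
    volume[name].append("<project files>")  # placeholder for the cloned content
    return "clone finished"


def start_workspace(pull_secrets_arrive_late):
    """Run the first pod, then a second one if the deployment is rolled out again."""
    volume = {}  # stands in for the volume shared by both pods
    outcomes = [project_clone(volume, "spring-petclinic",
                              killed_midway=pull_secrets_arrive_late)]
    if pull_secrets_arrive_late:
        # Pull secrets were added to the deployment, so a new pod replaces the old one.
        outcomes.append(project_clone(volume, "spring-petclinic"))
    return volume, outcomes


if __name__ == "__main__":
    print(start_workspace(pull_secrets_arrive_late=True))
    print(start_workspace(pull_secrets_arrive_late=False))
```

In the late-secrets case the second pod reports the project as already cloned while the folder is still empty, which matches the symptom reported for Problem 7.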

@l0rd
Contributor Author

l0rd commented Feb 20, 2022

Closing this issue. The only remaining problem is n.3, which is related to Theia, but considering that our main attention is currently on VS Code and JetBrains, we should not address it now.

@l0rd l0rd closed this as completed Feb 20, 2022