Eclipse Che Installation in AWS EC2, failure to start a workspace #13690

svkr2k · 2019-07-04T01:31:39Z

Description

I'm trying to run che-server:7.0.0-RC-2.0. che-server installs and runs fine, and I can view the dashboard and the stacks/devfiles.

But when I try to create a workspace, it fails. The logs are given below.

Some of the experts in a forum feel that the Che master pod is unable to communicate with the Theia pod.

On a local PC everything works fine and I am able to create workspaces.
But on AWS EC2 the dashboard comes up, but creating a workspace fails.

Reproduction Steps

minikube start --cpus 2 --memory 4096 --extra-config=apiserver.authorization-mode=RBAC --vm-driver=none
kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
kubectl create serviceaccount tiller --namespace kube-system
kubectl apply -f ./tiller-rbac.yaml
helm init --service-account tiller --wait
minikube addons enable ingress
helm dependencies update --skip-refresh ./
minikube ip
export CHE_DOMAIN=$(minikube ip)
helm upgrade --install che --force --namespace che --set global.ingressDomain=${CHE_DOMAIN}.nip.io --set cheImage=eclipse/che-server:7.0.0-RC-2.0 ./

OS and version:
Ubuntu 18.04
AWS EC2 instance

Diagnostics:
Here is the log:

2019-07-02 13:34:48,148[nio-8080-exec-5]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 433]   - Starting workspace 'che/wksp-0m6t' with id 'workspaceh47e7hjup6tqzgqd' by user 'che'
2019-07-02 13:35:22,846[nio-8080-exec-2]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 145]  - Web socket session error
2019-07-02 13:35:22,847[nio-8080-exec-2]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 129]  - Closing unidentified session
2019-07-02 13:39:18,077[aceSharedPool-1]  [WARN ] [.i.k.KubernetesInternalRuntime 245]  - Failed to start Kubernetes runtime of workspace workspaceh47e7hjup6tqzgqd. Cause: Server 'theia' in machine 'theia-ide3dy' not available.
2019-07-02 13:39:18,834[aceSharedPool-1]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 856]   - Workspace 'che:wksp-0m6t' with id 'workspaceh47e7hjup6tqzgqd' start failed
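
The line that matters in a log like this is the "Failed to start Kubernetes runtime" warning, which names the server that never became available. A minimal sketch for pulling that cause out of a longer log follows; `workspace_failure_causes` is a hypothetical helper name, not part of Che or kubectl, and it assumes the message format shown above.

```shell
# Hypothetical helper: reads che-server log lines on stdin and prints
# "<workspace id>: <cause>" for every failed Kubernetes runtime start.
# Assumes the message format seen in the log excerpt above.
workspace_failure_causes() {
  sed -n 's/.*Failed to start Kubernetes runtime of workspace \([^.]*\)\. Cause: \(.*\)$/\1: \2/p'
}
```

It could be fed from the server log, for example `kubectl logs deployment/che --namespace che | workspace_failure_causes`, assuming the Che server runs as a deployment named `che` in the `che` namespace.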


tsmaeder · 2019-07-04T14:12:14Z

@svkr2k how are you creating and starting the workspace?

benoitf · 2019-07-04T14:20:14Z

I think it's a duplicate of #13435

tsmaeder · 2019-07-05T08:26:24Z

@benoitf Looks likely, yes. The logs here and on the other bug look uninteresting. What additional info could we request that would help us pin down the issue?

tsmaeder · 2019-07-05T15:31:48Z

Tentatively marking this as 7.0.0; a workspace not starting is not good.

l0rd · 2019-07-08T09:19:26Z

I suspect that minikube ip doesn't return the right IP address on EC2: it returns the internal IP, not the public one. See the EC2 user guide for more details.

@skabashnyuk @sleshchenko do you think that this may be the problem? Do you have other ideas about why wsmaster and theia cannot communicate?
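
One way to test this suspicion is to check whether the address returned by `minikube ip` falls in a private (RFC 1918) range, which is not reachable from outside the VPC. The bash sketch below does only that range check; `is_private_ipv4` is a hypothetical helper name, not part of minikube or Che.

```shell
# Hypothetical helper: returns 0 if the IPv4 address is in an RFC 1918
# private range (10/8, 172.16/12, 192.168/16), non-zero otherwise.
is_private_ipv4() {
  local ip="$1" a b
  # Split on dots; only the first two octets are needed for the range check.
  IFS=. read -r a b _ <<< "$ip"
  case "$a" in
    10)  return 0 ;;
    172) [ "$b" -ge 16 ] && [ "$b" -le 31 ] ;;
    192) [ "$b" -eq 168 ] ;;
    *)   return 1 ;;
  esac
}
```

For example, `is_private_ipv4 "$(minikube ip)" && echo "private address"` would flag the situation described above. On EC2 the public address can usually be read from the instance metadata service (`http://169.254.169.254/latest/meta-data/public-ipv4`) when that endpoint is enabled for the instance.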

svkr2k · 2019-07-12T08:43:49Z

> @svkr2k how are you creating and starting the workspace?

@tsmaeder, thank you for your question.
I was able to open http://che-che.ip-address.nip.io and see the Che dashboard.
In the dashboard, I selected a stack from the list and clicked the Create workspace button to create the workspace.

Hope this helps. Thank you for looking into this. I would be really happy to get a solution for the issue.
I see the same issue described above with version 7.0.0-rc-3.0 too.
(Also, kindly add this to the 7.0.0 milestone.)

skabashnyuk · 2019-07-12T09:30:57Z

@svkr2k just to clarify your environment. Are you running Che locally with minikube or on AWS?

andy316x · 2019-07-12T19:38:02Z

I experience this issue when running in multi-user mode. Interestingly, if I deploy Che in single-user mode, Theia loads successfully, but it is then unable to create terminals. I am also using Che on AWS.

benoitf · 2019-07-12T20:33:04Z

@andy316x which browser are you using for terminals? There is an issue with Firefox.

andy316x · 2019-07-14T17:00:36Z

@benoitf I am using Chrome, so I'm not sure that is the problem.

svkr2k · 2019-07-16T03:38:19Z

@skabashnyuk,

> @svkr2k just to clarify your environment. Are you running Che locally with minikube or on AWS?

I'm running Che on AWS (Ubuntu 16.04) in single-user mode, and I created the workspace using one of the stacks/devfiles already available in the dashboard.

(Kindly note that when I run Che locally on an Ubuntu or Windows PC, it works fine.)

skabashnyuk · 2019-07-16T06:25:37Z

@svkr2k I have no AWS EC2 environment available to test. Can you try to install che with https://github.com/che-incubator/chectl ?

svkr2k · 2019-07-16T08:14:16Z

> @svkr2k just to clarify your environment. Are you running Che locally with minikube or on AWS?

Hi @skabashnyuk, to clarify further: I'm running Che on AWS "with minikube".

svkr2k · 2019-07-16T08:39:56Z

Hi all,
I have a question regarding the above issue:
While creating a workspace from the dashboard, will the Che host try to communicate with the workspace container using the .nip.io URL?
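
For context, nip.io is a wildcard DNS service: a name such as `che-che.10.9.2.247.nip.io` is meant to resolve to the IP address embedded in it. The bash sketch below only extracts that embedded address so it can be compared with what DNS actually returns on the host; `nip_io_ip` is a hypothetical helper name, and the sketch does not show which URLs Che itself uses internally.

```shell
# Hypothetical helper: prints the IPv4 address embedded in a dotted-form
# nip.io hostname, or returns 1 if the name does not contain one.
nip_io_ip() {
  local host="$1" ip
  ip=$(printf '%s\n' "$host" |
    sed -nE 's/^(.*\.)?([0-9]{1,3}(\.[0-9]{1,3}){3})\.nip\.io$/\2/p')
  [ -n "$ip" ] && printf '%s\n' "$ip"
}
```

If a lookup such as `getent hosts che-che.10.9.2.247.nip.io` returns nothing, or an address different from `nip_io_ip che-che.10.9.2.247.nip.io`, that would suggest nip.io names are being blocked or rewritten on the network.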

skabashnyuk · 2019-07-16T09:27:26Z

@svkr2k can you try installing Che the way @l0rd suggested here? #13838 (comment)

svkr2k · 2019-07-16T16:04:38Z

@skabashnyuk, sure, I shall try that: #13838 (comment)

Also, please note that on the AWS EC2 instance, I start minikube using the following:
sudo minikube start --memory=4096 --vm-driver=none

svkr2k · 2019-07-16T16:19:34Z

> @svkr2k can you try installing Che the way @l0rd suggested here? #13838 (comment)

Thank you very much for providing inputs.

Tried to run che using the following commands:

sudo minikube start --memory=4096 --vm-driver=none

sudo kubectl create secret generic che-tls [email protected]

sudo chectl server:start --installer=helm --multiuser --platform=k8s --tls --self-signed-cert --domain=10.9.2.247.nip.io --cheimage=eclipse/che-server:7.0.0-rc-3.0

Here is the error I got:

~$ sudo chectl server:start --installer=helm --multiuser --platform=k8s --tls --self-signed-cert --domain=10.9.2.247.nip.io --cheimage=eclipse/che-server:7.0.0-rc-3.0
sudo: unable to resolve host ip-10-9-2-247
  ✔ ✈️  Kubernetes preflight checklist
    ✔ Verify if kubectl is installed
    ✔ Verify remote kubernetes status...done.
    ✔ Verify domain is set...set to 10.9.2.247.nip.io.
  ❯ 🏃‍  Running Helm to install Che
    ✔ Verify if helm is installed
    ✔ Check for TLS prerequisites che-tls secret exist.
    ✔ Create Tiller Role Binding...it already exist.
    ✔ Create Tiller Service Account...it already exist.
    ✔ Create Tiller RBAC
    ✔ Create Tiller Service...it already exist.
    ✔ Preparing Che Helm Chart...done.
    ✔ Updating Helm Chart dependencies...done.
    ✖ Deploying Che Helm Chart
      → Unable to execute helm command helm upgrade --install che --force --namespace che --set global.ingressDomain=10.9.2.247.nip.io --set global.cheDomain=10.9.2.247.nip.io --set cheImage=eclipse/che-server:7.0.0-rc-3.0 --set global.c
…
Error: Unable to execute helm command helm upgrade --install che --force --namespace che --set global.ingressDomain=10.9.2.247.nip.io --set global.cheDomain=10.9.2.247.nip.io --set cheImage=eclipse/che-server:7.0.0-rc-3.0 --set global.cheWorkspacesNamespace=che -f /home/ubuntu/.cache/chectl/templates/kubernetes/helm/che/values/multi-user.yaml -f /home/ubuntu/.cache/chectl/templates/kubernetes/helm/che/values/tls.yaml /home/ubuntu/.cache/chectl/templates/kubernetes/helm/che/ / Error: validation failed: [unable to recognize "": no matches for kind "Certificate" in version "certmanager.k8s.io/v1alpha1", unable to recognize "": no matches for kind "ClusterIssuer" in version "certmanager.k8s.io/v1alpha1"]
    at HelmHelper.<anonymous> (/snapshot/chectl/lib/installers/helm.js:0:0)
    at Generator.next (<anonymous>)
    at fulfilled (/snapshot/chectl/node_modules/tslib/tslib.js:107:62)

Additional information:
sudo helm version
Client: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
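
The "no matches for kind ... in version ..." part of the Helm error means the cluster's API server does not know those resource types, which for `certmanager.k8s.io` kinds typically means the cert-manager CRDs are not installed. A small sketch for listing the missing kinds from such an error message follows; `missing_kinds` is a hypothetical helper name, and it assumes the error wording shown above.

```shell
# Hypothetical helper: reads Helm/kubectl error text on stdin and prints one
# "<apiVersion> <kind>" line per resource type the cluster did not recognize.
missing_kinds() {
  grep -oE 'no matches for kind "[^"]+" in version "[^"]+"' |
    sed -E 's/no matches for kind "([^"]+)" in version "([^"]+)"/\2 \1/' |
    sort -u
}
```

Whether the corresponding CRDs exist can then be checked with `kubectl get crd | grep certmanager.k8s.io`.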

benoitf · 2019-07-16T20:13:46Z

$ kubectl create namespace cert-manager
$ kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
$ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v0.8.1/cert-manager.yaml --validate=false

svkr2k · 2019-07-17T01:23:41Z

Greetings, @benoitf, @skabashnyuk. Thank you very much!

Here are the steps I followed:

$ sudo minikube start --memory=4096 --vm-driver=none
$ sudo kubectl create secret generic che-tls [email protected]

$ kubectl create namespace cert-manager
$ kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
$ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v0.8.1/cert-manager.yaml --validate=false

$ sudo chectl server:start --installer=helm --multiuser --platform=k8s --tls --self-signed-cert --domain=10.9.2.247.nip.io --cheimage=eclipse/che-server:7.0.0-rc-3.0

Here are the other errors that I got:

~$ sudo chectl server:start --installer=helm --multiuser --platform=k8s --tls --self-signed-cert --domain=10.9.2.247.nip.io --cheimage=eclipse/che-server:7.0.0-rc-3.0
sudo: unable to resolve host ip-10-9-2-247
  ✔ ✈️  Kubernetes preflight checklist
    ✔ Verify if kubectl is installed
    ✔ Verify remote kubernetes status...done.
    ✔ Verify domain is set...set to 10.9.2.247.nip.io.
  ✔ 🏃‍  Running Helm to install Che
    ✔ Verify if helm is installed
    ✔ Check for TLS prerequisites che-tls secret exist.
    ✔ Create Tiller Role Binding...it already exist.
    ✔ Create Tiller Service Account...it already exist.
    ✔ Create Tiller RBAC
    ✔ Create Tiller Service...it already exist.
    ✔ Preparing Che Helm Chart...done.
    ✔ Updating Helm Chart dependencies...done.
    ✔ Deploying Che Helm Chart...done.
  ❯ ✅  Post installation checklist
    ❯ PostgreSQL pod bootstrap
      ✖ scheduling
        → ERR_TIMEOUT: Timeout set to pod wait timeout 300000. podExist: false, currentPhase: undefined
        downloading images
        starting
      Keycloak pod bootstrap
      Che pod bootstrap
      Retrieving Che Server URL
      Che status check
Error: ERR_TIMEOUT: Timeout set to pod wait timeout 300000. podExist: false, currentPhase: undefined
    at KubeHelper.<anonymous> (/snapshot/chectl/lib/api/kube.js:0:0)
    at Generator.next (<anonymous>)
    at fulfilled (/snapshot/chectl/node_modules/tslib/tslib.js:107:62)

~$ sudo kubectl get pod --namespace che
sudo: unable to resolve host ip-10-9-2-247
NAME                        READY   STATUS    RESTARTS   AGE
che-67777bbb5b-r6r4h        0/1     Running   21         102m
keycloak-5cf455c44d-lx4hl   1/1     Running   0          102m
postgres-6c4d6c764c-5vqw2   1/1     Running   0          102m

~$ sudo kubectl get events --namespace che  -o custom-columns=TIMESTAMP:lastTimestamp,TYPE:type,MESSAGE:message -w
sudo: unable to resolve host ip-10-9-2-247
TIMESTAMP              TYPE      MESSAGE
2019-07-17T04:58:26Z   Normal    Pulling image "eclipse/che-server:7.0.0-rc-3.0"
2019-07-17T05:13:56Z   Warning   Readiness probe failed: HTTP probe failed with statuscode: 500
2019-07-17T04:38:25Z   Warning   Liveness probe failed: HTTP probe failed with statuscode: 500
2019-07-17T05:08:31Z   Warning   Back-off restarting failed container

svkr2k · 2019-07-17T15:33:12Z

Hi @benoitf, @skabashnyuk.
I believe that the URLs used for communicating with the containers could be based on .nip.io.
As an alternative, could you kindly send me a link to the documentation or the steps to set up Che to use domain-name-based URLs instead of .nip.io?
Something similar to "http://che-che.myapp.mydomain.com"?
Kindly let me know how to set it up like that, so that I can test that too. Thank you for your help.

skabashnyuk · 2019-07-18T07:13:18Z

@svkr2k I believe this is the -b parameter of chectl: https://www.eclipse.org/che/docs/che-7/che-quick-starts.html

skabashnyuk · 2019-07-18T07:15:24Z

@benoitf when you ran your test on AWS, did you use minikube as well?

l0rd · 2019-07-23T16:35:28Z

@svkr2k when you first opened this issue you tried to install single user Che with TLS disabled. At some point though you tested with TLS and multiuser. Have you been able to solve your original problem (Che single user without tls on AWS)?

I am asking because we have some open issues about deploying Che with TLS using helm charts (hence this would be a duplicate) and I would close this one if your original issue was fixed.

l0rd · 2019-07-23T16:37:53Z

@rhopp @slemeur @tsmaeder @nickboldt I have removed the priority and the milestone here because we don't know yet if that's a duplicate, an issue that has been solved or a brand new one.

svkr2k · 2019-07-24T05:14:41Z

Hi @l0rd, thank you.
The original issue was due to the .nip.io URLs being blocked in my domain. Now that we have unblocked nip.io-based URLs, I'm able to create a workspace.

But my goal is to use a hostname-based URL for Che. We could achieve this using default-host.yaml when running helm upgrade:
sudo helm upgrade --install che --namespace che -f ./values/default-host.yaml --set global.ingressDomain=myapp.mydomain.com ./

It would be nice to get the recommended practice for hosting Che in single-user and multi-user modes on AWS. When we used a host-based URL, I could not create the workspace because communication with the created workspace container failed.

l0rd · 2019-07-24T08:14:12Z

@svkr2k what you are describing is an existing issue about default-host/single-host: #12971. Unfortunately that's something that we haven't solved yet. We will need a few weeks to fix it properly.

But, if multi-host is an option (i.e. you are ok to use a wildcard SSL certificate), it's possible to configure it with a workaround. We have an ongoing PR to avoid the workaround and make it easier to deploy Che on AWS and GCP: #12971.

And in the meantime we are writing the documentation as well but we are kind of blocked by the issues above.

mshaposhnik · 2019-07-29T10:42:34Z

So I'm closing the current issue, since it will be fixed by #12971.

tsmaeder added the status/info-needed label Jul 4, 2019

tsmaeder added this to the 7.0.0 milestone Jul 5, 2019

l0rd added the status/analyzing and team/platform labels and removed the status/info-needed label Jul 8, 2019

l0rd removed this from the 7.0.0 milestone Jul 8, 2019

l0rd assigned skabashnyuk Jul 8, 2019

l0rd added the severity/P1 label Jul 8, 2019

sunix mentioned this issue Jul 15, 2019: Theia IDE fails on OpenShift Origin #13435 (closed)

slemeur added this to the 7.0.0 milestone Jul 16, 2019

slemeur added the kind/bug label Jul 16, 2019

l0rd mentioned this issue Jul 16, 2019: Che 7.0.0 Endgame Plan #13637 (closed, 85 tasks)

l0rd added the status/info-needed label Jul 23, 2019

l0rd removed this from the 7.0.0 milestone Jul 23, 2019

l0rd removed the severity/P1 label Jul 23, 2019

skabashnyuk mentioned this issue Jul 25, 2019: Platform-2019-08-13 (Sprint: 170) #13847 (closed, 25 tasks)

mshaposhnik closed this as completed Jul 29, 2019
