[Bug] Executing kyuubi status returns 'Kyuubi is not running' in docker container #4515

Open
2 of 4 tasks
dnskr opened this issue Mar 14, 2023 · 8 comments
Labels
kind:bug This is a clearly a bug priority:major

Comments

@dnskr
Contributor

dnskr commented Mar 14, 2023

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

I'm using the official docker image and helm chart to deploy and manage Kyuubi on Kubernetes.
I found that bin/kyuubi status returns 'Kyuubi is not running' even though the instance is actually up and running.

The PID file should contain 15 (the PID of the running java process), but for the main instance it contains 1301:

kyuubi@kyuubi-5b6784d6b4-gf2d4:/opt/kyuubi$ bin/kyuubi status
Warn: Not find kyuubi environment file /opt/kyuubi/conf/kyuubi-env.sh, using default ones...
Kyuubi is not running

kyuubi@kyuubi-5b6784d6b4-gf2d4:/opt/kyuubi$ ls -la pid
total 16
drwxrwxr-x 1 root   root 4096 Mar 14 19:30 .
drwxrwxr-x 1 root   root 4096 Mar 14 19:41 ..
-rw-r--r-- 1 kyuubi root    5 Mar 14 19:30 kyuubi--org.apache.kyuubi.server.KyuubiServer.pid

kyuubi@kyuubi-5b6784d6b4-gf2d4:/opt/kyuubi$ cat pid/kyuubi--org.apache.kyuubi.server.KyuubiServer.pid 
1301

kyuubi@kyuubi-5b6784d6b4-gf2d4:/opt/kyuubi$ ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
kyuubi         1  0.0  0.0   7112  3388 ?        Ss   19:23   0:00 bash ./bin/kyuubi run
kyuubi        15  1.0 17.2 3933096 1020332 ?     Sl   19:23   0:26 /opt/java/openjdk/bin/java -cp /opt/kyuubi/jars/*:/opt/kyuubi/conf: org.apache.kyuubi.server.KyuubiServer
kyuubi      7952  0.0  0.0   7244  3956 pts/0    Ss   20:04   0:00 bash
kyuubi      8135  0.0  0.0   8896  3184 pts/0    R+   20:05   0:00 ps aux

kyuubi@kyuubi-5b6784d6b4-gf2d4:/opt/kyuubi$ ps -p 15
    PID TTY          TIME CMD
     15 ?        00:00:33 java

For the secondary instance there is no PID file at all:

kyuubi@kyuubi-5b6784d6b4-wpc54:/opt/kyuubi$ bin/kyuubi status
Warn: Not find kyuubi environment file /opt/kyuubi/conf/kyuubi-env.sh, using default ones...
Kyuubi is not running

kyuubi@kyuubi-5b6784d6b4-wpc54:/opt/kyuubi$ ls -la pid
total 12
drwxrwxr-x 1 root root 4096 Mar  8 11:21 .
drwxrwxr-x 1 root root 4096 Mar 14 19:23 ..
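
Both failures above (a stale PID in the file, and no file at all) would trip any check that trusts the PID file. As a minimal sketch, assuming the status command does something along these lines (this is not copied from bin/kyuubi, and the function name pid_file_is_live is made up for illustration):

```shell
# Hypothetical helper, not taken from bin/kyuubi.
# Succeeds only if the PID file exists, holds a number, and that PID is alive.
pid_file_is_live() {
  pid_file="$1"
  # No PID file at all (the secondary instance above).
  [ -f "$pid_file" ] || return 1
  pid="$(cat "$pid_file")"
  # Reject empty or non-numeric content.
  case "$pid" in
    ''|*[!0-9]*) return 1 ;;
  esac
  # kill -0 sends no signal; it only tests whether the process can be signalled.
  kill -0 "$pid" 2>/dev/null
}
```

In the main instance above the file holds 1301, and no process with that PID appears in the ps output, so a check of this kind fails even though the server is running as PID 15.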

bin/kyuubi status is needed for liveness and readiness probes, so I would highly appreciate an alternative solution as well.
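
One possible alternative for a probe, offered as a sketch rather than a recommendation from the maintainers, is to skip the PID file and look the process up directly. The example below scans /proc (so it assumes Linux) for a process whose executable name and command line both match. The function name process_running is hypothetical.

```shell
# Hypothetical helper: succeeds if some process has executable name $1
# and a command line containing the fixed string $2. Assumes Linux /proc.
process_running() {
  want_comm="$1"
  want_arg="$2"
  for dir in /proc/[0-9]*; do
    [ -r "$dir/comm" ] || continue
    # comm holds the executable name, e.g. "java".
    [ "$(cat "$dir/comm" 2>/dev/null)" = "$want_comm" ] || continue
    # cmdline is NUL-separated; turn NULs into spaces before matching.
    if tr '\0' ' ' < "$dir/cmdline" 2>/dev/null | grep -qF -- "$want_arg"; then
      return 0
    fi
  done
  return 1
}
```

A probe command could then call process_running java org.apache.kyuubi.server.KyuubiServer. Note that this only confirms the process exists; it does not show that the server is accepting connections.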

Affects Version(s)

1.7.0

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
dnskr added the kind:bug and priority:major labels Mar 14, 2023
@github-actions

Hello @dnskr,
Thanks for finding the time to report the issue!
We really appreciate the community's efforts to improve Apache Kyuubi.

@pan3793
Member

pan3793 commented Mar 17, 2023

@dnskr I noticed the same issue today. One idea is to use supervisord to manage the kyuubi process, so that we can use supervisorctl status <name> to check health.

@pan3793
Member

pan3793 commented Mar 26, 2023

@dnskr WDYT about the supervisord idea?

@dnskr
Contributor Author

dnskr commented Mar 26, 2023

@pan3793 Unfortunately I'm not familiar with supervisord, so I can't offer an informed opinion.

I found that kyuubi status works as expected if kyuubi start is used instead of the default kyuubi run,
i.e. if the helm chart uses the following command:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  template:
    spec:
      containers:
        - name: kyuubi-server
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          command: ["/bin/bash", "-c", "bin/kyuubi start && tail -f ${KYUUBI_LOG_DIR}/kyuubi-*.out"]

With this command, kyuubi status returns the correct result for both pods:

kyuubi@kyuubi-78df679b9c-fn7pg:/opt/kyuubi$ bin/kyuubi status
Warn: Not find kyuubi environment file /opt/kyuubi/conf/kyuubi-env.sh, using default ones...
Kyuubi is running (pid: 16)

kyuubi@kyuubi-78df679b9c-vgbr8:/opt/kyuubi$ bin/kyuubi status
Warn: Not find kyuubi environment file /opt/kyuubi/conf/kyuubi-env.sh, using default ones...
Kyuubi is running (pid: 16)

@pan3793
Member

pan3793 commented Mar 26, 2023

... I'm not familiar with supervisord ...

Basically, supervisord is similar to systemd. Here is an example:

https://github.com/trinodb/docker-images/tree/master/testing/cdh5.15-hive

command: ["/bin/bash", "-c", "bin/kyuubi start && tail -f ${KYUUBI_LOG_DIR}/kyuubi-*.out"]

It's kind of a workaround.

@dnskr
Contributor Author

dnskr commented Mar 26, 2023

Do you think it's a good idea to create a PR for the helm chart with the workaround above?
Not sure I can fix the issue properly.

@pan3793
Member

pan3793 commented Mar 26, 2023

Let me try the supervisord approach (maybe in one or two days) and evaluate it then.

@Madhukar98

I observed that the PID of a Kyuubi server launched in the foreground is not stored, so I tried to address this in a naive way. Could someone please review this?
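
For readers following along, one way to record the PID of a foreground launch is sketched below. This is not the code from the referenced commits, which are not shown in this thread. The function name run_and_record_pid and the PID file path in the usage note are assumptions made for illustration.

```shell
# Hypothetical helper: start a command, record its PID, then wait for it
# so the calling script stays in the foreground.
run_and_record_pid() {
  pid_file="$1"
  shift
  mkdir -p "$(dirname "$pid_file")"
  # Start the command as a child and capture its PID via $!.
  "$@" &
  child_pid=$!
  echo "$child_pid" > "$pid_file"
  # Block until the child exits and return its exit status.
  wait "$child_pid"
}
```

A launcher could call it as run_and_record_pid pid/server.pid followed by the java command line. The sketch does not forward signals to the child and does not remove the PID file on exit, both of which a real fix would likely need to handle.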

Madhukar525722 pushed a commit to Madhukar525722/kyuubi that referenced this issue Oct 26, 2024
Madhukar98 added a commit to Madhukar98/kyuubi that referenced this issue Oct 26, 2024
Madhukar98 added a commit to Madhukar98/kyuubi that referenced this issue Oct 26, 2024