
Java object read hangs reading file from persistent volume mounted through minikube #3288

Closed
ejschoen opened this issue Oct 31, 2018 · 10 comments
Labels
area/mount cause/go9p-limitation Issues related to our go9p implementation kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. os/linux priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. r/2019q2 Issue was last reviewed 2019q2

Comments

@ejschoen

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Please provide the following details:

Environment:

Minikube version : v0.30.0

  • OS : Debian Stretch
  • VM Driver : virtualbox
  • ISO version : v0.30.0
  • Install tools:
  • Others: root@i2kcontext-796c856b5-rdpb8:/# java -version
    openjdk version "1.8.0_181"
    OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
    OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

Container is based on openjdk:8.

What happened:
This is strange. I have a Java app that, as part of initialization, reads a large data structure from a file that was previously serialized using Java Object I/O. The app runs in a container with the file presented to it through a volume mount. Outside of the container, I have a process running minikube mount, and the appropriate file system is correctly mounted via a PersistentVolume and PersistentVolumeClaim. I can see the file in the container's file system, and it has the correct permissions and checksum. However, my Java app hangs while reading the data structure from the file. Here's the stack from jstack. As far as I can tell, it never progresses past the read0 call, although I can't determine whether it's the first read0 call or a later one. The process doesn't appear to be consuming any significant CPU.

"main" #1 prio=5 os_prio=0 tid=0x00007fb7d000a800 nid=0x7 runnable [0x00007fb7da
12c000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.read0(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:207)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java
:2641)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream
.java:2948)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputSt
ream.java:2958)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1738)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2
042)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2568)
at java.util.TreeMap.buildFromSorted(TreeMap.java:2508)
at java.util.TreeMap.readObject(TreeMap.java:2454)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
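For context, the read path is nothing exotic; it is the standard Java serialization idiom, essentially an ObjectInputStream wrapped around a FileInputStream on the mounted file. A minimal sketch of that kind of read (the path, class name, and map types here are illustrative placeholders, not our actual application code):

import java.io.FileInputStream;
import java.io.ObjectInputStream;
import java.util.TreeMap;

public class LoadFromMount {
    public static void main(String[] args) throws Exception {
        // Illustrative path on the mounted volume (not the real file name).
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream("/i2kcfg/model.ser"))) {
            // readObject() is where the hang appears: jstack shows the thread
            // stuck in FileInputStream.read0 underneath TreeMap.readObject.
            TreeMap<?, ?> data = (TreeMap<?, ?>) in.readObject();
            System.out.println("Loaded " + data.size() + " entries");
        }
    }
}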

Now, if I shell into the container, copy the file to /tmp, and load it from there, it loads fine inside the container. So (a) the file itself is copyable, and (b) the problem seems specific to the way this Java code reads from the volume mount through the minikube VirtualBox mount point. If I build the same file into the image via the Dockerfile, that file also reads in fine.
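For completeness, here is a sketch of the copy-then-read sequence that does complete inside the same container (paths are again illustrative); the only difference from the hanging case is that deserialization runs against a local copy in /tmp rather than directly against the 9p mount:

import java.io.ObjectInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class LoadViaLocalCopy {
    public static void main(String[] args) throws Exception {
        // Illustrative paths: the source sits on the minikube mount,
        // the copy goes to local container storage in /tmp.
        Path mounted = Paths.get("/i2kcfg/model.ser");
        Path local = Paths.get("/tmp/model.ser");

        // Copying the file off the mount works (and the checksum matches)...
        Files.copy(mounted, local, StandardCopyOption.REPLACE_EXISTING);

        // ...and deserializing the local copy completes normally, whereas the
        // same readObject() call directly against the mounted path hangs.
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(local))) {
            Object data = in.readObject();
            System.out.println("Loaded: " + data.getClass().getName());
        }
    }
}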

What you expected to happen:
Input via Java ObjectInputStream would complete in a timely fashion.

How to reproduce it (as minimally and precisely as possible):
I think it would take some effort to package up the code to reproduce this. I'm not sure it could be made to happen easily with a simple input test, as I don't see any other cases of reading from mounted files hanging. However, I am happy to try to package up a repro if needed.

Output of minikube logs (if applicable):

Oct 31 13:42:30 minikube kubelet[2858]: I1031 13:42:30.942484 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "service-account-creds" (UniqueName: "kubernetes.io/secret/d0514fd3-dd12-11e8-9f35-080027b74f1f-service-account-creds") pod "i2kworkers-54bcbb8647-xd4qp" (UID: "d0514fd3-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:31 minikube kubelet[2858]: I1031 13:42:31.042715 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "i2kcfg-pv" (UniqueName: "kubernetes.io/host-path/d0514fd3-dd12-11e8-9f35-080027b74f1f-i2kcfg-pv") pod "i2kworkers-54bcbb8647-xd4qp" (UID: "d0514fd3-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:31 minikube kubelet[2858]: I1031 13:42:31.042790 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-86gvw" (UniqueName: "kubernetes.io/secret/d0514fd3-dd12-11e8-9f35-080027b74f1f-default-token-86gvw") pod "i2kworkers-54bcbb8647-xd4qp" (UID: "d0514fd3-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:31 minikube kubelet[2858]: I1031 13:42:31.749672 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-86gvw" (UniqueName: "kubernetes.io/secret/d04e02fb-dd12-11e8-9f35-080027b74f1f-default-token-86gvw") pod "i2kweb-5cbccd559c-fd4zl" (UID: "d04e02fb-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:31 minikube kubelet[2858]: I1031 13:42:31.749826 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "i2kcfg-pv" (UniqueName: "kubernetes.io/host-path/d04e02fb-dd12-11e8-9f35-080027b74f1f-i2kcfg-pv") pod "i2kweb-5cbccd559c-fd4zl" (UID: "d04e02fb-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:32 minikube kubelet[2858]: W1031 13:42:32.632280 2858 pod_container_deletor.go:77] Container "ab92edf3fd5862842236d176750c8825009f642eaa8480d5368d8e7d06e03bb7" not found in pod's containers
Oct 31 13:42:32 minikube kubelet[2858]: I1031 13:42:32.861477 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-86gvw" (UniqueName: "kubernetes.io/secret/d047a2bd-dd12-11e8-9f35-080027b74f1f-default-token-86gvw") pod "i2ksource-776bd7d5f9-6x5sn" (UID: "d047a2bd-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:32 minikube kubelet[2858]: I1031 13:42:32.861513 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "i2kcfg-pv" (UniqueName: "kubernetes.io/host-path/d047a2bd-dd12-11e8-9f35-080027b74f1f-i2kcfg-pv") pod "i2ksource-776bd7d5f9-6x5sn" (UID: "d047a2bd-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:32 minikube kubelet[2858]: W1031 13:42:32.940001 2858 container.go:393] Failed to create summary reader for "/system.slice/run-rfe08f2a43dc6418b983852ca5058dd32.scope": none of the resources are being tracked.
Oct 31 13:42:32 minikube kubelet[2858]: W1031 13:42:32.940193 2858 container.go:393] Failed to create summary reader for "/system.slice/run-r95351b62bccb46ddb51632403d1a3aa6.scope": none of the resources are being tracked.
Oct 31 13:42:32 minikube kubelet[2858]: W1031 13:42:32.940278 2858 container.go:393] Failed to create summary reader for "/system.slice/run-r65f5a980289346a5a723503cbaf36af6.scope": none of the resources are being tracked.
Oct 31 13:42:32 minikube kubelet[2858]: W1031 13:42:32.940355 2858 container.go:393] Failed to create summary reader for "/system.slice/run-r1777c7fc4a00428795281434ff9ba71e.scope": none of the resources are being tracked.
Oct 31 13:42:33 minikube kubelet[2858]: I1031 13:42:33.868570 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-86gvw" (UniqueName: "kubernetes.io/secret/d049e46d-dd12-11e8-9f35-080027b74f1f-default-token-86gvw") pod "i2kconduit-db-76ffdd776c-wfstq" (UID: "d049e46d-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:33 minikube kubelet[2858]: I1031 13:42:33.868638 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "i2kcfg-pv" (UniqueName: "kubernetes.io/host-path/d049e46d-dd12-11e8-9f35-080027b74f1f-i2kcfg-pv") pod "i2kconduit-db-76ffdd776c-wfstq" (UID: "d049e46d-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:33 minikube kubelet[2858]: W1031 13:42:33.918736 2858 pod_container_deletor.go:77] Container "4a8037a08c45e89b2cdc0f6664394ed00edf3453a5bc64f47baddf3f727bc8e2" not found in pod's containers
Oct 31 13:42:34 minikube kubelet[2858]: I1031 13:42:34.878400 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-86gvw" (UniqueName: "kubernetes.io/secret/d05ccc32-dd12-11e8-9f35-080027b74f1f-default-token-86gvw") pod "i2kcontext-796c856b5-rdpb8" (UID: "d05ccc32-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:34 minikube kubelet[2858]: I1031 13:42:34.878424 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "i2kcfg-pv" (UniqueName: "kubernetes.io/host-path/d05ccc32-dd12-11e8-9f35-080027b74f1f-i2kcfg-pv") pod "i2kcontext-796c856b5-rdpb8" (UID: "d05ccc32-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:34 minikube kubelet[2858]: W1031 13:42:34.992057 2858 container.go:393] Failed to create summary reader for "/system.slice/run-re6f56c27c2de46498e7e5ce72a756bc1.scope": none of the resources are being tracked.
Oct 31 13:42:35 minikube kubelet[2858]: I1031 13:42:35.886929 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-d04134fc-dd12-11e8-9f35-080027b74f1f" (UniqueName: "kubernetes.io/host-path/d047582c-dd12-11e8-9f35-080027b74f1f-pvc-d04134fc-dd12-11e8-9f35-080027b74f1f") pod "i2kmariadb-56978b4c77-dmf8q" (UID: "d047582c-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:36 minikube kubelet[2858]: I1031 13:42:36.690085 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "i2kcfg-pv" (UniqueName: "kubernetes.io/host-path/d047582c-dd12-11e8-9f35-080027b74f1f-i2kcfg-pv") pod "i2kmariadb-56978b4c77-dmf8q" (UID: "d047582c-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:36 minikube kubelet[2858]: I1031 13:42:36.690170 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-86gvw" (UniqueName: "kubernetes.io/secret/d047582c-dd12-11e8-9f35-080027b74f1f-default-token-86gvw") pod "i2kmariadb-56978b4c77-dmf8q" (UID: "d047582c-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:39 minikube kubelet[2858]: I1031 13:42:39.101731 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-d042b186-dd12-11e8-9f35-080027b74f1f" (UniqueName: "kubernetes.io/host-path/d0475855-dd12-11e8-9f35-080027b74f1f-pvc-d042b186-dd12-11e8-9f35-080027b74f1f") pod "i2ksearch-849fcdcf78-rxzp8" (UID: "d0475855-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:39 minikube kubelet[2858]: I1031 13:42:39.904880 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-86gvw" (UniqueName: "kubernetes.io/secret/d0475855-dd12-11e8-9f35-080027b74f1f-default-token-86gvw") pod "i2ksearch-849fcdcf78-rxzp8" (UID: "d0475855-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:42:39 minikube kubelet[2858]: I1031 13:42:39.904915 2858 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "i2kcfg-pv" (UniqueName: "kubernetes.io/host-path/d0475855-dd12-11e8-9f35-080027b74f1f-i2kcfg-pv") pod "i2ksearch-849fcdcf78-rxzp8" (UID: "d0475855-dd12-11e8-9f35-080027b74f1f")
Oct 31 13:56:21 minikube kubelet[2858]: I1031 13:56:21.769608 2858 kuberuntime_manager.go:513] Container {Name:kubernetes-dashboard Image:k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0 Command:[] Args:[] WorkingDir: Ports:[{Name: HostPort:0 ContainerPort:9090 Protocol:TCP HostIP:}] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:default-token-krks7 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/,Port:9090,Host:,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:30,TimeoutSeconds:30,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Oct 31 13:56:21 minikube kubelet[2858]: I1031 13:56:21.769882 2858 kuberuntime_manager.go:757] checking backoff for container "kubernetes-dashboard" in pod "kubernetes-dashboard-6f4cfc5d87-kj8fr_kube-system(b9c41328-dcbf-11e8-9f35-080027b74f1f)"

Anything else we need to know:

@tstromberg tstromberg added co/virtualbox os/linux kind/bug Categorizes issue or PR as related to a bug. area/mount labels Nov 6, 2018
@tstromberg tstromberg added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Jan 23, 2019
@tstromberg
Contributor

tstromberg commented Jan 23, 2019

This bug sounds unfortunate. Do you mind posting some step-by-step instructions on how to replicate the issue?

@tstromberg tstromberg added the triage/needs-information Indicates an issue needs more information in order to work on it. label Jan 23, 2019
@ejschoen
Author

I think I'd have to try to create a repro case that's independent of our application; I'll put it on my to-do list. Since October, I've moved over to full kubelet 1.13.1 to support a 2-node cluster, and I don't have any issues reading this file through kubelet's local volume mount.

@ejschoen
Author

ejschoen commented Jan 24, 2019

I got around to creating something of an isolated example. This still happens with minikube 0.33.1. To reproduce, unpack the attached tar file and follow the README instructions (deploy minikube and start it with the VirtualBox VM driver, build the Docker image, run minikube mount, deploy the Kubernetes configuration, and watch the java process in the pod's container hang). You can see the hang by running jstack on the java process in the minikubehang container.

minikubehang.tar.gz

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 29, 2019
@tstromberg
Contributor

This will likely require #4325

@tstromberg tstromberg added r/2019q2 Issue was last reviewed 2019q2 and removed co/virtualbox lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 22, 2019
@tstromberg tstromberg removed the triage/needs-information Indicates an issue needs more information in order to work on it. label Jul 17, 2019
@tstromberg tstromberg added the cause/go9p-limitation Issues related to our go9p implementation label Sep 20, 2019
@medyagh
Member

medyagh commented Dec 16, 2019

This issue still seems to exist.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 15, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 15, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
