Crash boot loop because of "Transport endpoint is not connected", umount -l #10
Comments
I already unmount in a "preStop" hook. Does your deployment contain this step as well?
Yes, it looks like this:
BTW, I use a sub-folder because it means that you don't lose the transport connection in the pods when you re-mount (they see the sub-folder change, not the root that they are mapping to). I guess it should try something like
I am not sure if that syntax works. I can try to test it unless you have a better suggestion.
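For context, a rough sketch of what the sub-folder approach described above could look like from a consuming pod's side. The pod name, image, and the `HostToContainer` propagation mode are illustrative assumptions, not taken from this thread; only the host path matches the DaemonSet shown later.

```yaml
# Hypothetical consumer pod: it mounts the parent hostPath directory, so a
# re-mounted s3fs sub-folder can become visible again without restarting the pod.
apiVersion: v1
kind: Pod
metadata:
  name: s3-consumer-example
spec:
  containers:
    - name: app
      image: busybox
      command: ['sh', '-c', 'sleep 3600']
      volumeMounts:
        - name: s3-data
          mountPath: /data
          # new mounts made on the host (e.g. a re-mounted sub-folder) propagate in
          mountPropagation: HostToContainer
  volumes:
    - name: s3-data
      hostPath:
        path: /mnt/data-s3-fs   # parent of /mnt/data-s3-fs/root used by the DaemonSet
        type: DirectoryOrCreate
```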
Found someone getting this on Stack Overflow too:
Ok, I think I solved it.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: s3-provider
  name: s3-provider
spec:
  selector:
    matchLabels:
      app: s3-provider
  template:
    metadata:
      labels:
        app: s3-provider
    spec:
      initContainers:
        - name: init-myservice
          image: bash
          command: ['bash', '-c', 'umount -l /mnt/data-s3-fs/root ; true']
          securityContext:
            privileged: true
            capabilities:
              add:
                - SYS_ADMIN
          # use ALL entries in the config map as environment variables
          envFrom:
            - configMapRef:
                name: s3-config
          volumeMounts:
            - name: devfuse
              mountPath: /dev/fuse
            - name: mntdatas3fs-init
              mountPath: /mnt:shared
      containers:
        - name: s3fuse
          image: 963341077747.dkr.ecr.us-east-1.amazonaws.com/kube-s3:1.0
          imagePullPolicy: Always
          lifecycle:
            preStop:
              exec:
                command: ["bash", "-c", "umount -f /srv/s3-mount/root"]
          securityContext:
            privileged: true
            capabilities:
              add:
                - SYS_ADMIN
          # use ALL entries in the config map as environment variables
          envFrom:
            - configMapRef:
                name: s3-config
          env:
            - name: S3_BUCKET
              value: s3-mount
            - name: MNT_POINT
              value: /srv/s3-mount/root
            - name: IAM_ROLE
              value: none
          volumeMounts:
            - name: devfuse
              mountPath: /dev/fuse
            - name: mntdatas3fs
              mountPath: /srv/s3-mount/root:shared
      volumes:
        - name: devfuse
          hostPath:
            path: /dev/fuse
        - name: mntdatas3fs
          hostPath:
            type: DirectoryOrCreate
            path: /mnt/data-s3-fs/root
        - name: mntdatas3fs-init
          hostPath:
            type: DirectoryOrCreate
            path: /mnt
```
Usually this means that s3fs exited unexpectedly. I would check whether the process is still running. If not, it would help to gather the logs or attach gdb before the crash to get a backtrace.
I saw no logs, either in describe or in the container logs. It might be a gdb-traceable issue, but I found no way to reproduce this other than waiting for it to happen, so I am not sure. It's also hard to keep this kind of process under tracing right before it crashes, because I don't know how to trigger it.
I think most crashes are caused by resource contention, such as a lack of CPU or memory. Something like the below should be added to the YAML:
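A minimal sketch of the kind of resources block this comment seems to be suggesting for the s3fuse container; the request and limit values are placeholders, not values from this thread.

```yaml
# Hypothetical resource requests/limits for the s3fuse container; the numbers
# are illustrative and should be tuned to the actual workload.
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
```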
Hey,
Sometimes I get an issue that puts the pod into CrashLoopBackOff. The workaround is to SSH to the node and run umount -l (lazy), then delete the pod and let it get recreated. During that time the mount is down.
Debugging results:
Transport endpoint is not connected