This repository has been archived by the owner on May 12, 2021. It is now read-only.

vsock socket sporadically leaked #618

Closed
maximilianriemensberger opened this issue Aug 21, 2018 · 22 comments

@maximilianriemensberger

Description of problem

vsock sockets are leaked. I'm not entirely sure whether this is a Kata issue or a kernel issue; probably the kernel, but you are more familiar with the innards of vsock and KVM than I am. My config file has use_vsock=true. Other than that it's all standard.

➜ hix ~ % for ((i=1; i<=100; i++)); do echo "# Run $(printf "%3d\n" $i)"; docker run --runtime kata-vsock -it --rm ubuntu bash -c 'true'; sleep 2; lsmod | grep ^vhost_vsock; ss -ip --vsock; done
# Run   1
vhost_vsock            20480  0
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
# Run   2
vhost_vsock            20480  1
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
# Run   3
vhost_vsock            20480  1
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
# Run   4
vhost_vsock            20480  1
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
# Run   5
vhost_vsock            20480  2
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:55515                              695080683:1024              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
# Run   6
vhost_vsock            20480  2
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:55515                              695080683:1024              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
# Run   7
vhost_vsock            20480  2
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:55515                              695080683:1024              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
# Run   8
vhost_vsock            20480  3
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:55515                              695080683:1024              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
v_str             ESTAB              0                   0                                             2:75208                             4249917547:1024              
# Run   9
vhost_vsock            20480  3
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:55515                              695080683:1024              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
v_str             ESTAB              0                   0                                             2:75208                             4249917547:1024              
# Run  10
vhost_vsock            20480  3
Netid             State              Recv-Q              Send-Q                            Local Address:Port                            Peer Address:Port              
v_str             ESTAB              0                   0                                             2:55515                              695080683:1024              
v_str             ESTAB              0                   0                                             2:20951                               11280363:1024              
v_str             ESTAB              0                   0                                             2:75208                             4249917547:1024

...

Meta details

Running kata-collect-data.sh version 1.2.0 (commit 0bcb32f) at 2018-08-21.15:00:29.959157415+0200.

(With some fixups to choose the correct config file)


Runtime is /usr/bin/kata-runtime.

kata-env

Output of "/usr/bin/kata-runtime --kata-config /etc/kata-containers/configuration-vsock.toml kata-env":

[Meta]
  Version = "1.0.13"

[Runtime]
  Debug = false
  [Runtime.Version]
    Semver = "1.2.0"
    Commit = "0bcb32f"
    OCI = "1.0.1"
  [Runtime.Config]
    Path = "/etc/kata-containers/configuration-vsock.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "QEMU emulator version 2.11.0\nCopyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers"
  Path = "/usr/bin/qemu-lite-system-x86_64"
  BlockDeviceDriver = "virtio-scsi"
  Msize9p = 8192
  Debug = false
  UseVSock = true

[Image]
  Path = "/usr/share/kata-containers/kata-containers-image_clearlinux_1.2.0_agent_fcfa054a757.img"

[Kernel]
  Path = "/usr/share/kata-containers/vmlinuz-4.14.51.7-134.container"
  Parameters = ""

[Initrd]
  Path = ""

[Proxy]
  Type = "noProxy"
  Version = ""
  Path = ""
  Debug = false

[Shim]
  Type = "kataShim"
  Version = "kata-shim version 1.2.0-0a37760"
  Path = "/usr/libexec/kata-containers/kata-shim"
  Debug = false

[Agent]
  Type = "kata"

[Host]
  Kernel = "4.15.0-1015-oem"
  Architecture = "amd64"
  VMContainerCapable = true
  [Host.Distro]
    Name = "Ubuntu"
    Version = "18.04"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz"

Runtime config files

Runtime config file contents

Output of "cat "/etc/kata-containers/configuration-vsock.toml"":

# Copyright (c) 2017-2018 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#

# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "cli/config/configuration.toml.in"
# XXX: Project:
# XXX:   Name: Kata Containers
# XXX:   Type: kata

[hypervisor.qemu]
path = "/usr/bin/qemu-lite-system-x86_64"
kernel = "/usr/share/kata-containers/vmlinuz.container"
image = "/usr/share/kata-containers/kata-containers.img"
machine_type = "pc"

# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = ""

# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
firmware = ""

# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators=""

# Default number of vCPUs per SB/VM:
# unspecified or 0                --> will be set to 1
# < 0                             --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores
default_vcpus = 1

# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0             --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending of the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# the hotplug functionality. For example, `default_maxvcpus = 240` specifies that until 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example, with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what are you doing.
default_maxvcpus = 0

# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
#   This limitation could be a bug in qemu or in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0   --> will be set to 1
# > 1 <= 5           --> will be set to the specified number
# > 5                --> will be set to 5
default_bridges = 1

# Default memory size in MiB for SB/VM.
# If unspecified then it will be set 2048 MiB.
#default_memory = 2048

# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's 
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons. 
# This flag prevents the block device from being passed to the hypervisor, 
# 9pfs is used instead to pass the rootfs.
disable_block_device_use = false

# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is either virtio-scsi or 
# virtio-blk.
block_device_driver = "virtio-scsi"

# Enable iothreads (data-plane) to be used. This causes IO to be
# handled in a separate IO thread. This is currently only implemented
# for SCSI.
#
enable_iothreads = false

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically 
# result in memory pre allocation
#enable_hugepages = true

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. This extra output is added
# to the proxy logs, but only when proxy debug is also enabled.
# 
# Default false
#enable_debug = true

# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
# 
#disable_nesting_checks = true

# This is the msize used for 9p shares. It is the number of bytes 
# used for 9p packet payload.
#msize_9p = 8192

# If true and vsocks are supported, use vsocks to communicate directly
# with the agent and no proxy is started, otherwise use unix
# sockets and start a proxy to communicate with the agent.
# Default false
use_vsock = true

[factory]
# VM templating support. Once enabled, new VMs are created from template
# using vm cloning. They will share the same initial kernel, initramfs and
# agent memory by mapping it readonly. It helps speeding up new container
# creation and saves a lot of memory if there are many kata containers running
# on the same host.
#
# When disabled, new VMs are created from scratch.
#
# Default false
#enable_template = true

[proxy.kata]
path = "/usr/libexec/kata-containers/kata-proxy"

# If enabled, proxy messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[shim.kata]
path = "/usr/libexec/kata-containers/kata-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[agent.kata]
# There is no field for this section. The goal is only to be able to
# specify which type of agent the user wants to use.

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
#   - bridged
#     Uses a linux bridge to interconnect the container interface to
#     the VM. Works for most cases except macvlan and ipvlan.
#
#   - macvtap
#     Used when the Container network interface can be bridged using
#     macvtap.
internetworking_model="macvtap"

Image details

---
osbuilder:
  url: "https://github.com/kata-containers/osbuilder"
  version: "unknown"
rootfs-creation-time: "2018-08-13T22:51:39.765008919+0000Z"
description: "osbuilder rootfs"
file-format-version: "0.0.2"
architecture: "x86_64"
base-distro:
  name: "Clear"
  version: "24400"
  packages:
    default:
      - "iptables-bin"
      - "libudev0-shim"
      - "systemd"
    extra:

agent:
  url: "https://github.com/kata-containers/agent"
  name: "kata-agent"
  version: "1.2.0-fcfa054a757e7c17afba47b0b4d7e91cbb8688ed"
  agent-is-init-daemon: "no"

Initrd details

No initrd


Logfiles

Runtime logs

Recent runtime problems found in system journal:

time="2018-08-21T14:57:34.527001564+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=72f19896e0dbd4418218bd5a4ca71b1eb6c3b40a21b4deb89e0b5f276a2f6a88 name=kata-runtime pid=23845 sandbox=72f19896e0dbd4418218bd5a4ca71b1eb6c3b40a21b4deb89e0b5f276a2f6a88 source=runtime
time="2018-08-21T14:57:34.573026898+02:00" level=error msg="Container 72f19896e0dbd4418218bd5a4ca71b1eb6c3b40a21b4deb89e0b5f276a2f6a88 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=72f19896e0dbd4418218bd5a4ca71b1eb6c3b40a21b4deb89e0b5f276a2f6a88 name=kata-runtime pid=23882 sandbox=72f19896e0dbd4418218bd5a4ca71b1eb6c3b40a21b4deb89e0b5f276a2f6a88 source=runtime
time="2018-08-21T14:57:37.114783397+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109 error="open /run/vc/sbs/10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109/devices.json: no such file or directory" name=kata-runtime pid=23999 sandbox=10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109 sandboxid=10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:57:39.645379721+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109 name=kata-runtime pid=24113 sandbox=10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109 source=runtime
time="2018-08-21T14:57:39.691859992+02:00" level=error msg="Container 10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109 name=kata-runtime pid=24152 sandbox=10eff59c292cec4b4fcc50abacc2d07e9ab95f9c3348cb92f7362cf0dbc8f109 source=runtime
time="2018-08-21T14:57:42.32276399+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c error="open /run/vc/sbs/b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c/devices.json: no such file or directory" name=kata-runtime pid=24241 sandbox=b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c sandboxid=b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c source=virtcontainers subsystem=sandbox
time="2018-08-21T14:57:44.829837857+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c name=kata-runtime pid=24356 sandbox=b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c source=runtime
time="2018-08-21T14:57:44.863787296+02:00" level=error msg="Container b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c name=kata-runtime pid=24394 sandbox=b28e0fd870f98a0f37d5134ac06d3b11fadd0d32acdbf3057f025731e1a8c82c source=runtime
time="2018-08-21T14:57:47.458853375+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc error="open /run/vc/sbs/34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc/devices.json: no such file or directory" name=kata-runtime pid=24481 sandbox=34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc sandboxid=34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc source=virtcontainers subsystem=sandbox
time="2018-08-21T14:57:49.985351842+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc name=kata-runtime pid=24595 sandbox=34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc source=runtime
time="2018-08-21T14:57:50.022208526+02:00" level=error msg="Container 34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc name=kata-runtime pid=24632 sandbox=34b90e9f01bc952a6649e8b5715ad3c64f23184c7d277b01c607e91164b971fc source=runtime
time="2018-08-21T14:57:52.726713186+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4 error="open /run/vc/sbs/7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4/devices.json: no such file or directory" name=kata-runtime pid=24724 sandbox=7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4 sandboxid=7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:57:54.193797709+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4 name=kata-runtime pid=24839 sandbox=7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4 source=runtime
time="2018-08-21T14:57:54.227141887+02:00" level=error msg="Container 7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4 name=kata-runtime pid=24878 sandbox=7f2b7c0ba747701614b4c1f3e908d0c5df6c7621c606af74f744e47ff75e3ff4 source=runtime
time="2018-08-21T14:57:56.950613096+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2 error="open /run/vc/sbs/0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2/devices.json: no such file or directory" name=kata-runtime pid=24961 sandbox=0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2 sandboxid=0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:57:59.557406468+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2 name=kata-runtime pid=25072 sandbox=0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2 source=runtime
time="2018-08-21T14:57:59.591942265+02:00" level=error msg="Container 0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2 name=kata-runtime pid=25111 sandbox=0ead2bffa8d3f9f45d3dddd6657c78af0e459e4aa886d5091aa0d0a259ac49d2 source=runtime
time="2018-08-21T14:58:02.278854588+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268 error="open /run/vc/sbs/9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268/devices.json: no such file or directory" name=kata-runtime pid=25196 sandbox=9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268 sandboxid=9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:03.685334539+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268 name=kata-runtime pid=25308 sandbox=9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268 source=runtime
time="2018-08-21T14:58:03.731509401+02:00" level=error msg="Container 9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268 name=kata-runtime pid=25346 sandbox=9817abb3464c6f1c4d5463ac60b74ce06b1a336e6fdd127930abbdd7f9cee268 source=runtime
time="2018-08-21T14:58:06.314574169+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c error="open /run/vc/sbs/5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c/devices.json: no such file or directory" name=kata-runtime pid=25428 sandbox=5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c sandboxid=5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:08.118476121+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c name=kata-runtime pid=25545 sandbox=5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c source=runtime
time="2018-08-21T14:58:08.168051579+02:00" level=error msg="Container 5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c name=kata-runtime pid=25582 sandbox=5d7f6f603ff2360460efedce992915acd110a92955a62a7ccfe5836e2b7b2e9c source=runtime
time="2018-08-21T14:58:10.858573951+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d error="open /run/vc/sbs/8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d/devices.json: no such file or directory" name=kata-runtime pid=25670 sandbox=8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d sandboxid=8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:12.205529959+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d name=kata-runtime pid=25778 sandbox=8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d source=runtime
time="2018-08-21T14:58:12.248012706+02:00" level=error msg="Container 8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d name=kata-runtime pid=25815 sandbox=8fb4d8b693c9c3991d33d6cd10cd207c536075eb9ee2b59c4f674380a4fbb16d source=runtime
time="2018-08-21T14:58:14.92676998+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97 error="open /run/vc/sbs/29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97/devices.json: no such file or directory" name=kata-runtime pid=25899 sandbox=29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97 sandboxid=29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:16.304854395+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97 name=kata-runtime pid=26018 sandbox=29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97 source=runtime
time="2018-08-21T14:58:16.334682985+02:00" level=error msg="Container 29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97 name=kata-runtime pid=26055 sandbox=29d9edee38937ed00135188d61fd3b210bbbe2e38b985d33974ed08fca3d4f97 source=runtime
time="2018-08-21T14:58:18.927101017+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf error="open /run/vc/sbs/f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf/devices.json: no such file or directory" name=kata-runtime pid=26138 sandbox=f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf sandboxid=f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:21.485139138+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf name=kata-runtime pid=26250 sandbox=f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf source=runtime
time="2018-08-21T14:58:21.519288907+02:00" level=error msg="Container f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf name=kata-runtime pid=26287 sandbox=f4578eab0ec43b1562ab82e97e49d7fcdf8b8de9810f726dcb1fa2a65616fdbf source=runtime
time="2018-08-21T14:58:24.126531536+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30 error="open /run/vc/sbs/e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30/devices.json: no such file or directory" name=kata-runtime pid=26370 sandbox=e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30 sandboxid=e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:26.598784837+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30 name=kata-runtime pid=26484 sandbox=e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30 source=runtime
time="2018-08-21T14:58:26.645697374+02:00" level=error msg="Container e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30 name=kata-runtime pid=26522 sandbox=e9ae7c4acd26400ce36d98384cd56351c116ce6703deb953164f2bff21c71e30 source=runtime
time="2018-08-21T14:58:29.242554988+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2 error="open /run/vc/sbs/4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2/devices.json: no such file or directory" name=kata-runtime pid=26608 sandbox=4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2 sandboxid=4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:30.441052792+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2 name=kata-runtime pid=26721 sandbox=4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2 source=runtime
time="2018-08-21T14:58:30.474084251+02:00" level=error msg="Container 4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2 name=kata-runtime pid=26759 sandbox=4da28275c598a4d1b28903535ec669b9e1cab781a4dc3ac3e55acfe734b096c2 source=runtime
time="2018-08-21T14:58:33.098499679+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9 error="open /run/vc/sbs/3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9/devices.json: no such file or directory" name=kata-runtime pid=26843 sandbox=3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9 sandboxid=3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:35.625510954+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9 name=kata-runtime pid=26954 sandbox=3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9 source=runtime
time="2018-08-21T14:58:35.657393026+02:00" level=error msg="Container 3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9 name=kata-runtime pid=26992 sandbox=3e0cc4039dcc1bc39f5520ad71525c20563b7ebbead58fee9bb59f23231398c9 source=runtime
time="2018-08-21T14:58:38.346480413+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9 error="open /run/vc/sbs/9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9/devices.json: no such file or directory" name=kata-runtime pid=27077 sandbox=9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9 sandboxid=9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:39.668224472+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9 name=kata-runtime pid=27192 sandbox=9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9 source=runtime
time="2018-08-21T14:58:39.694755685+02:00" level=error msg="Container 9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9 name=kata-runtime pid=27232 sandbox=9ea446f9b1b4947d1e924bdce6a5b377c3c90d062d1d36027a29e3ed4aecacc9 source=runtime
time="2018-08-21T14:58:42.430860882+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb error="open /run/vc/sbs/783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb/devices.json: no such file or directory" name=kata-runtime pid=27314 sandbox=783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb sandboxid=783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:44.959438652+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb name=kata-runtime pid=27425 sandbox=783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb source=runtime
time="2018-08-21T14:58:45.015123722+02:00" level=error msg="Container 783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb name=kata-runtime pid=27462 sandbox=783a53f432c86b1c84d79c7c56aac514ddddcba07f058367b5fdc45daf10d7cb source=runtime
time="2018-08-21T14:58:47.658561177+02:00" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4 error="open /run/vc/sbs/f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4/devices.json: no such file or directory" name=kata-runtime pid=27551 sandbox=f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4 sandboxid=f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4 source=virtcontainers subsystem=sandbox
time="2018-08-21T14:58:48.916712516+02:00" level=error msg="Container not ready, running or paused, impossible to signal the container" arch=amd64 command=kill container=f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4 name=kata-runtime pid=27665 sandbox=f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4 source=runtime
time="2018-08-21T14:58:48.944151551+02:00" level=error msg="Container f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4 not ready, running or paused, cannot send a signal" arch=amd64 command=kill container=f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4 name=kata-runtime pid=27702 sandbox=f9cee1d6415682121772d03cac25f35c78459622eb15feffcf9a2383f4164da4 source=runtime

Proxy logs

Recent proxy problems found in system journal:

time="2018-08-13T22:47:03.237277869+02:00" level=fatal msg="accept unix /run/vc/sbs/3df479a1921c0073164a77c41e8dc2024f3c4754d33485f5ea17640aba1fca7e/proxy.sock: use of closed network connection" name=kata-proxy pid=23295 source=proxy
time="2018-08-14T07:24:28.304574442+02:00" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/cebd50d374f4cb555b271fc33945ff035a7a9f246da0f1e1682e9272c64fdba4/kata.sock: use of closed network connection" name=kata-proxy pid=29376 sandbox=cebd50d374f4cb555b271fc33945ff035a7a9f246da0f1e1682e9272c64fdba4 source=proxy
time="2018-08-14T07:33:42.149945593+02:00" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/9cb678bf8aba08a75fffd4ccb0e653818ee318dd2d8f0768b198b56e182c8d2f/kata.sock: use of closed network connection" name=kata-proxy pid=29769 sandbox=9cb678bf8aba08a75fffd4ccb0e653818ee318dd2d8f0768b198b56e182c8d2f source=proxy
time="2018-08-19T22:07:36.100825623+02:00" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/74f08490f381dacf7368edcdb1cc89d4ca56f6e85eefcdf701290ed6486b2767/kata.sock: use of closed network connection" name=kata-proxy pid=6890 sandbox=74f08490f381dacf7368edcdb1cc89d4ca56f6e85eefcdf701290ed6486b2767 source=proxy
time="2018-08-19T22:50:18.125315689+02:00" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/006dcddfefca4a93aca56ae27a51ec1ba2f4418d7be40bced39b6bc022af4630/kata.sock: use of closed network connection" name=kata-proxy pid=9623 sandbox=006dcddfefca4a93aca56ae27a51ec1ba2f4418d7be40bced39b6bc022af4630 source=proxy
time="2018-08-20T10:35:51.841530562+02:00" level=fatal msg="channel error" error="accept unix /run/vc/sbs/0a3b513827e5a6d39f6af75b275d291f24e2dff6874dbe22b1d3633bafec442d/proxy.sock: use of closed network connection" name=kata-proxy pid=18670 sandbox=0a3b513827e5a6d39f6af75b275d291f24e2dff6874dbe22b1d3633bafec442d source=proxy
time="2018-08-20T10:55:43.374802023+02:00" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/675188c93bad9011c5c88df5869f3e41361a15c9f06f0384b660275bcae87813/kata.sock: use of closed network connection" name=kata-proxy pid=20447 sandbox=675188c93bad9011c5c88df5869f3e41361a15c9f06f0384b660275bcae87813 source=proxy
time="2018-08-21T08:26:23.775046843+02:00" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/32dcb43c7b26169928269a705067f6ca909200c3dbeef668f236c9174eb3399a/kata.sock: use of closed network connection" name=kata-proxy pid=28729 sandbox=32dcb43c7b26169928269a705067f6ca909200c3dbeef668f236c9174eb3399a source=proxy

Shim logs

Recent shim problems found in system journal:

time="2018-08-21T12:55:37.892901207+02:00" level=error msg="set window size failed" container=d440e0e7a62f87529a231c4785fa58528b9d172d7ecb7aa46d49514293d499fc error="rpc error: code = NotFound desc = Process d440e0e7a62f87529a231c4785fa58528b9d172d7ecb7aa46d49514293d499fc not found (container d440e0e7a62f87529a231c4785fa58528b9d172d7ecb7aa46d49514293d499fc)" exec-id=d440e0e7a62f87529a231c4785fa58528b9d172d7ecb7aa46d49514293d499fc name=kata-shim pid=1 source=shim window-height=43 window-width=168
time="2018-08-21T13:19:19.731210261+02:00" level=error msg="set window size failed" container=02ab5e57ba800daa46ebf01ee7dab38ad4da179939198e7d81f26d598568de83 error="rpc error: code = NotFound desc = Process 02ab5e57ba800daa46ebf01ee7dab38ad4da179939198e7d81f26d598568de83 not found (container 02ab5e57ba800daa46ebf01ee7dab38ad4da179939198e7d81f26d598568de83)" exec-id=02ab5e57ba800daa46ebf01ee7dab38ad4da179939198e7d81f26d598568de83 name=kata-shim pid=1 source=shim window-height=43 window-width=168
time="2018-08-21T14:51:45.133503084+02:00" level=error msg="set window size failed" container=e0d60536eef22adcaf87ec26bd5b3756549db7f59fd0c217e1a1b7d2c7cdd5e4 error="rpc error: code = NotFound desc = Process e0d60536eef22adcaf87ec26bd5b3756549db7f59fd0c217e1a1b7d2c7cdd5e4 not found (container e0d60536eef22adcaf87ec26bd5b3756549db7f59fd0c217e1a1b7d2c7cdd5e4)" exec-id=e0d60536eef22adcaf87ec26bd5b3756549db7f59fd0c217e1a1b7d2c7cdd5e4 name=kata-shim pid=1 source=shim window-height=43 window-width=168
time="2018-08-21T14:53:59.196127979+02:00" level=error msg="set window size failed" container=2da70571ff4418296308d792d37f315697cc4b03afdc22857e5e0f459338c2c5 error="rpc error: code = NotFound desc = Process 2da70571ff4418296308d792d37f315697cc4b03afdc22857e5e0f459338c2c5 not found (container 2da70571ff4418296308d792d37f315697cc4b03afdc22857e5e0f459338c2c5)" exec-id=2da70571ff4418296308d792d37f315697cc4b03afdc22857e5e0f459338c2c5 name=kata-shim pid=1 source=shim window-height=43 window-width=168
time="2018-08-21T14:55:35.426213175+02:00" level=error msg="set window size failed" container=bd8f97b41ed4ae0150afc00711235c89520933ca9fd0eb0bb1bf1c031fb136b4 error="rpc error: code = NotFound desc = Process bd8f97b41ed4ae0150afc00711235c89520933ca9fd0eb0bb1bf1c031fb136b4 not found (container bd8f97b41ed4ae0150afc00711235c89520933ca9fd0eb0bb1bf1c031fb136b4)" exec-id=bd8f97b41ed4ae0150afc00711235c89520933ca9fd0eb0bb1bf1c031fb136b4 name=kata-shim pid=1 source=shim window-height=43 window-width=168
time="2018-08-21T14:56:15.452887111+02:00" level=error msg="set window size failed" container=366e5b1829a7b203ba5bc5881150dd1a06a823ce6c0a0c637804ff2ccd2a61c5 error="rpc error: code = NotFound desc = Process 366e5b1829a7b203ba5bc5881150dd1a06a823ce6c0a0c637804ff2ccd2a61c5 not found (container 366e5b1829a7b203ba5bc5881150dd1a06a823ce6c0a0c637804ff2ccd2a61c5)" exec-id=366e5b1829a7b203ba5bc5881150dd1a06a823ce6c0a0c637804ff2ccd2a61c5 name=kata-shim pid=1 source=shim window-height=43 window-width=168

Container manager details

Have docker

Docker

Output of "docker version":

Client:
 Version:           18.06.0-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        0ffa825
 Built:             Wed Jul 18 19:09:54 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.0-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       0ffa825
  Built:            Wed Jul 18 19:07:56 2018
  OS/Arch:          linux/amd64
  Experimental:     true

Output of "docker info":

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 5
Server Version: 18.06.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: kata kata-fuse kata-vsock runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d64c661f1d51c48782c9cec8fda7604785f93587
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-1015-oem
Operating System: Ubuntu 18.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.39GiB
Name: hix
ID: 5KVP:NQVW:INXQ:72AD:R5B2:NV3Q:HYAK:SBCD:3BIK:HWWE:PKEM:STEI
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

Output of "systemctl show docker":

Type=notify
Restart=on-failure
NotifyAccess=main
RestartUSec=100ms
TimeoutStartUSec=infinity
TimeoutStopUSec=1min 30s
RuntimeMaxUSec=infinity
WatchdogUSec=0
WatchdogTimestamp=Tue 2018-08-21 14:47:52 CEST
WatchdogTimestampMonotonic=6730401
PermissionsStartOnly=no
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=1061
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=success
UID=[not set]
GID=[not set]
NRestarts=0
ExecMainStartTimestamp=Tue 2018-08-21 14:47:50 CEST
ExecMainStartTimestampMonotonic=5074001
ExecMainExitTimestampMonotonic=0
ExecMainPID=1061
ExecMainCode=0
ExecMainStatus=0
ExecStart={ path=/usr/bin/dockerd ; argv[]=/usr/bin/dockerd -H fd:// ; ignore_errors=no ; start_time=[Tue 2018-08-21 14:47:50 CEST] ; stop_time=[n/a] ; pid=1061 ; code=(null) ; status=0/0 }
ExecReload={ path=/bin/kill ; argv[]=/bin/kill -s HUP $MAINPID ; ignore_errors=no ; start_time=[n/a] ; stop_time=[n/a] ; pid=0 ; code=(null) ; status=0/0 }
Slice=system.slice
ControlGroup=/system.slice/docker.service
MemoryCurrent=[not set]
CPUUsageNSec=[not set]
TasksCurrent=57
IPIngressBytes=18446744073709551615
IPIngressPackets=18446744073709551615
IPEgressBytes=18446744073709551615
IPEgressPackets=18446744073709551615
Delegate=yes
DelegateControllers=cpu cpuacct io blkio memory devices pids
CPUAccounting=no
CPUWeight=[not set]
StartupCPUWeight=[not set]
CPUShares=[not set]
StartupCPUShares=[not set]
CPUQuotaPerSecUSec=infinity
IOAccounting=no
IOWeight=[not set]
StartupIOWeight=[not set]
BlockIOAccounting=no
BlockIOWeight=[not set]
StartupBlockIOWeight=[not set]
MemoryAccounting=no
MemoryLow=0
MemoryHigh=infinity
MemoryMax=infinity
MemorySwapMax=infinity
MemoryLimit=infinity
DevicePolicy=auto
TasksAccounting=yes
TasksMax=infinity
IPAccounting=no
UMask=0022
LimitCPU=infinity
LimitCPUSoft=infinity
LimitFSIZE=infinity
LimitFSIZESoft=infinity
LimitDATA=infinity
LimitDATASoft=infinity
LimitSTACK=infinity
LimitSTACKSoft=8388608
LimitCORE=infinity
LimitCORESoft=infinity
LimitRSS=infinity
LimitRSSSoft=infinity
LimitNOFILE=1048576
LimitNOFILESoft=1048576
LimitAS=infinity
LimitASSoft=infinity
LimitNPROC=infinity
LimitNPROCSoft=infinity
LimitMEMLOCK=16777216
LimitMEMLOCKSoft=16777216
LimitLOCKS=infinity
LimitLOCKSSoft=infinity
LimitSIGPENDING=62610
LimitSIGPENDINGSoft=62610
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=infinity
LimitRTTIMESoft=infinity
OOMScoreAdjust=0
Nice=0
IOSchedulingClass=0
IOSchedulingPriority=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardInputData=
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
LogLevelMax=-1
SecureBits=0
CapabilityBoundingSet=cap_chown cap_dac_override cap_dac_read_search cap_fowner cap_fsetid cap_kill cap_setgid cap_setuid cap_setpcap cap_linux_immutable cap_net_bind_service cap_net_broadcast cap_net_admin cap_net_raw cap_ipc_lock cap_ipc_owner cap_sys_module cap_sys_rawio cap_sys_chroot cap_sys_ptrace cap_sys_pacct cap_sys_admin cap_sys_boot cap_sys_nice cap_sys_resource cap_sys_time cap_sys_tty_config cap_mknod cap_lease cap_audit_write cap_audit_control cap_setfcap cap_mac_override cap_mac_admin cap_syslog cap_wake_alarm cap_block_suspend
AmbientCapabilities=
DynamicUser=no
RemoveIPC=no
MountFlags=
PrivateTmp=no
PrivateDevices=no
ProtectKernelTunables=no
ProtectKernelModules=no
ProtectControlGroups=no
PrivateNetwork=no
PrivateUsers=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=no
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=0
LockPersonality=no
RuntimeDirectoryPreserve=no
RuntimeDirectoryMode=0755
StateDirectoryMode=0755
CacheDirectoryMode=0755
LogsDirectoryMode=0755
ConfigurationDirectoryMode=0755
MemoryDenyWriteExecute=no
RestrictRealtime=no
RestrictNamespaces=no
MountAPIVFS=no
KeyringMode=private
KillMode=process
KillSignal=15
SendSIGKILL=yes
SendSIGHUP=no
Id=docker.service
Names=docker.service
Requires=docker.socket system.slice sysinit.target
Wants=network-online.target
WantedBy=multi-user.target
ConsistsOf=docker.socket
Conflicts=shutdown.target
Before=shutdown.target multi-user.target
After=basic.target sysinit.target firewalld.service system.slice network-online.target docker.socket systemd-journald.socket
TriggeredBy=docker.socket
Documentation=https://docs.docker.com
Description=Docker Application Container Engine
LoadState=loaded
ActiveState=active
SubState=running
FragmentPath=/lib/systemd/system/docker.service
UnitFileState=enabled
UnitFilePreset=enabled
StateChangeTimestamp=Tue 2018-08-21 14:47:52 CEST
StateChangeTimestampMonotonic=6730402
InactiveExitTimestamp=Tue 2018-08-21 14:47:50 CEST
InactiveExitTimestampMonotonic=5074024
ActiveEnterTimestamp=Tue 2018-08-21 14:47:52 CEST
ActiveEnterTimestampMonotonic=6730402
ActiveExitTimestampMonotonic=0
InactiveEnterTimestampMonotonic=0
CanStart=yes
CanStop=yes
CanReload=yes
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnFailureJobMode=replace
IgnoreOnIsolate=no
NeedDaemonReload=no
JobTimeoutUSec=infinity
JobRunningTimeoutUSec=infinity
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Tue 2018-08-21 14:47:50 CEST
ConditionTimestampMonotonic=5073514
AssertTimestamp=Tue 2018-08-21 14:47:50 CEST
AssertTimestampMonotonic=5073514
Transient=no
Perpetual=no
StartLimitIntervalUSec=1min
StartLimitBurst=3
StartLimitAction=none
FailureAction=none
SuccessAction=none
InvocationID=1ba13ee3a8424c1f9ee5b399db62a637
CollectMode=inactive

No kubectl


Packages

Have dpkg
Output of "dpkg -l|egrep "(cc-oci-runtimecc-runtimerunv|kata-proxy|kata-runtime|kata-shim|kata-containers-image|linux-container|qemu-)"":

ii  ipxe-qemu-256k-compat-efi-roms             1.0.0+git-20150424.a25a16d-0ubuntu2  all          PXE boot firmware - Compat EFI ROM images for qemu
ii  kata-containers-image                      1.2.0-32                             amd64        Kata containers image
ii  kata-linux-container                       4.14.51.7-134                        amd64        linux kernel optimised for container-like workloads.
ii  kata-proxy                                 1.2.0+git.1796218-32                 amd64        
ii  kata-runtime                               1.2.0+git.0bcb32f-45                 amd64        
ii  kata-shim                                  1.2.0+git.0a37760-33                 amd64        
ii  qemu-block-extra:amd64                     1:2.11+dfsg-1ubuntu7.4               amd64        extra block backend modules for qemu-system and qemu-utils
ii  qemu-lite                                  2.11.0+git.a39e0b3e82-47             amd64        linux kernel optimised for container-like workloads.
ii  qemu-slof                                  20170724+dfsg-1ubuntu1               all          Slimline Open Firmware -- QEMU PowerPC version
ii  qemu-system                                1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries
ii  qemu-system-arm                            1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries (arm)
ii  qemu-system-common                         1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries (common files)
ii  qemu-system-mips                           1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries (mips)
ii  qemu-system-misc                           1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries (miscellaneous)
ii  qemu-system-ppc                            1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries (ppc)
ii  qemu-system-s390x                          1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries (s390x)
ii  qemu-system-sparc                          1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries (sparc)
ii  qemu-system-x86                            1:2.11+dfsg-1ubuntu7.4               amd64        QEMU full system emulation binaries (x86)
ii  qemu-user                                  1:2.11+dfsg-1ubuntu7.4               amd64        QEMU user mode emulation binaries
ii  qemu-user-static                           1:2.11+dfsg-1ubuntu7.4               amd64        QEMU user mode emulation binaries (static version)
ii  qemu-utils                                 1:2.11+dfsg-1ubuntu7.4               amd64        QEMU utilities
ii  qemu-vanilla                               2.11.2+git.a39e0b3e82-44             amd64        linux kernel optimised for container-like workloads.

No rpm


@devimc self-assigned this Aug 21, 2018

devimc commented Aug 21, 2018

@maximilianriemensberger thanks for reporting this, I'll take a look


devimc commented Sep 17, 2018

After debugging this issue I found that the connection created to destroy the pod is leaked; removing the call to stopSandbox fixes this issue (unfortunately we need this call 😄). I replaced the gRPC request grpc.DestroySandboxRequest to check, but the connection is still leaked. I also tried to save all connections created when vsockDialer is called so they could be closed later (see the patch below), but that didn't work.

diff --git a/vendor/github.com/kata-containers/agent/protocols/client/client.go b/vendor/github.com/kata-containers/agent/protocols/client/client.go
index 6e386fe..2862b4c 100644
--- a/vendor/github.com/kata-containers/agent/protocols/client/client.go
+++ b/vendor/github.com/kata-containers/agent/protocols/client/client.go
@@ -8,10 +8,12 @@ package client
 
 import (
 	"context"
+	"fmt"
 	"net"
 	"net/url"
 	"strconv"
 	"strings"
+	"sync"
 	"time"
 
 	"github.com/grpc-ecosystem/grpc-opentracing/go/otgrpc"
@@ -31,6 +33,9 @@ const (
 )
 
 var defaultDialTimeout = 15 * time.Second
+var defaultCloseTimeout = 5 * time.Second
+var vsockConnectionsLock sync.Mutex
+var vsockConnections []net.Conn
 
 // AgentClient is an agent gRPC client connection wrapper for agentgrpc.AgentServiceClient
 type AgentClient struct {
@@ -45,7 +50,26 @@ type yamuxSessionStream struct {
 }
 
 func (y *yamuxSessionStream) Close() error {
-	return y.session.Close()
+	waitCh := y.session.CloseChan()
+	timeout := time.NewTimer(defaultCloseTimeout)
+
+	if err := y.Conn.Close(); err != nil {
+		return err
+	}
+
+	if err := y.session.Close(); err != nil {
+		return err
+	}
+
+	// block until session is really closed
+	select {
+	case <-waitCh:
+		timeout.Stop()
+	case <-timeout.C:
+		return fmt.Errorf("timeout waiting for session close")
+	}
+
+	return nil
 }
 
 type dialer func(string, time.Duration) (net.Conn, error)
@@ -196,6 +220,10 @@ func agentDialer(addr *url.URL, enableYamux bool) dialer {
 }
 
 func unixDialer(sock string, timeout time.Duration) (net.Conn, error) {
+	if strings.HasPrefix(sock, "unix:") {
+		sock = strings.Trim(sock, "unix:")
+	}
+
 	dialFunc := func() (net.Conn, error) {
 		return net.DialTimeout("unix", sock, timeout)
 	}
@@ -264,11 +292,16 @@ func commonDialer(timeout time.Duration, dialFunc func() (net.Conn, error), time
 		if !ok {
 			return nil, timeoutErrMsg
 		}
+	case <-t.C:
+		cancel <- true
+		return nil, timeoutErrMsg
 	}
 
 	return conn, nil
 }
 
+var panicIfDial = false
+
 func vsockDialer(sock string, timeout time.Duration) (net.Conn, error) {
 	cid, port, err := parseGrpcVsockAddr(sock)
 	if err != nil {
@@ -276,10 +309,33 @@ func vsockDialer(sock string, timeout time.Duration) (net.Conn, error) {
 	}
 
 	dialFunc := func() (net.Conn, error) {
-		return vsock.Dial(cid, port)
+		conn, err := vsock.Dial(cid, port)
+		if err != nil {
+			return nil, err
+		}
+
+		vsockConnectionsLock.Lock()
+		if panicIfDial {
+			panic("panicIfDial")
+		}
+		vsockConnections = append(vsockConnections, conn)
+		vsockConnectionsLock.Unlock()
+		return conn, nil
 	}
 
 	timeoutErr := grpcStatus.Errorf(codes.DeadlineExceeded, "timed out connecting to vsock %d:%d", cid, port)
 
 	return commonDialer(timeout, dialFunc, timeoutErr)
 }
+
+// CloseVSockConnections
+func CloseVSockConnections() error {
+	vsockConnectionsLock.Lock()
+	for _, c := range vsockConnections {
+		c.Close()
+	}
+	vsockConnections = []net.Conn{}
+	panicIfDial = true
+	vsockConnectionsLock.Unlock()
+	return nil
+}

if someone else has any idea about what's going on, please take a look

@devimc

devimc commented Sep 18, 2018

cc @kata-containers/runtime @bergwolf @sboeuf @jcvenegas

@bergwolf
Member

Could you check if there are any dangling kata-shim or kata-runtime processes? Otherwise I would expect the host kernel to close any remaining connections when it cleans up the exited processes.
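
For reference, a quick way to run that check (a sketch; the process names are assumed from a standard kata-containers install):

# Look for leftover kata components (and per-sandbox qemu) after the runs finish.
ps -eo pid,ppid,args | grep -E 'kata-(runtime|shim|proxy)|qemu-system' | grep -v grep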

@devimc

devimc commented Sep 19, 2018

@bergwolf there are no processes running. I tried to close all connections before exit using the above patch, but that didn't work.

@bergwolf
Member

@devimc Does lsof turn up any suspects? It feels more like a kernel bug, since no running process is holding the connection fd.

@amshinde
Member

amshinde commented Oct 9, 2018

@devimc As per @bergwolf's suggestion, does running lsof show anything useful?

@grahamwhaley
Contributor

Well, the good news (I guess) is that I can reproduce this fairly easily using the above script example (thanks @maximilianriemensberger).
To answer @amshinde @bergwolf - afaict, we have no kata components left over at the end of the run (no shim/proxy/qemu/runtime).
I'm looking at a combination of ss, lsof and ps to see if I can spot any key differences, and to see if I can identify what still has the vsock active (if that is really what we are seeing).
Comparing the lsof and ps outputs is a touch tricky, as they are quite large and noisy. One thing I had considered, and which does seem to be true from a pre/post ps diff, is that we have some [kworker] threads around that we didn't have before the test. Now, iirc, kworkers can and do hang around in a sort of pre-emptive cache manner, so this may not be a real 'clue'... but...
I'm going to pull @stefanha in on this thread, as the author of vsock :-); I suspect he may have a much better holistic insight into this (and clues on what to look for or how to verify/disprove it).

I will keep digging for the rest of the day. One oddity to note is that I never (so far) see more than 3 or 4 vsocks being listed by ss after the test - even if I re-run the test. Maybe that correlates with the number of kthreads and the number of cpus on my test system (4). Just a thought/anomaly/note.
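
For reference, a rough way to do the pre/post ps comparison mentioned above (a sketch; assumes pid,user,args snapshots saved as ps_pre.txt and ps_post.txt, as in the script below):

# Drop the pid column so only new or changed command lines show up,
# then count kworker threads in each snapshot.
diff <(awk '{$1=""; print}' ps_pre.txt | sort) \
     <(awk '{$1=""; print}' ps_post.txt | sort) | grep '^>'
echo "kworkers before: $(grep -c '\[kworker' ps_pre.txt)"
echo "kworkers after:  $(grep -c '\[kworker' ps_post.txt)"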

@grahamwhaley
Contributor

I had another peek. No real new info, but I'll drop some details of what I see here.

The script I ran, based on the original above, is:

#!/bin/bash

set -x
set -e

sudo modprobe -i vhost_vsock

ps -eo pid,user,args > ps_pre.txt
lsmod > lsmod_pre.txt
sudo lsof > lsof_pre.txt

for ((i=1; i<=100; i++)); do
        echo "# Run $(printf "%3d\n" $i)"
        docker run --runtime kata-runtime -it --rm ubuntu bash -c 'true'
        sleep 2
        lsmod | grep ^vhost_vsock
        ss -ip --vsock
done

ps -eo pid,user,args > ps_post.txt
lsmod > lsmod_post.txt
sudo lsof > lsof_post.txt

Running that (on a Fedora machine with kernel 4.17.3-100.fc27.x86_64), I almost always see 1 to 3 vsocks reported by ss at the end of the run.
To do some sanity checks, I ran up a docker busybox in another window. Here is what I see:

$ ss -p --vsock
Netid   State    Recv-Q    Send-Q          Local Address:Port            Peer Address:Port
v_str   ESTAB    0         0                           2:985217            3282304527:1024
v_str   ESTAB    0         0                           2:1252936           4243928522:1024
v_str   ESTAB    0         0                           2:1252938           4243928522:1024

$ ps -ef | fgrep 42439
root     18776 18737  0 17:49 ?        00:00:03 /usr/bin/qemu-system-x86_64 -name sandbox-7bb82118f58c392fde41b848a388d27d9c9046089c559236ed3205f4d541dc15 -uuid baae5f93-fccd-4cab-ab3b-21fc440a6e2d -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host -qmp unix:/run/vc/vm/7bb82118f58c392fde41b848a388d27d9c9046089c559236ed3205f4d541dc15/qmp.sock,server,nowait -m 2048M,slots=10,maxmem=33081M -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= -device virtio-serial-pci,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/7bb82118f58c392fde41b848a388d27d9c9046089c559236ed3205f4d541dc15/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/usr/share/kata-containers/kata-containers-2018-10-26-10:33:49.086322915+0100-osbuilder-b9b9410-agent-e395ac6,size=134217728 -device virtio-scsi-pci,id=scsi0,romfile= -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng,rng=rng0,romfile= -device vhost-vsock-pci,vhostfd=3,id=vsock-4243928522,guest-cid=4243928522,romfile= -device virtio-9p-pci,fsdev=extra-9p-kataShared,mount_tag=kataShared,romfile= -fsdev local,id=extra-9p-kataShared,path=/run/kata-containers/shared/sandboxes/7bb82118f58c392fde41b848a388d27d9c9046089c559236ed3205f4d541dc15,security_model=none -netdev tap,id=network-0,vhost=on,vhostfds=4,fds=5 -device driver=virtio-net-pci,netdev=network-0,mac=02:42:ac:11:00:02,mq=on,vectors=4,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -kernel /usr/share/kata-containers/vmlinuz-4.14.67-16 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 debug systemd.show_status=true systemd.log_level=debug panic=1 nr_cpus=4 init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket agent.log=debug -smp 1,cores=1,threads=1,sockets=1,maxcpus=4
root     18789 18737  0 17:49 pts/2    00:00:00 /usr/libexec/kata-containers/kata-shim -agent vsock://4243928522:1024 -container 7bb82118f58c392fde41b848a388d27d9c9046089c559236ed3205f4d541dc15 -exec-id 7bb82118f58c392fde41b848a388d27d9c9046089c559236ed3205f4d541dc15 -terminal -log debug
gwhaley  19262  1051  0 19:27 pts/1    00:00:00 grep -F --color=auto 42439

$ ps -ef | fgrep 3282
gwhaley  19264  1051  0 19:28 pts/1    00:00:00 grep -F --color=auto 3282

You can see we have the shim and qemu open for the second pair of vsocks, but the first vsock seems to have no process associated with it; afaict, no kata-related processes are left running that own that vsock.

Trying to work this out a bit more, I had a dig around in /proc and /sys to see if I could find any way to track where that vsock was sitting. The nearest I got was finding that qemu has a handle open on /dev/vhost-vsock - so, let's have a peek to see if anything else has that open...

$ lsof /dev/vhost*
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
qemu-syst 18776 root    3u   CHR 10,241      0t0 2017 /dev/vhost-vsock
qemu-syst 18776 root    4u   CHR 10,238      0t0 2016 /dev/vhost-net

nope.
I'm getting stuck chasing this - I had a peek at some of the kernel vsock code to see where the close path might be, and see if there was anything obvious (pah!) that might be leaking.
I'm going to ping @stefanha (who may be able to comment next week) and @dagrh, in case they have some thoughts, around:

  • does this look like a real leak, or might this be some odd vsock cache/artifact thing
  • any known issues or recently fixed issues that may account for this?
  • any hints on how to debug where this is left dangling - what process might have it open, etc.? We can see the 'cid', but there's no hint of the owning pid, and I don't think the pid is tracked in the vsock struct or printed by ss (one brute-force approach is sketched below).
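
A brute-force sketch for that last point (assuming ss -e reports an ino: field for vsock sockets, which it should via vsock_diag on kernels this recent): for each vsock socket inode, scan /proc/*/fd for a matching socket link. A leaked socket should print no owner, which is rather the point:

# For each vsock socket inode reported by ss, see which process (if any)
# still holds a matching fd.
for ino in $(ss -e --vsock | grep -o 'ino:[0-9]*' | cut -d: -f2); do
    echo "== vsock socket inode $ino =="
    sudo find /proc/[0-9]*/fd -lname "socket:\[$ino\]" 2>/dev/null \
        | awk -F/ '{print "  held by pid " $3}'
done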

@stefanha

Hi @grahamwhaley, I have finally managed to reproduce this and will let you know what I find.

@grahamwhaley
Contributor

Excellent, thanks @stefanha, and welcome back :-)

@stefanha

stefanha commented Dec 4, 2018

I can confirm that sockets are leaking. Using packet capture (via the vsockmon module) I can see that it happens for sockets that aren't shut down by the guest before it terminates. This is a race condition.

I'll try to find a minimal reproducer. A vhost_vsock.ko fix is probably necessary.
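
For anyone who wants to reproduce the capture, the usual vsockmon recipe looks roughly like this (a sketch; vsockmon has been in mainline since 4.12, and it needs a tcpdump/libpcap recent enough to understand the vsock link type):

# Host-side capture of AF_VSOCK traffic with the vsockmon module.
sudo modprobe vsockmon
sudo ip link add type vsockmon
sudo ip link set vsockmon0 up
sudo tcpdump -i vsockmon0 -w vsock.pcap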

@grahamwhaley
Contributor

nice - well, not nice iyswim, but, thanks @stefanha for hunting that down!

@stefanha

stefanha commented Dec 7, 2018

The vhost_vsock.ko fix is available here:
https://www.spinics.net/lists/kvm/msg178577.html

@grahamwhaley
Contributor

Thanks @stefanha - yeah, interesting fix :-)
Question now is, do we carry this patch on our VM kernels until it lands upstream and into stable...
/cc @kata-containers/architecture-committee to discuss....
(I suspect the answer will be yes)

@grahamwhaley
Contributor

oh, and I guess I should really ask - err, is that a host-side fix, in-VM, or even both @stefanha?

@stefanha

stefanha commented Dec 7, 2018

@grahamwhaley It only affects the vhost_vsock.ko host-side kernel module. I suggest keeping an eye on the patch (I've CCed you) until Michael Tsirkin (vhost maintainer in Linux) has merged it into his tree. At that point it's a good bet to ship while waiting for it to land in stable.

@egernst
Member

egernst commented Feb 19, 2019

@stefanha - any updates on the patch? Can you share a pointer?

@stefanha

The fix went into Linux 4.20:

commit c38f57da428b033f2721b611d84b1f40bde674a8
Author: Stefan Hajnoczi [email protected]
Date: Thu Dec 6 19:14:34 2018 +0000

vhost/vsock: fix reset orphans race with close timeout

@gnawux
Member

gnawux commented Feb 20, 2019 via email

@stefanha

stefanha commented Feb 20, 2019 via email

@egernst
Member

egernst commented Feb 20, 2019

available since 4.19.12
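
A quick host-side sanity check along those lines (a sketch; distro kernels may carry the fix as a backport under an older version string, so treat this as a hint only):

# Check whether the running kernel is at least 4.19.12 (stable) / 4.20
# (mainline), where the vhost_vsock reset-orphans fix landed.
required="4.19.12"
current="$(uname -r | cut -d- -f1)"
if [ "$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)" = "$required" ]; then
    echo "kernel $current: fix should be present"
else
    echo "kernel $current: likely missing the vhost_vsock fix"
fi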

sboeuf pushed a commit to sboeuf/runtime-1 that referenced this issue Feb 20, 2019
We need to bump the kernel version from 4.14.67 to 4.19.23 in order
to follow the recent kernel config bump.

Fixes kata-containers#618
Fixes kata-containers#1029

Signed-off-by: Sebastien Boeuf <[email protected]>
sboeuf pushed a commit to sboeuf/runtime-1 that referenced this issue Feb 20, 2019
We need to bump the kernel version from 4.14.67 to 4.19.24 in order
to follow the recent kernel config bump.

Fixes kata-containers#618
Fixes kata-containers#1029

Signed-off-by: Sebastien Boeuf <[email protected]>
egernst pushed a commit to egernst/runtime that referenced this issue Feb 9, 2021