This repository has been archived by the owner on Jun 28, 2024. It is now read-only.

ci: Add virtio-fs support #1537

Merged
jodh-intel merged 2 commits into kata-containers:master from topic/virtiofs-support on Jun 26, 2019

Conversation

chavafg
Contributor

@chavafg chavafg commented May 6, 2019

Add configuration option to use virtio-fs.
We will currently use nemu for testing the
virtio-fs support.

Depends-on: github.com/kata-containers/runtime#1639

Fixes: #1536.

Signed-off-by: Salvador Fuentes [email protected]

@ganeshmaharaj

Should we also add the command that enables hugepages on the system? sysctl -w vm.nr_hugepages=1024

Contributor

@grahamwhaley grahamwhaley left a comment

lgtm

@chavafg
Contributor Author

chavafg commented May 7, 2019

@ganeshmaharaj addressed your comment
/test-nemu

@chavafg
Contributor Author

chavafg commented May 8, 2019

/test-nemu

@chavafg
Contributor Author

chavafg commented May 8, 2019

[4] Running command '/usr/bin/docker [docker run --cidfile /tmp/cid844578869/H98I3diZOOuuXmiCkd0939o8fQniT4 --runtime kata-runtime --name H98I3diZOOuuXmiCkd0939o8fQniT4 -t busybox tty]'
[4] .
[4] command failed error 'exit status 125'
[4] [docker run --cidfile /tmp/cid844578869/H98I3diZOOuuXmiCkd0939o8fQniT4 --runtime kata-runtime --name H98I3diZOOuuXmiCkd0939o8fQniT4 -t busybox tty]
[4] Timeout: 120 seconds
[4] Exit Code: 125
[4] Stdout:
[4] Stderr: docker: Error response from daemon: OCI runtime create failed: qemu-system-x86_64_virt: -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/hugepages,share=on,prealloc=on: unable to map backing store for guest RAM: Cannot allocate memory: unknown.
[4]

Seems like we have memory issues when running with virtiofs, maybe because of hugepages?
I can reproduce locally.
If I modify the default_memory from 2048 to 512, I can run more tests, but at some point I get the same error again.
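
A rough back-of-the-envelope check of why lowering default_memory only delays the failure. This is a sketch under assumptions that are not confirmed in this thread: the host pool is the 1024 pages suggested earlier, the hugepage size is 2048 kB (the usual x86_64 default), every sandbox preallocates its full default_memory from /dev/hugepages (as prealloc=on in the error suggests), and nothing else consumes hugepages. The function name is hypothetical.

```shell
# vms_that_fit POOL_PAGES PAGE_KB VM_MB
# Prints how many sandboxes of VM_MB megabytes fit in the hugepage pool at once.
vms_that_fit() {
    pool_kb=$(( $1 * $2 ))      # total pool size in kB
    vm_kb=$(( $3 * 1024 ))      # memory preallocated per sandbox in kB
    echo $(( pool_kb / vm_kb ))
}

vms_that_fit 1024 2048 2048     # default_memory = 2048 -> prints 1
vms_that_fit 1024 2048 512      # default_memory = 512  -> prints 4
```

Under these assumptions a single 2048M sandbox uses the whole pool, and at 512M the fifth concurrent sandbox would hit the same "Cannot allocate memory" error.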

/cc @egernst @sboeuf @ganeshmaharaj

@sboeuf

sboeuf commented May 8, 2019

@ganeshmaharaj is already aware that hugepages are actually not needed, which is why he has a patch in flight I think. If the issue comes from hugepages, this will be solved by not using them because they're not needed per se.

@chavafg
Contributor Author

chavafg commented May 9, 2019

/test-nemu

@jcvenegas
Member

@chavafg hi, any update?

@chavafg
Contributor Author

chavafg commented May 14, 2019

/test-nemu

@ganeshmaharaj

/test

@jodh-intel
Contributor

Restarted the Nemu test, which failed with:

21:48:06 [4] • Failure in Spec Teardown (AfterEach) [125.054 seconds]
21:48:06 [4] load
21:48:06 [4] /tmp/jenkins/workspace/kata-containers-tests-ubuntu-nemu/go/src/github.com/kata-containers/tests/integration/docker/load_test.go:15
21:48:06 [4]   load with docker [AfterEach]
21:48:06 [4]   /tmp/jenkins/workspace/kata-containers-tests-ubuntu-nemu/go/src/github.com/kata-containers/tests/integration/docker/load_test.go:36
21:48:06 [4]     load a container
21:48:06 [4]     /tmp/jenkins/workspace/kata-containers-tests-ubuntu-nemu/go/src/github.com/kata-containers/tests/integration/docker/load_test.go:37
21:48:06 [4]       should load image
21:48:06 [4]       /tmp/jenkins/workspace/kata-containers-tests-ubuntu-nemu/go/src/github.com/kata-containers/tests/integration/docker/load_test.go:38
21:48:06 [4] 
21:48:06 [4]       Expected
21:48:06 [4]           <int>: -1
21:48:06 [4]       to equal
21:48:06 [4]           <int>: 0
21:48:06 [4] 
21:48:06 [4]       /tmp/jenkins/workspace/kata-containers-tests-ubuntu-nemu/go/src/github.com/kata-containers/tests/integration/docker/load_test.go:31

Restarted the initrd CI, which failed with this rather weird set of messages:

22:21:37 not ok 25 ctr execsync std{out,err}
22:21:37 # (in test file ctr.bats, line 946)
22:21:37 #   `[ "$status" -eq 0 ]' failed
22:21:37 # time="2019-05-15T21:21:06Z" level=error msg="File descriptor 3 (pipe:[773499]) leaked on pvdisplay invocation. Parent PID 57196: /tmp/jenkins/workspace/kata-con
22:21:37 # File descriptor 6 (/dev/mapper/control) leaked on pvdisplay invocation. Parent PID 57196: /tmp/jenkins/workspace/kata-con
22:21:37 #   Failed to find physical volume "/dev/sdb".
22:21:37 # " error="exit status 5" 
22:21:37 # 0
22:21:37 # time="2019-05-15 21:21:23.488458146Z" level=debug msg="found valid runtime 'runc' for runtime_path '/usr/local/bin/kata-runtime'
22:21:37 # " 

@chavafg
Contributor Author

chavafg commented May 16, 2019

It seems the docker memory tests with nemu (which use virtiofs in this PR) are hanging. I need to investigate further.

@chavafg
Contributor Author

chavafg commented May 27, 2019

/test-nemu

# currently we use nemu for virtiofs testing
if [ "$VIRTIO_FS" = true ] && [ "$KATA_HYPERVISOR" = "nemu" ]; then
echo "Configure virtio-fs on kata-runtime config file"
sudo crudini --set "$runtime_config_path" hypervisor.qemu virtio_fs_daemon "\"/usr/local/bin/virtiofsd-${arch}\""

After #1595, maybe we can remove the hardcoding of the virtiofsd binary and use ${KATA_NEMU_DESTDIR}

echo "Configure virtio-fs on kata-runtime config file"
sudo crudini --set "$runtime_config_path" hypervisor.qemu virtio_fs_daemon "\"/usr/local/bin/virtiofsd-${arch}\""
sudo crudini --set "$runtime_config_path" hypervisor.qemu shared_fs "\"virtio-fs\""
sudo crudini --set "$runtime_config_path" hypervisor.qemu enable_hugepages "true"

We should no longer need this. The file-based backend patch landed today, and that should take care of this part automatically.
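
A minimal sketch of what the hunk could look like with enable_hugepages dropped, as suggested above. The two helper function names are hypothetical and are not part of the PR. The section name (hypervisor.qemu), the keys, and the quoted values come from the diff shown above. The `crudini --set FILE SECTION KEY VALUE` form is the one already used there.

```shell
# Print the virtio-fs settings as "key value" lines.
# $1: path to the virtiofsd binary.
# enable_hugepages is intentionally absent (assumption: the file-based
# memory backend makes it unnecessary, per the review comment).
virtiofs_settings() {
    printf 'virtio_fs_daemon "%s"\n' "$1"
    printf 'shared_fs "%s"\n' "virtio-fs"
}

# Apply the settings to a runtime config file with crudini.
# $1: config file, $2: path to the virtiofsd binary.
apply_virtiofs_settings() {
    virtiofs_settings "$2" | while read -r key value; do
        # stdin is redirected so sudo/crudini cannot consume the settings stream
        sudo crudini --set "$1" hypervisor.qemu "$key" "$value" </dev/null
    done
}
```

It would be called as `apply_virtiofs_settings "$runtime_config_path" "/usr/local/bin/virtiofsd-${arch}"`. Keeping the settings in one function makes it easy to see, and to test, that no hugepages option is written.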

@chavafg chavafg force-pushed the topic/virtiofs-support branch 2 times, most recently from 7bb389a to 203786d Compare May 29, 2019 15:37
@chavafg
Contributor Author

chavafg commented May 29, 2019

/test-nemu

@chavafg
Contributor Author

chavafg commented May 31, 2019

@ganeshmaharaj now the tests hang in the cri-o tests when trying to create a container inside a pod. I only see a timeout error in the kata logs, and the last messages in the kata-proxy log are from systemd.

I tried to reproduce locally, these are the logs:

May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.924103968Z" level=info msg="time=\"2019-05-31T18:15:22.907783871Z\" level=debug msg=\"updating cpuset cgroup\" debug_console=false name=kata-agent path=/sys/fs/cgroup/cpuset/Burstable/pod_123-456/crio-77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb/cpuset.cpus pid=78 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.939007715Z" level=info msg="[   41.331225] systemd[1]: libmount event [rescan: yes]\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.941015935Z" level=info msg="[   41.333207] systemd[1]: run.mount: Failed to load configuration: No such file or directory\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.941307438Z" level=info msg="[   41.333560] systemd[1]: run-libcontainer.mount: Failed to load configuration: No such file or directory\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.941628441Z" level=info msg="[   41.333843] systemd[1]: run-libcontainer-77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb.mount: Failed to load configuration: No such file or directory\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.942679752Z" level=info msg="[   41.334818] systemd[1]: run-libcontainer-77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb-a02e080a289b62030927d519256cac214b99e45839eec6aa0064c0c47691d298.mount: Failed to load configuration: No such file or directory\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.943669561Z" level=info msg="[   41.335223] systemd[1]: run-libcontainer-77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb-a02e080a289b62030927d519256cac214b99e45839eec6aa0064c0c47691d298-runc.U8dhRV.mount: Changed dead -> mounted\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.944318168Z" level=info msg="[   41.336597] systemd[1]: run.mount: Collecting.\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.944473069Z" level=info msg="[   41.336836] systemd[1]: run-libcontainer.mount: Collecting.\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.944748472Z" level=info msg="[   41.336969] systemd[1]: run-libcontainer-77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb.mount: Collecting.\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.945411479Z" level=info msg="[   41.337324] systemd[1]: run-libcontainer-77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb-a02e080a289b62030927d519256cac214b99e45839eec6aa0064c0c47691d298.mount: Collecting.\n" name=kata-proxy pid=9837 sandbox=77def99ec899addc7a943b30dc3643dcadee1ea416acb4c3836e91d6833315bb source=agent
May 31 18:19:22 virtiofs-tests crio[9590]: time="2019-05-31 18:19:22.718811071Z" level=error msg="Container creation timeout (4m0s)"

any idea?

@egernst
Member

egernst commented Jun 11, 2019

@chavafg can we just use the nemu hypervisor with its default toml now? Is this PR still needed?

@chavafg
Contributor Author

chavafg commented Jun 11, 2019

I need to change the PR to now use the configuration-nemu.toml instead of the default one.
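
One way the CI setup script could pick the config file, shown as a sketch rather than the code that was merged. The function name and the default directory (/usr/share/defaults/kata-containers) are assumptions. Only configuration-nemu.toml and the KATA_HYPERVISOR variable come from this PR.

```shell
# runtime_config_for HYPERVISOR [CONFIG_DIR]
# Prints the runtime config file to use for the given hypervisor.
runtime_config_for() {
    # CONFIG_DIR default is an assumption; adjust to the actual install prefix.
    dir="${2:-/usr/share/defaults/kata-containers}"
    case "$1" in
        nemu) echo "${dir}/configuration-nemu.toml" ;;  # already defaults to virtio-fs
        *)    echo "${dir}/configuration.toml" ;;       # default runtime config
    esac
}
```

For example, `runtime_config_path="$(runtime_config_for "$KATA_HYPERVISOR")"` would replace the per-key crudini edits for the nemu case.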

@chavafg chavafg added the do-not-merge PR has problems or depends on another label Jun 11, 2019
@ganeshmaharaj

ganeshmaharaj commented Jun 18, 2019

@ganeshmaharaj now the tests hang in the cri-o tests when trying to create a container inside a pod. I only see a timeout error in the kata logs, and the last messages in the kata-proxy log are from systemd.

I tried to reproduce locally, these are the logs:

May 31 18:15:22 virtiofs-tests kata-proxy[9837]: time="2019-05-31T18:15:22.924103968Z" level=info msg="time=\"2019-05-31T18:15:22.907783871Z\" level=debug msg=\"updating cpuset cgroup\" debug_console=false name=kata-agent path=/sys/fs/cgroup/cpuset/Burstable/pod_123-456/crio-77def99ec899addc7a943b30dc3643dcadee
<snip>
May 31 18:19:22 virtiofs-tests crio[9590]: time="2019-05-31 18:19:22.718811071Z" level=error msg="Container creation timeout (4m0s)"

any idea?

I seem to have missed this. Is it still an issue? I just tested and it seems to be working fine, and we have fixed the hotplug issue. Once the patch lands in runtime, we can turn that test back on as well.

@chavafg
Contributor Author

chavafg commented Jun 25, 2019

/test-nemu

The kata-containers configuration file for nemu already
uses virtiofs as default. Use this config file to run
the CI with nemu and virtiofs.

In addition, this change also skips memory related tests
as kata-containers/runtime#1745
is still open.

Fixes: kata-containers#1536.

Signed-off-by: Salvador Fuentes <[email protected]>

We still have some issues running memory hotplug using
virtiofs, so many of the cri-o and k8s tests do not run
as expected.
This should be resolved on kata-containers/runtime#1810,
but in the meantime run only docker related tests.

Signed-off-by: Salvador Fuentes <[email protected]>
@chavafg chavafg removed the do-not-merge PR has problems or depends on another label Jun 25, 2019
@chavafg
Contributor Author

chavafg commented Jun 25, 2019

/test

@jodh-intel
Contributor

Greeeeeeeeeen!

@jodh-intel jodh-intel merged commit 86d2327 into kata-containers:master Jun 26, 2019
@chavafg chavafg deleted the topic/virtiofs-support branch July 17, 2019 19:48
Development

Successfully merging this pull request may close these issues.

Add virtiofs support for CI