This repository has been archived by the owner on May 12, 2021. It is now read-only.

VMCache: the new function that creates VMs as caches before using it #1166

Merged
merged 3 commits into from
Mar 8, 2019

Conversation

teawater
Member

VM cache helps speed up new container creation.
To use it, set the option "enable_vm_cache" to true and use the
"kata-runtime cache" command to start the VM cache server, which creates
some VMs as the VM cache. Each kata-runtime then requests a VM from
the VM cache server.

Currently, VM cache cannot work with VM templating or vsock,
and it only supports qemu.

Fixes: #52

Signed-off-by: Hui Zhu [email protected]

@teawater teawater force-pushed the vm_cache branch 5 times, most recently from 2b8d953 to 1e7bb03 Compare January 25, 2019 09:44
@teawater teawater force-pushed the vm_cache branch 3 times, most recently from e8380e3 to 151ab6d Compare January 30, 2019 02:40
@teawater
Member Author

/test

@teawater
Member Author

/test

@teawater teawater force-pushed the vm_cache branch 4 times, most recently from 7c9e5a5 to 13ff7e3 Compare January 30, 2019 14:05
@teawater
Member Author

/test

@raravena80
Member

@teawater ping, any updates? Thx!

@teawater
Member Author

/test

@teawater
Member Author

@raravena80 I will fix the test issue after Chinese New Year.

@teawater teawater force-pushed the vm_cache branch 3 times, most recently from 819a0b3 to 41e2688 Compare February 10, 2019 15:45
@teawater
Member Author

/test

@teawater
Member Author

The jenkins-ci-ARM-ubuntu-18-04 failure is related to #1202.
The jenkins-ci-ubuntu-16-04-firecracker failure is related to #1221.
So I think these 2 failures are not related to this PR.

@teawater
Member Author

@amshinde @jodh-intel I pushed a new version addressing your comments. Please help review it.

@teawater teawater force-pushed the vm_cache branch 2 times, most recently from 48df99c to 5b351c5 Compare February 12, 2019 10:34
@teawater
Member Author

/test

@teawater teawater requested a review from bergwolf February 25, 2019 07:27
@teawater
Member Author

jenkins-metrics-ubuntu-16-04 failed:

Report Summary:
+-----+----------------------+-------+--------+--------+-------+--------+--------+------+------+-----+
| P/F |         NAME         |  FLR  |  MEAN  |  CEIL  |  GAP  |  MIN   |  MAX   | RNG  | COV  | ITS |
+-----+----------------------+-------+--------+--------+-------+--------+--------+------+------+-----+
| *F* | boot-times           | 94.1% | 127.5% | 105.9% | 11.9% | 126.4% | 129.1% | 2.1% | 0.6% |  20 |
| *F* | memory-footprint     | 95.0% | 110.5% | 105.0% | 10.0% | 110.5% | 110.5% | 0.0% | 0.0% |   1 |
| P   | memory-footprint-ksm | 95.0% | 103.3% | 105.0% | 10.0% | 103.3% | 103.3% | 0.0% | 0.0% |   1 |
+-----+----------------------+-------+--------+--------+-------+--------+--------+------+------+-----+
Fails: 2, Passes 1
Failed
checkmetrics FAILED (1)

This PR doesn't change the default behavior of kata-runtime.
So I think this failure is not related to this PR.

@grahamwhaley
Contributor

@teawater - ah, yes, since the 4.19 kernel update landed, the metrics will need adjusting. I had to leave it a few days to gather data so I can reset the bounds checks - let me try to update those today...
In the meantime, ignore the metrics CI....

Member

@bergwolf bergwolf left a comment

Nice work @teawater !

	return errors.New("VM cache just support kata agent")
}
if config.HypervisorConfig.UseVSock {
	return errors.New("config vsock conflicts with VM cache, please disable one of them")
Member

It does not really conflict. Need to send the guest vsock cid/port over cache server grpc. Maybe a todo in future PR.
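
For readers following this thread, the restrictions stated in the PR description (qemu only, kata agent only, no VM templating, no vsock) could be checked in one place roughly as sketched below. This is only an illustration, not the code under review: the `Config` struct, its field names, and the qemu and templating error strings are hypothetical; only the two error messages quoted in the diff above come from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical, simplified stand-in for the runtime configuration.
type Config struct {
	VMCacheNumber  uint   // 0 means VM cache is disabled
	EnableTemplate bool   // VM templating
	UseVSock       bool   // vsock transport to the agent
	HypervisorType string // e.g. "qemu"
	AgentType      string // e.g. "kata"
}

// checkVMCacheConfig rejects option combinations that VM cache does not support.
func checkVMCacheConfig(c Config) error {
	if c.VMCacheNumber == 0 {
		return nil // VM cache disabled, nothing to check
	}
	if c.HypervisorType != "qemu" {
		// Hypothetical message, not taken from the PR.
		return errors.New("VM cache just support qemu")
	}
	if c.AgentType != "kata" {
		return errors.New("VM cache just support kata agent")
	}
	if c.EnableTemplate {
		// Hypothetical message, not taken from the PR.
		return errors.New("VM cache cannot work with VM templating")
	}
	if c.UseVSock {
		return errors.New("config vsock conflicts with VM cache, please disable one of them")
	}
	return nil
}

func main() {
	cfg := Config{VMCacheNumber: 2, HypervisorType: "qemu", AgentType: "kata", UseVSock: true}
	fmt.Println(checkVMCacheConfig(cfg))
}
```

If the vsock cid/port were later sent over the cache server gRPC as suggested, the last check could presumably be dropped.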

Contributor

@jodh-intel jodh-intel left a comment

Thanks @teawater.

lgtm

Is there an issue in 'tests' to create CI tests for this feature?

cli/factory.go Outdated
if err != nil {
return nil, err
}
if err = os.Chmod(path, 0660); err != nil {
Contributor

Does this need to be 660? Could we use 600 instead?

Member Author

Updated.

I also opened kata-containers/tests#1259 to track adding a CI test for vmcache.

Contributor

Thanks @teawater !

@teawater
Member Author

teawater commented Mar 1, 2019

/test

@teawater
Member Author

teawater commented Mar 2, 2019

jenkins-ci-ARM-ubuntu-18-04 failed:

INFO: creating OCI bundle in /tmp/kata-runtime-944448708/bundle for tests to use
time="2019-03-02T10:14:34+08:00" level=debug msg="converting /tmp/kata-runtime-944448708/bundle/config.json" source=virtcontainers subsystem=oci
=== RUN   TestConsoleFromFile
--- PASS: TestConsoleFromFile (0.00s)
=== RUN   TestNewConsole
--- FAIL: TestNewConsole (0.00s)
    assertions.go:239:
	Error Trace:	console_test.go:28
	Error:      	Received unexpected error:
	            	open /dev/ptmx: no such device
	Messages:   	failed to create a new console: open /dev/ptmx: no such device
panic: runtime error: invalid memory address or nil pointer dereference
	panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0xacb5fc]

goroutine 101 [running]:
testing.tRunner.func1(0x40003ce000)
	/usr/local/go/src/testing/testing.go:792 +0x30c
panic(0xc2e900, 0x1560890)
	/usr/local/go/src/runtime/panic.go:513 +0x18c
github.com/kata-containers/runtime/cli.(*Console).Close(0x0, 0x4000344f80, 0x2)
	/home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/runtime/cli/console.go:88 +0x44
panic(0xc2e900, 0x1560890)
	/usr/local/go/src/runtime/panic.go:513 +0x18c
github.com/kata-containers/runtime/cli.(*Console).Path(...)
	/home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/runtime/cli/console.go:73
github.com/kata-containers/runtime/cli.TestNewConsole(0x40003ce000)
	/home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/runtime/cli/console_test.go:31 +0xf0
testing.tRunner(0x40003ce000, 0xd89c98)
	/usr/local/go/src/testing/testing.go:827 +0xa8
created by testing.(*T).Run
	/usr/local/go/src/testing/testing.go:878 +0x2b0
FAIL	github.com/kata-containers/runtime/cli	1.682s
Makefile:532: recipe for target 'go-test' failed
make: *** [go-test] Error 1
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for :.* : True
Logical operation result is TRUE
Running script  : #!/bin/bash

This PR doesn't change the default behavior of kata-runtime execution. So it has nothing to do with this failure.
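
The trace above appears to show a common Go pattern: console creation fails (no /dev/ptmx on the CI host), the test continues past the failed assertion, and `Path` and then `Close` are called on a nil `*Console`. A minimal sketch of that failure mode is below. The `Console` type, `newConsole`, and `pathOrPanic` here are simplified, hypothetical stand-ins rather than the code in cli/console.go.

```go
package main

import (
	"errors"
	"fmt"
)

// Console is a hypothetical stand-in for a console type.
type Console struct {
	path string
}

// newConsole simulates a host where /dev/ptmx is missing.
func newConsole() (*Console, error) {
	return nil, errors.New("open /dev/ptmx: no such device")
}

// Path dereferences the receiver, so it panics on a nil *Console.
func (c *Console) Path() string { return c.path }

// pathOrPanic calls Path and reports whether the call panicked.
func pathOrPanic(c *Console) (p string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return c.Path(), false
}

func main() {
	c, err := newConsole()

	// Unchecked use: continuing after the error dereferences a nil pointer.
	_, panicked := pathOrPanic(c)
	fmt.Println("error:", err, "panicked:", panicked)

	// Checked use: stop as soon as construction fails.
	if err != nil {
		fmt.Println("skipping console use:", err)
		return
	}
	fmt.Println(c.Path())
}
```

In a test, stopping at the first failed check (for example by returning early, or with an assertion helper that aborts the test) avoids the secondary panic and leaves only the original "no such device" error in the log.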

@teawater
Member Author

teawater commented Mar 3, 2019

@sboeuf please review the new version. Thanks.

@sboeuf

sboeuf commented Mar 5, 2019

@teawater I will review this today!

@sboeuf sboeuf left a comment

@teawater I have one comment regarding the vendoring.

Also, the code looks fine to me, but I thought you would add some documentation. Have you created an *.md file somewhere? Maybe the documentation repo?

We need to make sure documentation gets merged along with this PR.

@@ -74,6 +74,10 @@
name = "github.com/firecracker-microvm/firecracker-go-sdk"
revision = "961461227bddf7e40a1d690634e866c343910f86"

[[constraint]]
name = "github.com/gogo/protobuf"
revision = "342cbe0a04158f6dcb03ca0079991a51a4248c02"

I would expect that after you add the constraint and run dep ensure -update, you would have a modified Gopkg.lock that also needs to be committed here.

Member Author

> I would expect that after you add the constraint and run dep ensure -update, you would have a modified Gopkg.lock that also needs to be committed here.

It is already in

name = "github.com/gogo/protobuf"

because it's a sub-dependency.


Ok I get that, but I was expecting that since you're modifying the Gopkg.toml, running dep ensure -update would at least modify the digest of the Gopkg.lock:

[[projects]]
  digest = "1:18108594151654e9e696b27b181b953f9a90b16bf14d253dd1b397b025a1487f"

Member Author

> Ok I get that, but I was expecting that since you're modifying the Gopkg.toml, running dep ensure -update would at least modify the digest of the Gopkg.lock:
>
> [[projects]]
>   digest = "1:18108594151654e9e696b27b181b953f9a90b16bf14d253dd1b397b025a1487f"

Done.


Thanks!

@gnawux gnawux requested a review from sboeuf March 6, 2019 02:43
@teawater
Member Author

teawater commented Mar 7, 2019

> @teawater I have one comment regarding the vendoring.
>
> Also, the code looks fine to me, but I thought you would add some documentation. Have you created an *.md file somewhere? Maybe the documentation repo?
>
> We need to make sure documentation gets merged along with this PR.

I added a PR, kata-containers/documentation#393, for the vmcache documentation.

@sboeuf

sboeuf commented Mar 7, 2019

Thanks for opening kata-containers/documentation#393 @teawater 👍

teawater added 2 commits March 8, 2019 10:05
VMCache is a new function that creates VMs as caches before they are used.
It helps speed up new container creation.
The function consists of a server and some clients communicating
through a Unix socket.  The protocol is gRPC, defined in protocols/cache/cache.proto.
The VMCache server creates some VMs and caches them with factory cache.
It converts a VM to gRPC format and transports it when it gets
a request from a client.
Factory grpccache is the VMCache client.  It requests a gRPC format
VM and converts it back to a VM.  If the VMCache function is enabled,
kata-runtime requests a VM from factory grpccache when it creates
a new sandbox.
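
The caching idea described above (create VMs ahead of time, hand one out per request, refill in the background) can be sketched with a buffered channel. This is only an illustration under stated assumptions, not the PR's factory cache code: the `VM`, `Cache`, `NewCache`, and `GetVM` names are hypothetical, and the gRPC transport between server and client is left out.

```go
package main

import "fmt"

// VM is a hypothetical placeholder for a cached virtual machine.
type VM struct{ ID int }

// Cache keeps up to n pre-created VMs ready in a buffered channel.
type Cache struct {
	vms  chan *VM
	done chan struct{}
}

// NewCache starts a goroutine that keeps creating VMs with create until Close.
func NewCache(n int, create func() *VM) *Cache {
	c := &Cache{vms: make(chan *VM, n), done: make(chan struct{})}
	go func() {
		for {
			vm := create()
			select {
			case c.vms <- vm: // blocks while the cache is full
			case <-c.done:
				return
			}
		}
	}()
	return c
}

// GetVM hands out one cached VM; the goroutine then refills the free slot.
func (c *Cache) GetVM() *VM { return <-c.vms }

// Close stops the refill goroutine.
func (c *Cache) Close() { close(c.done) }

func main() {
	next := 0
	c := NewCache(2, func() *VM { next++; return &VM{ID: next} })
	defer c.Close()
	for i := 0; i < 3; i++ {
		fmt.Println("got VM", c.GetVM().ID)
	}
}
```

A real server would also need to shut down VMs that are still cached when it stops; this sketch simply drops them.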

VMCache has two options.
vm_cache_number specifies the number of caches of VMCache:
unspecified or == 0   --> VMCache is disabled
> 0                   --> will be set to the specified number
vm_cache_endpoint specifies the address of the Unix socket.
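
The rule for vm_cache_number maps naturally onto an unsigned integer whose zero value means "disabled". A small sketch, assuming a hypothetical `VMCacheOptions` struct (the struct, its methods, and the example socket path are illustrative and not taken from the commit):

```go
package main

import "fmt"

// VMCacheOptions is a hypothetical struct mirroring the two options.
type VMCacheOptions struct {
	Number   uint   // vm_cache_number: 0 (or unspecified) disables VMCache
	Endpoint string // vm_cache_endpoint: address of the Unix socket
}

// Enabled reports whether VMCache is on: any value above zero enables it.
func (o VMCacheOptions) Enabled() bool { return o.Number > 0 }

// Describe summarizes the effective setting.
func (o VMCacheOptions) Describe() string {
	if !o.Enabled() {
		return "VMCache disabled"
	}
	return fmt.Sprintf("VMCache enabled: %d cached VMs via %s", o.Number, o.Endpoint)
}

func main() {
	// The socket path below is a made-up example value.
	fmt.Println(VMCacheOptions{}.Describe())
	fmt.Println(VMCacheOptions{Number: 3, Endpoint: "/tmp/example-cache.sock"}.Describe())
}
```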

This commit just includes the core and the client of VMCache.

Currently, VM cache cannot work with VM templating or vsock,
and it only supports qemu.

Fixes: kata-containers#52

Signed-off-by: Hui Zhu <[email protected]>

When VMCache is enabled, factory init will run as a VMCache server.

Fixes: kata-containers#52

Signed-off-by: Hui Zhu <[email protected]>
@teawater teawater force-pushed the vm_cache branch 2 times, most recently from 4e51ca5 to 39272d3 Compare March 8, 2019 03:35
@teawater
Member Author

teawater commented Mar 8, 2019

/test

VMCache code uses github.com/gogo/protobuf.

Fixes: kata-containers#52

Signed-off-by: Hui Zhu <[email protected]>
@teawater
Member Author

teawater commented Mar 8, 2019

/test

@sboeuf sboeuf merged commit 80cdf89 into kata-containers:master Mar 8, 2019
7 participants