Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

Stable 1.7 1837 #1839

Closed
wants to merge 33 commits into from
Closed

Stable 1.7 1837 #1839

wants to merge 33 commits into from

Conversation

egernst
Copy link
Member

@egernst egernst commented Jul 1, 2019

No description provided.

chavafg and others added 30 commits June 3, 2019 13:27
We need to build kata-runtime to have the correct files
in place to be able to run the static checks script.

Fixes #1716.

Signed-off-by: Salvador Fuentes <[email protected]>
(cherry picked from commit e8bf810)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
Here we have done with logger and container ID map
Just delete these code.
fixes #1740

Signed-off-by: Haomin Tsai <[email protected]>
(cherry picked from commit bdae295)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
There is an issue that ctrl-c stop vmcache server will stop all
containers that its VM is created by it.
The cause is kata-proxy and vmcache server use same tty, for example:
ps -e | grep kata
3617 pts/5    00:00:00 kata-runtime
3636 pts/5    00:00:00 kata-proxy
Ctrl-c will send signal to both kata-proxy and vmcache server.
Then the containers that its VM is created by this vmcache server will
quit with it.

Set Setsid to true when exec kata-proxy to handle this issue.

Fixes: #1726

Signed-off-by: Hui Zhu <[email protected]>
(cherry picked from commit 19115ef)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
After previous commit, found that kata-proxy is not quit
when vmcache server is stopped by ctrl-c.
The cause is current kata-proxy is setsid when it exec.  It will
not get the signal ctrl-c.

Call vm.Disconnect() when close vm in cache factory to handle
this issue.

Fixes: #1726

Signed-off-by: Hui Zhu <[email protected]>
(cherry picked from commit 7bf6c67)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
The rootfs image was fixed, now the DAX metadata and 2 MBRs headers are part
of the same image. Mounting the rootfs partiton with an offset of 2M is no
more needed, since the first MBR is read by partx or losetup by default.

fixes #1443

Signed-off-by: Julio Montes [email protected]
(cherry picked from commit 82e51d4)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
It should pass the container id instead of sandbox id.

Fixes:#1672

Signed-off-by: lifupan <[email protected]>
(cherry picked from commit 5e1f5ca)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
According to CRI specs, kubelet will call StopPodSandbox()
at least once before calling RemovePodSandbox, and this call
is idempotent, and must not return an error if all relevant
resources have already been reclaimed. And in that call it will
send a SIGKILL signal first to try to stop the container, thus
once the container has terminated, here should ignore this signal
and return directly.

Fixes:#1672

Signed-off-by: lifupan <[email protected]>
(cherry picked from commit 0d535f5)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
Kubelet would cleanup the pod cgroup resources and kill the processes
in the pod cgroups when it detected all of the containers in a pod exited,
thus shimv2 should close the hypervisor process once the podsandbox container
exited, otherwise, the hypervisor process would be killed by kubelet and
made shimv2 failed to shutdown the sandbox.

Fixes:#1672

Signed-off-by: lifupan <[email protected]>
(cherry picked from commit f301c95)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
```
//the network namespace created by cni plugin
netns, err = namespaces.NamespaceRequired(ctx)
if err != nil {
        return nil, errors.Wrap(err, "create namespace")
}
```

the netns is a containerd namespace concept, it not netns, event a cni
set netns for this, this is a tricky way, so remove the logic.

Fixes: #1692

Signed-off-by: Ace-Tang <[email protected]>
(cherry picked from commit d6b3bff)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
Use `kata-containers.runtime` that is the runtime binary, to
collect the data if the kata-runtime binary is not installed

fixes #1720

Signed-off-by: Julio Montes <[email protected]>
(cherry picked from commit 19288aa)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
virtio-fs is now available in 1.7 release and needs hugepages enabled.
Updating version of NEMU that ships with kata by default which contains
the fixes for hugepages, machine_type=virt and network access.

Fixes: #1709
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
(cherry picked from commit 722ac5a)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
nemu needs to be configured with:
`machine_type = "virt"` by default.

In addition, this commit removes
`machine_accelerators="virt"` which was added instead
of `machine_type` in a previous commit.

Fixes: #1707.

Signed-off-by: Salvador Fuentes <[email protected]>
(cherry picked from commit 6be5e5f)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
Fix the test case TestGetShmSizeBindMounted by
setting the right ShmSize for ppc64le.

Fixes: #1702

Signed-off-by: Nitesh Konkar [email protected]
(cherry picked from commit 1789b65)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
Now that CRI-O released a new version we can update it.

Fixes #1696

Signed-off-by: Gabriela Cervantes <[email protected]>
(cherry picked from commit 5d527d7)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
Set the minimum golang version to 1.11.10, the latest stable 1.11 version
at the time of writing. Go 1.11 is required to build the agent with working
vsock support.

Fixes: #1693

Signed-off-by: Marco Vedovati <[email protected]>
(cherry picked from commit c22b15d)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
To help trace virtiofsd issues.

Signed-off-by: Peng Tao <[email protected]>
(cherry picked from commit d0aae80)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
If virtiofsd fails to initialize and stops unexpected,
qemu might hang forever. We just stop the qemu process.
Resource cleanup will be done by others.

Fixes: #1690
Signed-off-by: Peng Tao <[email protected]>
(cherry picked from commit 89e0dfa)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
Got a defunct kata-proxy after kata quit when VMCache is enabled.
The reason is vmcache server opens kata-proxy but doesn't wait it.

If VMCache is disabled, kata-runtime will quit before kata-proxy.
So it will not meet the issue.

Open a special goroutine do cmd.Wait in kataProxy.start to handle
the isssue.

Fixes: #1678

Signed-off-by: Hui Zhu <[email protected]>
(cherry picked from commit 00d03c1)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
Fixes: #1673

Signed-off-by: Zha Bin <[email protected]>
(cherry picked from commit bdb1047)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
If kata containers is using vfio and vhost net,the unbinding
of vfio would be hang. In the scenario, vhost net kernel thread
takes a reference to the qemu's mm, and the reference also includes
the mmap regions on the vfio device file. so vhost kernel thread
would be not released when qemu is killed as the vhost file
descriptor still is opened by shim v2 process, and the vfio device
is not released because there's still a reference to the mmap.

Fixes: #1669

Signed-off-by: Yang, Wei <[email protected]>
Signed-off-by: Eric Ernst <[email protected]>
(cherry picked from commit 071030b)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
proxy will never be use with the Firecracker VMM. Keeping this header
will result in runtime failures, since the configuration will be parsed
on the path searched for.

Since vsock will always be used, remove the proxy section.

Fixes: #1761

Signed-off-by: Eric Ernst <[email protected]>
(cherry picked from commit bbe5584)
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
- Backports for 1.7.1

a480f27 fc-toml: remove proxy section in config
b798c28 shimv2: Close vhostfd after vm get vhostfd
8c199e2 network: delete IP addrs on bridge model to prevent ARP conflict
7c7da54 kata_proxy: Open a special goroutine do cmd.Wait
fb2a995 qemu: stop qemu process when virtiofsd quits
52f0193 qemu: print virtiofsd logs when debug is on
0199d89 versions: Update golang to 1.11.10
58f7eea versions: Update CRI-O version to 1.14.1
feddee0 virtcontainers: Set correct Shmsize for ppc64le
a268c66 nemu-config: Add machine_type to config file
97cf3c9 nemu-config: fix nemu for ci
2c444f3 data/kata-collect-data: support kata containers snap
a0c413a shimv2: remove use containerd ns as netns
9661586 shimv2: shutdown the sandbox when sandbox container exited
eb75d0c shimv2: kill a container return directly once the container termianted
a98871e shimv2: fix the issue of passing the wrong container id
ad4b07d data: Revert pull request #1405
5eecdae cache: Call vm.Disconnect() when close vm
6434414 kata_proxy: Set Setsid to true when exec kata-proxy
3cb6316 runtime : delete redundant code in CreateContainer
0a46998 ci: Build kata-runtime before running static checks

Signed-off-by: katacontainersbot <[email protected]>
hugepages were enbled by default on NEMU to allow use of virtio-fs. kata
now has a change where virtio-fs will default to use /dev/shm as the
shared memory file backing location. With that, we should be able to
disable default hugepages for NEMU

Fixes: #1775
Signed-off-by: Ganesh Maharaj Mahalingam <[email protected]>
(cherry picked from commit a75db86)
NEMU: Disable default hugepages enabling for virtio-fs
detail how kata work with libnetwork
1. kata create a new netns
2. with EnterNS, kata change netns to the created one.
3. in pre-start hook, kata will re-exec libnetwork process
libnetwork-setkey, and send self pid to it. libnetwork use
/proc/pid/ns/net to find the netns kata use, and set veth into the netns.

v1/v2 shim use the same way to create network, v1 can successful
because EnterNS changed both current thread and main thread's netns.
But use v2 shim, only changed current thread netns, main thread still
use host netns, so it fails. Looks like v1 just lucky to be successful.
In kata, `state.Pid` should be tid.

Fixes: #1788

Signed-off-by: Ace-Tang <[email protected]>
[cherry-pick 1.7]: katautils: fix shim v2 fail to work with libnetwork
- [cherry-pick 1.7]: katautils: fix shim v2 fail to work with libnetwork
- NEMU: Disable default hugepages enabling for virtio-fs

0066fa5 katautils: fix shim v2 fail to work with libnetwork
8cf29dd NEMU: Disable default hugepages enabling for virtio-fs

Signed-off-by: Peng Tao <[email protected]>
Eric Ernst and others added 3 commits June 22, 2019 20:53
Update to newer stable kernel

Fixes: #1816

Signed-off-by: Eric Ernst <[email protected]>
versions: update kernel to 4.19.52
Kubernetes moved CRI document within the sig-node directory. Updating
README.md accordingly.

Fixes: 1837

Signed-off-by: Eric Ernst <[email protected]>
@egernst egernst requested a review from a team as a code owner July 1, 2019 19:37
@egernst egernst closed this Jul 1, 2019
@egernst egernst deleted the stable-1.7-1837 branch September 17, 2019 15:29
egernst pushed a commit to egernst/runtime that referenced this pull request Feb 9, 2021
This updates grpc-go vendor package to v1.11.3 release, to fix server.Stop()
handling so that server.Serve() does not wait blindly.

Full commit list:
d11072e (tag: v1.11.3) Change version to 1.11.3
d06e756 clientconn: add support for unix network in DialContext. (kata-containers#1883)
452c2a7 Change version to 1.11.3-dev
d89cded (tag: v1.11.2) Change version to 1.11.2
98ac976 server: add grpc.Method function for extracting method from context (kata-containers#1961)
0f5fa28 Change version to 1.11.2-dev
1e2570b (tag: v1.11.1) Change version to 1.11.1
d28faca client: Fix race when using both client-side default CallOptions and per-call CallOptions (kata-containers#1948)
48b7669 Change version to 1.11.1-dev
afc05b9 (tag: v1.11.0) Change version to 1.11.0
f2620c3 resolver: keep full unparsed target string if scheme in parsed target is not registered (kata-containers#1943)
9d2250f status: rename Status to GRPCStatus to avoid name conflicts (kata-containers#1944)
2756956 status: Allow external packages to produce status-compatible errors (kata-containers#1927)
0ff1b76 routeguide: reimplement distance calculation
dfbefc6 service reflection can lookup enum, enum val, oneof, and field symbols (kata-containers#1910)
32d9ffa Documentation: Fix broken link in rpc-errors.md (kata-containers#1935)
d5126f9 Correct Go 1.6 support policy (kata-containers#1934)
5415d18 Add documentation and example of adding details to errors (kata-containers#1915)
57640c0 Allow storing alternate transport.ServerStream implementations in context (kata-containers#1904)
031ee13 Fix Test: Update the deadline since small deadlines are prone to flakes on Travis. (kata-containers#1932)
2249df6 gzip: Add ability to set compression level (kata-containers#1891)
8124abf credentials/alts: Remove the enable_untrusted_alts flag (kata-containers#1931)
b96718f metadata: Fix bug where AppendToOutgoingContext could modify another context's metadata (kata-containers#1930)
738eb6b fix minor typos and remove grpc.Codec related code in TestInterceptorCanAccessCallOptions (kata-containers#1929)
211a7b7 credentials/alts: Update ALTS "New" APIs (kata-containers#1921)
fa28bef client: export types implementing CallOptions for access by interceptors (kata-containers#1902)
ec9275b travis: add Go 1.10 and run vet there instead of 1.9 (kata-containers#1913)
13975c0 stream: split per-attempt data from clientStream (kata-containers#1900)
2c2d834 stats: add BeginTime to stats.End (kata-containers#1907)
3a9e1ba Reset ping strike counter right before sending out data. (kata-containers#1905)
90dca43 resolver: always fall back to default resolver when target does not follow URI scheme (kata-containers#1889)
9aba044 server: Convert all non-status errors to codes.Unknown (kata-containers#1881)
efcc755 credentials/alts: change ALTS protos to match the golden version (kata-containers#1908)
0843fd0 credentials/alts: fix infinite recursion bug [in custom error type] (kata-containers#1906)
207e276 Fix test race: Atomically access minConnecTimout in testing environment. (kata-containers#1897)
3ae2a61 interop: Add use_alts flag to client and server binaries (kata-containers#1896)
5190b06 ALTS: Simplify "New" APIs (kata-containers#1895)
7c5299d Fix flaky test: TestCloseConnectionWhenServerPrefaceNotReceived (kata-containers#1870)
f0a1202 examples: Replace context.Background with context.WithTimeout (kata-containers#1877)
a1de3b2 alts: Change ALTS proto package name (kata-containers#1886)
2e7e633 Add ALTS code (kata-containers#1865)
583a630 Expunge error codes that shouldn't be returned from library (kata-containers#1875)
2759199 Small spelling fixes (unknow -> unknown) (kata-containers#1868)
12da026 clientconn: fix a typo in GetMethodConfig documentation (kata-containers#1867)
dfa1834 Change version to 1.11.0-dev (kata-containers#1863)
46fd263 benchmarks: add flag to benchmain to use bufconn instead of network (kata-containers#1837)
3926816 addrConn: Report underlying connection error in RPC error (kata-containers#1855)
445b728 Fix data race in TestServerGoAwayPendingRPC (kata-containers#1862)
e014063 addrConn: keep retrying even on non-temporary errors (kata-containers#1856)
484b3eb transport: fix race causing flow control discrepancy when sending messages over server limit (kata-containers#1859)
6c48c7f interop test: Expect io.EOF from stream.Send() (kata-containers#1858)
08d6261 metadata: provide AppendToOutgoingContext interface (kata-containers#1794)
d50734d Add status.Convert convenience function (kata-containers#1848)
365770f streams: Stop cleaning up after orphaned streams (kata-containers#1854)
7646b53 transport: support stats.Handler in serverHandlerTransport (kata-containers#1840)
104054a Fix connection drain error message (kata-containers#1844)
d09ec43 Implement unary functionality using streams (kata-containers#1835)
37346e3 Revert "Add WithResolverUserOptions for custom resolver build options" (kata-containers#1839)
424e3e9 Stream: do not cancel ctx created with service config timeout (kata-containers#1838)
f9628db Fix lint error and typo (kata-containers#1843)
0bd008f stats: Fix bug causing trailers-only responses to be reported as headers (kata-containers#1817)
5769e02 transport: remove unnecessary rstReceived (kata-containers#1834)
0848a09 transport: remove redundant check of stream state in Write (kata-containers#1833)
c22018a client: send RST_STREAM on client-side errors to prevent server from blocking (kata-containers#1823)
82e9f61 Use keyed fields for struct initializers (kata-containers#1829)
5ba054b encoding: Introduce new method for registering and choosing codecs (kata-containers#1813)
4f7a2c7 compare atomic and mutex performance in case of contention. (kata-containers#1788)
b71aced transport: Fix a data race when headers are received while the stream is being closed (kata-containers#1814)
46bef23 Write should fail when the stream was done but context wasn't cancelled. (kata-containers#1792)
10598f3 Explain target format in DialContext's documentation (kata-containers#1785)
08b7bd3 gzip: add Name const to avoid typos in usage (kata-containers#1804)
8b02d69 remove .please-update (kata-containers#1800)
1cd2346 Documentation: update broken wire.html link in metadata package. (kata-containers#1791)
6913ad5 Document that all errors from RPCs are status errors (kata-containers#1782)
8a8ac82 update const order (kata-containers#1770)
e975017 Don't set reconnect parameters when the server has already responded. (kata-containers#1779)
7aea499 credentials: return Unavailable instead of Internal for per-RPC creds errors (kata-containers#1776)
c998149 Avoid copying headers/trailers in unary RPCs unless requested by CallOptions (kata-containers#1775)
8246210 Update version to 1.10.0-dev (kata-containers#1777)
17c6e90 compare atomic and mutex performance for incrementing/storing one variable (kata-containers#1757)
65c901e Fix flakey test. (kata-containers#1771)
7f2472b grpclb: Remove duplicate init() (kata-containers#1764)
09fc336 server: fix bug preventing Serve from exiting when Listener is closed (kata-containers#1765)
035eb47 Fix TestGracefulStop flakiness (kata-containers#1767)
2720857 server: fix race between GracefulStop and new incoming connections (kata-containers#1745)
0547980 Notify parent ClientConn to re-resolve in grpclb (kata-containers#1699)
e6549e6 Add dial option to set balancer (kata-containers#1697)
6610f9a Fix test: Data race while resetting global var. (kata-containers#1748)
f4b5237 status: add Code convenience function (kata-containers#1754)
47bddd7 vet: run golint on _string files (kata-containers#1749)
45088c2 examples: fix concurrent map accesses in route_guide server (kata-containers#1752)
4e393e0 grpc: fix deprecation comments to conform to standard (kata-containers#1691)
0b24825 Adjust keepalive paramenters in the test such that scheduling delays don't cause false failures too often. (kata-containers#1730)
f9390a7 fix typo (kata-containers#1746)
6ef45d3 fix stats flaky test (kata-containers#1740)
98b17f2 relocate check for shutdown in ac.tearDown() (kata-containers#1723)
5ff10c3 fix flaky TestPickfirstOneAddressRemoval (kata-containers#1731)
2625f03 bufconn: allow readers to receive data after writers close (kata-containers#1739)
b0e0950 After sending second goaway close conn if idle. (kata-containers#1736)
b8cf13e Make sure all goroutines have ended before restoring global vars. (kata-containers#1732)
4742c42 client: fix race between server response and stream context cancellation (kata-containers#1729)
8fba5fc In gracefull stop close server transport only after flushing status of the last stream. (kata-containers#1734)
d1fc8fa Deflake tests that rely on Stop() then Dial() not reconnecting (kata-containers#1728)
dba60db Switch balancer to grpclb when at least one address is grpclb address (kata-containers#1692)
ca1b23b Update CONTRIBUTING.md to CNCF CLA
2941ee1 codes: Add UnmarshalJSON support to Code type (kata-containers#1720)
ec61302 naming: Fix build constraints for go1.6 and go1.7 (kata-containers#1718)
b8191e5 remove stringer and go generate (kata-containers#1715)
ff1be3f Add WithResolverUserOptions for custom resolver build options (kata-containers#1711)
580defa Fix grpc basics link in route_guide example (kata-containers#1713)
b7dc71e Optimize codes.String() method using a switch instead of a slice of indexes (kata-containers#1712)
1fc873d Disable ccBalancerWrapper when it is closed (kata-containers#1698)
bf35f1b Refactor roundrobin to support custom picker (kata-containers#1707)
4308342 Change parseTimeout to not handle non-second durations (kata-containers#1706)
be07790 make load balancing policy name string case-insensitive (kata-containers#1708)
cd563b8 protoCodec: avoid buffer allocations if proto.Marshaler/Unmarshaler (kata-containers#1689)
61c6740 Add comments to ClientConn/SubConn interfaces to indicate new methods may be added (kata-containers#1680)
ddbb27e client: backoff before reconnecting if an HTTP2 server preface was not received (kata-containers#1648)
a4bf341 use the request context with net/http handler (kata-containers#1696)
c6b4608 transport: fix race sending RPC status that could lead to a panic (kata-containers#1687)
00383af Fix misleading default resolver scheme comments (kata-containers#1703)
a62701e Eliminate data race in ccBalancerWrapper (kata-containers#1688)
1e1a47f Re-resolve target when one connection becomes TransientFailure (kata-containers#1679)
2ef021f New grpclb implementation (kata-containers#1558)
10873b3 Fix panics on balancer and resolver updates (kata-containers#1684)
646f701 Change version to 1.9.0-dev (kata-containers#1682)

Fixes: kata-containers#307

Signed-off-by: Peng Tao <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.