Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

virtcontainers: Set ppc64le maxmem depending on qemu version #417

Merged
merged 1 commit into from
Jul 9, 2018
Merged

Conversation

nitkon
Copy link
Contributor

@nitkon nitkon commented Jun 19, 2018

The "Failed to allocate HTAB of requested size,
try with smaller maxmem" error in ppc64le occurs
when maxmem allocated is very high. This got fixed
in qemu 2.10 and kernel 4.11. Hence put a maxmem
restriction of 32GB per kata-container if qemu
version less than 2.10

Fixes: #415

Signed-off-by: Nitesh Konkar [email protected]

@@ -64,6 +64,9 @@ const qmpCapErrMsg = "Failed to negoatiate QMP capabilities"

const defaultConsole = "console.sock"

var qemuMajorVersion int
var qemuMinorVersion int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we want to store these in one of the types, even though they are only being used for PPC64le atm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jodh-intel : Should we? I am not sure if its gonna get consumed by any other architectures atm.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would definitely include those as part of the qemu structure since this is an info about the hypervisor. And I agree with @jodh-intel that even if it's only used by ppc64, this is not a reason to define some global variable there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sboeuf : Okay, I shall add qemuMajorVersion and qemuMinorVersion into the qemu structure. However, I am unsure of what would be the most efficient way of exposing it in func (q *qemuPPC64le) memoryTopology. One way is to get qemu version info in func (q *qemu) init(sandbox *Sandbox) error and pass via newQemuArch resulting in change of signature of newQemuArch everywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit fiddly, but we could add them to qemuArchBase like this:

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nitkon oh yeah my bad, I forgot qemu structure was not the interface related to qemu_ppc64le. As @jodh-intel suggested, qemuArchBase should be the one.

@jodh-intel, I have taken a look at your suggestion but I thought we could simplify this by simply having those fields being called from qemu_ppc64le structure (no need for getters and setters). Isn't it possible since this structure inherits from qemuArchBase ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what you're saying tbh. I'm hoping there is a simpler way though as I'm not particularly happy with what I came up with :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jodh-intel : So I was going through the implementation but noticed the following.
Printing q.qemuMajorVersion, q.qemuMinorVersion in "func (q qemuArchBase) getQEMUVersion() (major, minor int) {" shows me the following: level=info msg="DEBUG: versions: major: 0, minor: 0" arch=amd64 name=kata-runtime pid=6215 .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jodh-intel : I am a bit confused by the reference implementation. We are setting qemuMajorVersion and qemuMinorVersion using one qemu instance and getting it with another qemu instance. (resulting it in 0,0 values).

IIUC, we need to set/get using the same qemu instance

if hostMemoryMb > defaultMemMaxPPC64le {
hostMemoryMb = defaultMemMaxPPC64le
} else {
if qemuMajorVersion >= 2 && qemuMinorVersion >= 10 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider adding a couple of logging calls here, one for each scenario so it's clear what's happening from the log.

What we cannot reliably check the guest kernel version so maybe your debug entry for handling qemu 2.10+ should mention the 4.11 guest kernel assumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jodh-intel : Done. Updated my PR.

@katacontainersbot
Copy link
Contributor

PSS Measurement:
Qemu: 142299 KB
Proxy: 4700 KB
Shim: 9131 KB

Memory inside container:
Total Memory: 2045972 KB
Free Memory: 2007432 KB

The "Failed to allocate HTAB of requested size,
try with smaller maxmem" error in ppc64le occurs
when maxmem allocated is very high. This got fixed
in qemu 2.10 and kernel 4.11. Hence put a maxmem
restriction of 32GB per kata-container if qemu
version less than 2.10

Fixes: #415

Signed-off-by: Nitesh Konkar <[email protected]>
@codecov
Copy link

codecov bot commented Jun 19, 2018

Codecov Report

Merging #417 into master will decrease coverage by 0.03%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #417      +/-   ##
==========================================
- Coverage   63.79%   63.76%   -0.04%     
==========================================
  Files          87       87              
  Lines        8811     8825      +14     
==========================================
+ Hits         5621     5627       +6     
- Misses       2587     2594       +7     
- Partials      603      604       +1
Impacted Files Coverage Δ
virtcontainers/qemu.go 19.27% <0%> (-0.08%) ⬇️
virtcontainers/sandbox.go 66.83% <0%> (-0.23%) ⬇️
virtcontainers/container.go 46.59% <0%> (-0.1%) ⬇️
cli/run.go 88.23% <0%> (ø) ⬆️
cli/create.go 79.39% <0%> (ø) ⬆️
virtcontainers/kata_agent.go 30.32% <0%> (+0.07%) ⬆️
virtcontainers/hyperstart_agent.go 58.76% <0%> (+0.18%) ⬆️
virtcontainers/noop_agent.go 92% <0%> (+0.33%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f409b9...d0bccab. Read the comment docs.

@nitkon
Copy link
Contributor Author

nitkon commented Jun 20, 2018

@jodh-intel : I see the following error in the failed CI.

ERROR: Invalid URL 'http://download.opensuse.org/repositories/home:clearlinux:preview:clear-containers-2.1/Fedora_25/home:clearlinux:preview:clear-containers-2.1.repo' found in the following files:
./virtcontainers/hack/virtc/README.md
ERROR: Invalid URL 'http://download.opensuse.org/repositories/home:/clearlinux:/preview:/clear-containers-2.1/xUbuntu_16.10/' found in the following files:
./virtcontainers/hack/virtc/README.md

Should the CI be restarted?

@jodh-intel
Copy link
Contributor

Jobs restarted.

@katacontainersbot
Copy link
Contributor

PSS Measurement:
Qemu: 142542 KB
Proxy: 4699 KB
Shim: 8820 KB

Memory inside container:
Total Memory: 2045972 KB
Free Memory: 2007580 KB

@nitkon
Copy link
Contributor Author

nitkon commented Jun 20, 2018

@jodh-intel : All required CI are green

@jodh-intel
Copy link
Contributor

Yes, but we have 1 failure.

You'll also notice that you still need 1 more approval :)

Copy link
Contributor

@grahamwhaley grahamwhaley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice commit!
lgtm
Let me go check that failing CI...

@grahamwhaley
Copy link
Contributor

On the F27 CI:

# time="2018-06-20 10:30:42.502996695Z" level=debug msg="event: "/var/run/crio/exits/975e6d0a7f676053db9d0bf52cbefb7b6487a0a5161366ad36270e5b8ae730b6": CREATE" 
# time="2018-06-20 10:30:42.503002488Z" level=debug msg="container or sandbox exited: 975e6d0a7f676053db9d0bf52cbefb7b6487a0a5161366ad36270e5b8ae730b6" 
# time="2018-06-20 10:30:42.503009278Z" level=debug msg="container exited and found: 975e6d0a7f676053db9d0bf52cbefb7b6487a0a5161366ad36270e5b8ae730b6" 
# rm: cannot remove '/tmp/tmp.BGiXFsa5eZ/crio-run/vfs-containers/5c06d787b94818845e6986598960770958b1c21febf5cf5f69bbf288138e41d3/userdata/shm': Device or resource busy
Build timed out (after 5 minutes). Marking the build as aborted.
Build was aborted

Known and wholly unrelated @jodh-intel @chavafg ? If so, then we can merge this I think (the button is not green, but is 'pressable' ;-) )

@jodh-intel
Copy link
Contributor

/cc @sboeuf

Copy link

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address the comment about including major/minor inside the qemu struct, but otherwise this PR looks good.

@sboeuf sboeuf merged commit cd842af into kata-containers:master Jul 9, 2018
@sboeuf sboeuf added the bug Incorrect behaviour label Sep 12, 2018
@sboeuf
Copy link

sboeuf commented Sep 12, 2018

@nitkon please backport this PR to stable branches if not present.

@sboeuf
Copy link

sboeuf commented Sep 12, 2018

Nevermind, it's already part of the stable branches.

gabibeyer pushed a commit to gabibeyer/runtime that referenced this pull request Apr 1, 2019
update architecture.md link, since it has moved to within the
design/ directory.

contributes to: kata-containers#417

Signed-off-by: Gabi Beyer <[email protected]>
zklei pushed a commit to zklei/runtime that referenced this pull request Jun 13, 2019
Re-vendor the testify package to obtain the newest assert test
functions.

Shortlog:

    87b1dfb Remove returns info from assertion docs
    bf45a85 Skip vendored packages from tests
    2c9035a Changed mock assertions text output to be non-unicode friendly
    05aa1d4 Remove outdated info from README
    b91bfb9 Move from Godep to dep for dependency management
    9ede17e Drop old Go versions from tests
    8de2544 Fix HTTP assertions to be consistent with the rest
    b57b6d1 Add FileExists and DirExists assertions
    b3596e9 Fxinng After(time.Duration) to wait properly
    cbf22d8 Support nil arguments to Argument Matchers
    42baa3e Nil check in Implements assertion
    b3dfaa9 fail: add test name for logged output
    6e494f9 put closestCall.Arguments to the expected side of the diff
    51464da Consider empty/nil arrays as matching elements
    6f306a6 Reuse aLen and bLen variables
    76de30e Actually fail tests
    ae87ba6 Ran go generate
    8bd27dd Compress some newlines
    bf57a5d ElementsMatch array/slice assertion ignoring order
    0c49dd9 Replace is with in
    9fb9de1 Make NotSubset actually fail the test on nil subset
    8ccf48a Allow a method call to be optional
    aa8279e travis: add go1.9
    c0f1d44 indent actual value for better comparison with expected value
    249123e Implement delta comparison for map values
    88a414d generalize Empty assertion
    2aa2c17 Fix unprotected call fields access in MethodCalled()
    890a5c3 Issue kata-containers#469 fix
    05e8a0e Fix the actuality of InEpsilon
    2f1cd6b time.Duraions are numbers too
    b1f9894 Fix InDelta expected nan check message
    4b92304 Fix actual float conversion error message under calcRelativeError
    f6abca5 Added assert.PanicsWithValue + tests
    e964b17 add MethodCalled to the mock
    46b3c82 Simple validation of Equal/NotEqual assertion arguments
    cc18973 Fix gogenerate travis check
    e179a18 Travis check go generate has been run
    3458981 Run go vet in travis
    c7668ea Fixes kata-containers#339 - Add `assertion`f assertions like Errorf and Equalf
    c33f336 Fix vet warnings and go generate to update docs
    3104bf5 Use Go 1.7 subtests so suites can properly nest
    bd79c01 Fix race condition on mock package's Called
    edd8e19 Run go generate to syn generated assertions
    78be756 Fixed HTTP assertions messages formatting and removed wrong test message
    3a59a58 add Subset and NotSubset assertions
    dd57c7b Check code is formatted in travis
    f712be9 Revert "add mock.MethodCalled(methodName, args...)"
    34687eb Revert "diffArguments: remove unnecessary range-for (kata-containers#417)"
    2b76a97 Revert "Added goconvey style assertions to the mock package"
    b6296e3 fix(docs): correct Error usage examples
    d2f3716 Add msgAndArgs pass forward to InDelta from InDeltaSlice
    158f9d0 Check that there is a directory before trying to access it.
    18cfa68 Added extra unit test for function with mixed variadic arguments.
    b1f1bcb fix typo
    09f61d7 assert: fix error reporting when error contains escape sequences
    5c861cc diffArguments: remove unnecessary range-for (kata-containers#417)
    bc11a6e Tighten language by increasing overall consistency in wording in texts and argument names: use 'actual' instead of 'received'
    115ab90 Provide argument name `args` in function signature
    faf0710 Added goconvey style assertions to the mock package
    97c0e43 compare bytes with bytes.Equal instead of reflect.DeepEqual
    17a0bd5 add mock.MethodCalled(methodName, args...)
    9afdd65 Check number of provided arguments vs mocked
    287336f travis: rm broken go releases
    32d79c5 travis: check if these platforms are broken
    5c9da49 HTTP code status assertions now fail tests
    332ae0e Add Equal test comparing nil with non-nil
    434d5c1 Fixed minor typo
    3928f57 Add comments for Equal and NotEqual to clarify pointer comparison
    13b9dd4 Fix typos in comments in _codegen/main.go
    6835870 Remove isNumericType check
    cbd71e7 When diffing with spew, use a format that doesn't include pointer addresses (which generate false negatives). This updates go-spew to 04cdfd42973bb9c8589fd6a731800cf222fde1a9.
    ddb91ee Ensure that assert.Fail properly align its output
    5e72f93 Remove timestamp from callback
    976c720 Format generated code
    4b9bfb8 run go/format on generated code
    4ccf54a Clearer messages
    bf7a93e Add timestamp
    fcedfe2 Add callbacks to run before and after each test
    8879a01 Unlock and relock not needed anymore. Addresses kata-containers#167.
    4a6e516 Added test to avoid regresions of kata-containers#167.
    547cd8e Release lock before .WaitUntil, as it may cause a deadlock when testing parallel objects. Address kata-containers#167: Unable to test concurrent objects

Signed-off-by: James O. D. Hunt <[email protected]>
zklei pushed a commit to zklei/runtime that referenced this pull request Jun 13, 2019
Add container to sandbox list until it is created.

This commit makes sure to add the container to the sandbox until it is
created.

This helps to avoid other go routines uses the container structure with
nil pointers (because is not fully created).

Fixes: kata-containers#417

Signed-off-by: Jose Carlos Venegas Munoz <[email protected]>
lifupan pushed a commit to lifupan/kata-runtime that referenced this pull request Aug 5, 2020
…nerd-version-2.x

docs: Clarifying minimum version of containerd for annotations
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Incorrect behaviour
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ppc64le: Restrict maxmem depending on the Qemu version
6 participants