-
Notifications
You must be signed in to change notification settings - Fork 374
qemu: refactor maximum vcpus supported in aarch64 #585
Conversation
Build failed (third-party-check pipeline) integration testing with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Only a few comments.
And also please extend the unit testing for the function MaxQemuVCPUs()
.
virtcontainers/qemu_arm64.go
Outdated
// MaxQemuVCPUs returns the maximum number of vCPUs supported | ||
func MaxQemuVCPUs() uint32 { | ||
bytes, err := ioutil.ReadFile(interruptFile) | ||
if err != nil { | ||
qemuArmLogger().WithError(err).Error("Failed to read /proc/interrrupts") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/interrrupts/interrupts/ You've added one extra r
;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the variable interruptFile here instead?
virtcontainers/qemu_arm64.go
Outdated
qemuArmLogger().WithError(err).Error("Failed to read /proc/interrrupts") | ||
} | ||
for gicType, vCPUs := range gicList { | ||
pattern := regexp.MustCompile(`\b` + gicType + `\b`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex seems a bit too much here, maybe something like:
if strings.Contains(string(bytes), gicType) {
return vCPUs
}
might be simpler and cost less to process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, that's better, updated asap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when I added support for arm64 I was unable to start a VM with maxcpus
> actual number of physical CPUs
. I think it's a limitation of KVM, having said that, I believe that this PR will fail on systems with gicV3 and actual number of physical CPUs
< 123. @Pennyzct let me know when this PR is ready, I'd like to take a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devimc I have updated the PR. ptal.🙂
@Pennyzct Looks good. Please add a unit test as well. You can override |
PSS Measurement: Memory inside container: |
Build failed (third-party-check pipeline) integration testing with
|
Codecov Report
@@ Coverage Diff @@
## master #585 +/- ##
=======================================
Coverage 65.34% 65.34%
=======================================
Files 85 85
Lines 9846 9846
=======================================
Hits 6434 6434
Misses 2756 2756
Partials 656 656 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just one comment about the source of the GIC version vs. vcpu number mapping.
virtcontainers/qemu_arm64.go
Outdated
//on aarch64, we support different gic interrupt controllers | ||
//maximum number of vCPUs depends on the GIC version, or on how | ||
//many redistributors we can fit into the memory map. | ||
var gicList = map[string]uint32{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are plain magic mapping. Is there any documentation that we can refer to in the public domain? If so, please add it to the above comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.. That is really dark magic. Please refer to these places so that we know where to look at if they need change in future.
And qemu doesn't seem to have gic version 4 ( virt_get_gic_version ). Is it added to be future proof?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
virtcontainers/qemu_arm64.go
Outdated
} | ||
for gicType, vCPUs := range gicList { | ||
if strings.Contains(string(bytes), gicType) { | ||
return vCPUs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this PR in a system with GICv3 and 96 cores. This function returns 123
, unfortunately it won't be honoured since the maximum number of vCPUs can't be exceeded
Lines 215 to 232 in 14bcd69
func (h hypervisor) defaultMaxVCPUs() uint32 { | |
numcpus := uint32(goruntime.NumCPU()) | |
maxvcpus := vc.MaxQemuVCPUs() | |
reqVCPUs := h.DefaultMaxVCPUs | |
//don't exceed the number of physical CPUs. If a default is not provided, use the | |
// numbers of physical CPUs | |
if reqVCPUs >= numcpus || reqVCPUs == 0 { | |
reqVCPUs = numcpus | |
} | |
// Don't exceed the maximum number of vCPUs supported by hypervisor | |
if reqVCPUs > maxvcpus { | |
return maxvcpus | |
} | |
return reqVCPUs | |
} |
My concern with this patch is that the actual number of physical cores will be exceeded and the memory footprint will be big (again) see 07db945
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Pennyzct have you tested this change in a system with GIC >= 3 and physical CPUs > 123 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi~ @devimc That's how I found the problem and pulled this pr😊. I was installing and running kata in new ThunderX II which contains 224 physical cpu cores.
:~# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 224
On-line CPU(s) list: 0-223
Thread(s) per core: 4
Core(s) per socket: 28
Socket(s): 2
NUMA node(s): 2
NUMA node0 CPU(s): 0-111
NUMA node1 CPU(s): 112-223
Err occurred and outputs were as follows:
docker: Error response from daemon: OCI runtime create failed: qemu-system-aarch64: Number of SMP CPUs requested (224) exceeds max CPUs supported by machine 'mach-virt' (123): unknown
when applying my patch, vCPUs will be reduced into 123
in func defaultMaxVCPUs
as you mentioned above and kata will run successfully.
Based on upstream discussion, wei @Weichen81 and I issued #614 and #615 to discuss the missing gicv4 scenario and max vCPU varying according to qemu version. This pr may need them landed firstly. 😊 |
PSS Measurement: Memory inside container: |
Build failed (third-party-check pipeline) integration testing with
|
virtcontainers/qemu_arm64.go
Outdated
return uint32(runtime.NumCPU()) | ||
if hostGICVersion != 0 { | ||
return gicList[hostGICVersion] | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this else is not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update asap😃
virtcontainers/qemu_arm64_test.go
Outdated
|
||
assert.Equal(d.expectedResult, vCPUs) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: extra blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update asap😊.
PSS Measurement: Memory inside container: |
Build failed (third-party-check pipeline) integration testing with
|
updated. ptal. @devimc @jodh-intel 😊 |
Nasty 16.04 CI failure which seems to be hotplug-related. Using a Clear Linux image (kernel
Note: There's a bit of corruption on the
/cc @devimc, @grahamwhaley. |
lgtm |
@jodh-intel about the kernel trace. Considering that happen when CPU hotplug is happening and taking a look to the code. Seems that fits in the first case that is documented, so I think that is expected(?). @mcastelino @devimc @grahamwhaley * We should be running this queue from one of the CPUs that
* are mapped to it.
*
* There are at least two related races now between setting
* hctx->next_cpu from blk_mq_hctx_next_cpu() and running
* __blk_mq_run_hw_queue():
*
* - hctx->next_cpu is found offline in blk_mq_hctx_next_cpu(),
* but later it becomes online, then this warning is harmless
* at all
* |
@jcvenegas - good find! It seems like the kernel is dumping debug info to help the devs debug the races referred to there but atleast it's not a "real" crash :) |
Build succeeded (third-party-check pipeline).
|
on aarch64, we support different gic interrupt controllers. The maximum number of vCPUs depends on the GIC version, or on how many redistributors we can fit into the memory map. Fixes: kata-containers#584 Signed-off-by: Penny Zheng <[email protected]> Signed-off-by: Wei Chen <[email protected]>
we should add unit test for func MaxQemuVCPUS in qemu_amd64_test.go Signed-off-by: Penny Zheng <[email protected]> Signed-off-by: Wei Chen <[email protected]>
just re-based it on the latest master branch. 😊. @jodh-intel |
PSS Measurement: Memory inside container: |
Build failed (third-party-check pipeline) integration testing with
|
@devimc could you give it a last review to this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
agent: sandbox_pause should not take arguments
on aarch64, we support different gic interrupt controllers.
The maximum number of vCPUs depends on the GIC version, or on how many redistributors we can fit into the memory map.
Fixes: #584
Signed-off-by: Penny Zheng [email protected]
Signed-off-by: Wei Chen [email protected]