-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up PCI path handling #1190
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
dgibson
added
wip
Work in Progress (PR incomplete - needs more work or rework)
no-backport-needed
forward-port
Change from an older branch / repository
labels
Dec 11, 2020
/test-vfio |
dgibson
force-pushed
the
pcipath
branch
2 times, most recently
from
December 17, 2020 04:44
254580d
to
771a3ea
Compare
/test |
jodh-intel
reviewed
Jan 6, 2021
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dgibson! A few comments...
/test |
/test |
/retest |
1 similar comment
/retest |
dgibson
force-pushed
the
pcipath
branch
2 times, most recently
from
January 15, 2021 01:14
8e208dd
to
fa5af6e
Compare
/test |
dgibson
force-pushed
the
pcipath
branch
3 times, most recently
from
January 20, 2021 04:46
3ca2b1b
to
7f06293
Compare
/test |
/retest |
/test |
jodh-intel
approved these changes
Jan 21, 2021
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dgibson.
get_pci_device_address() has pretty confusing semantics. Both its input and output are in other parts of the code described as a "PCI address", but neither is *actually* a PCI address (in the standard DDDD:BB:DD.F format). What it's really about is resolving a "PCI path" - that is way to locate a PCI device by using it's slot number and the slot number of the bridge leading to it - into a sysfs path. Rename the function, and change a bunch of variable names to make those semantics clearer. Forward port of kata-containers/agent@0eb612f06484 Signed-off-by: David Gibson <[email protected]>
Currently pcipath_to_sysfs(), which translates PCI paths into sysfs paths accepts only pci paths with exactly 2 components; which represents PCI devices separated from the root bus by exactly one PCI to PCI bridge (which could be a virtual P2P bridge, such as a PCI-E root port). There are cases we might reasonably want to support which have devices either plugged directly into the root bus (zero bridges), or under multiple layers of P2P bridge (a PCI-E switch would require at least 2 layers). So, generalize pcipath_to_sysfs to support any number of components in the PCI path. We also make it use the new type for PCI paths internally rather than plain strings. This is a loose forward port of kata-containers/agent@9804b1e fixes kata-containers#1040 Signed-off-by: David Gibson <[email protected]>
pcipath_to_sysfs takes a PCI path, with a particular format. A number of places implicitly need strings in that format, many of them repeat the description. To make things safer and briefer use the pci::Path type for the purpose more widely, and just describe the string formatting of it at the type definition. Then, update variable names and comments throughout to call things in this format "PCI path", rather than "PCI identifier", which is vague, or "PCI address" which is just plain wrong. Likewise we change names and comments which incorrectly refer to sysfs paths as a "PCI address". This changes the grpc proto definitions, but because it's just changing the name of a field without changing the field number, it shouldn't change the actual protocol. A loose forward port of kata-containers/agent@da4bc1d Signed-off-by: David Gibson <[email protected]>
Currently pcipath_to_sysfs() generates the path to the root bus node in sysfs via create_pci_root_bus_path(). This is inconvenient for testing, though, so instead make it take this as a parameter and generate the path in the (single) caller. As a bonus this will make life a bit easier when we want to support machines with multiple PCI roots. Signed-off-by: David Gibson <[email protected]>
Port this test from the Kata 1 Go agent to the Kata 2 Rust agent. Signed-off-by: David Gibson <[email protected]>
The "PCI address" returned by Endpoint::PciPath() isn't actually a PCI address (DDDD:BB:DD.F), but rather a PCI path. Rename and use the PciPath type to clean this up and the various parts of the network code connected to it. Forward port of kata-containers/runtime@3e58971 fixes kata-containers#1040 Signed-off-by: David Gibson <[email protected]>
BlockDrive::PCIAddr doesn't actually store a PCI address (DDDD:BB:DD.F) but a PCI path. Use the PciPath type and rename things to make that clearer. TestHandleBlockVolume() previously used a bizarre value "0002:01" for the "PCI address" which was neither an actual PCI address, nor a PCI path. Update it to use a PCI path - the actual value appears not to matter in this test, as long as its consistent throughout. Forward port of kata-containers/runtime@64751f3 fixes kata-containers#1040 Signed-off-by: David Gibson <[email protected]>
VhostUserDeviceAttrs::PCIAddr didn't actually store a PCI address (DDDD:BB:DD.F), but rather a PCI path. Use the PciPath type and rename things to make that clearer. TestHandleBlockVolume previously used the bizarre value "0001:01" which is neither a PCI address nor a PCI path for this value. Change it to a valid PCI path - it appears the actual value didn't matter for that test, as long as it was consistent. Forward port of kata-containers/runtime@3596058 fixes kata-containers#1040 Signed-off-by: David Gibson <[email protected]>
/test |
/test-vfio |
dgibson
added a commit
to dgibson/kata-containers
that referenced
this pull request
Mar 12, 2021
Currently runtime and agent special case virtio-blk devices under clh, ostensibly because the PCI address information is not available in that case. In fact, cloud-hypervisor's VmAddDiskPut API does return a PciDeviceInfo, which includes a PCI address. That API is broken, because PCI addressing depends on guest (firmware or OS) actions that the hypervisor won't know about. clh only gets away with this because it only uses a single PCI root and never uses PCI bridges, in which case the guest addresses are accurately predictable: they always have domain and bus zero. Until kata-containers#1190, Kata couldn't handle PCI addressing unless there was exactly one bridge, which might be why this was actually special-cased for clh. With kata-containers#1190 merged, we can handle more general PCI paths, and we can derive a trivial (one element) PCI path from the information that the clh API gives us. We can use that to remove this special case. fixes kata-containers#1431 Signed-off-by: David Gibson <[email protected]>
dgibson
added a commit
to dgibson/kata-containers
that referenced
this pull request
Mar 24, 2021
Currently runtime and agent special case virtio-blk devices under clh, ostensibly because the PCI address information is not available in that case. In fact, cloud-hypervisor's VmAddDiskPut API does return a PciDeviceInfo, which includes a PCI address. That API is broken, because PCI addressing depends on guest (firmware or OS) actions that the hypervisor won't know about. clh only gets away with this because it only uses a single PCI root and never uses PCI bridges, in which case the guest addresses are accurately predictable: they always have domain and bus zero. Until kata-containers#1190, Kata couldn't handle PCI addressing unless there was exactly one bridge, which might be why this was actually special-cased for clh. With kata-containers#1190 merged, we can handle more general PCI paths, and we can derive a trivial (one element) PCI path from the information that the clh API gives us. We can use that to remove this special case. fixes kata-containers#1431 Signed-off-by: David Gibson <[email protected]>
dgibson
added a commit
to dgibson/kata-containers
that referenced
this pull request
Apr 9, 2021
Currently runtime and agent special case virtio-blk devices under clh, ostensibly because the PCI address information is not available in that case. In fact, cloud-hypervisor's VmAddDiskPut API does return a PciDeviceInfo, which includes a PCI address. That API is broken, because PCI addressing depends on guest (firmware or OS) actions that the hypervisor won't know about. clh only gets away with this because it only uses a single PCI root and never uses PCI bridges, in which case the guest addresses are accurately predictable: they always have domain and bus zero. Until kata-containers#1190, Kata couldn't handle PCI addressing unless there was exactly one bridge, which might be why this was actually special-cased for clh. With kata-containers#1190 merged, we can handle more general PCI paths, and we can derive a trivial (one element) PCI path from the information that the clh API gives us. We can use that to remove this special case. fixes kata-containers#1431 Signed-off-by: David Gibson <[email protected]>
dgibson
added a commit
to dgibson/kata-containers
that referenced
this pull request
Apr 13, 2021
Currently runtime and agent special case virtio-blk devices under clh, ostensibly because the PCI address information is not available in that case. In fact, cloud-hypervisor's VmAddDiskPut API does return a PciDeviceInfo, which includes a PCI address. That API is broken, because PCI addressing depends on guest (firmware or OS) actions that the hypervisor won't know about. clh only gets away with this because it only uses a single PCI root and never uses PCI bridges, in which case the guest addresses are accurately predictable: they always have domain and bus zero. Until kata-containers#1190, Kata couldn't handle PCI addressing unless there was exactly one bridge, which might be why this was actually special-cased for clh. With kata-containers#1190 merged, we can handle more general PCI paths, and we can derive a trivial (one element) PCI path from the information that the clh API gives us. We can use that to remove this special case. fixes kata-containers#1431 Signed-off-by: David Gibson <[email protected]>
dgibson
added a commit
to dgibson/kata-containers
that referenced
this pull request
Apr 13, 2021
Currently runtime and agent special case virtio-blk devices under clh, ostensibly because the PCI address information is not available in that case. In fact, cloud-hypervisor's VmAddDiskPut API does return a PciDeviceInfo, which includes a PCI address. That API is broken, because PCI addressing depends on guest (firmware or OS) actions that the hypervisor won't know about. clh only gets away with this because it only uses a single PCI root and never uses PCI bridges, in which case the guest addresses are accurately predictable: they always have domain and bus zero. Until kata-containers#1190, Kata couldn't handle PCI addressing unless there was exactly one bridge, which might be why this was actually special-cased for clh. With kata-containers#1190 merged, we can handle more general PCI paths, and we can derive a trivial (one element) PCI path from the information that the clh API gives us. We can use that to remove this special case. fixes kata-containers#1431 Signed-off-by: David Gibson <[email protected]>
Jakob-Naucke
added a commit
to Jakob-Naucke/kata-containers
that referenced
this pull request
Mar 8, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and information about the enabled APQNs is included. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
Jakob-Naucke
added a commit
to Jakob-Naucke/kata-containers
that referenced
this pull request
Mar 11, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and information about the enabled APQNs is included. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
Jakob-Naucke
added a commit
to Jakob-Naucke/kata-containers
that referenced
this pull request
Mar 17, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and information about the enabled APQNs is included. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
Jakob-Naucke
added a commit
to Jakob-Naucke/kata-containers
that referenced
this pull request
Mar 24, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
Jakob-Naucke
added a commit
to Jakob-Naucke/kata-containers
that referenced
this pull request
Mar 28, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
Jakob-Naucke
added a commit
to Jakob-Naucke/kata-containers
that referenced
this pull request
Mar 29, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
BbolroC
pushed a commit
to BbolroC/kata-containers
that referenced
this pull request
Jul 4, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
BbolroC
pushed a commit
to BbolroC/kata-containers
that referenced
this pull request
Jul 4, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
BbolroC
pushed a commit
to BbolroC/kata-containers
that referenced
this pull request
Oct 20, 2022
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
BbolroC
pushed a commit
to BbolroC/kata-containers
that referenced
this pull request
Feb 24, 2023
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
BbolroC
pushed a commit
to BbolroC/kata-containers
that referenced
this pull request
Mar 16, 2023
Initial VFIO-AP support (kata-containers#578) was simple, but somewhat hacky; a different code path would be chosen for performing the hotplug, and agent-side device handling was bound to knowing the assigned queue numbers (APQNs) through some other means; plus the code for awaiting them was written for the Go agent and never released. This code also artificially increased the hotplug timeout to wait for the (relatively expensive, thus limited to 5 seconds at the quickest) AP rescan, which is impractical for e.g. common k8s timeouts. Since then, the general handling logic was improved (kata-containers#1190), but it assumed PCI in several places. In the runtime, introduce and parse AP devices. Annotate them as such when passing to the agent, and include information about the associated APQNs. The agent awaits the passed APQNs through uevents and triggers a rescan directly. Fixes: kata-containers#3678 Signed-off-by: Jakob Naucke <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Consistent and safer handling, forward port of kata-containers/agent#855 and kata-containers/runtime#3003.