Skip to content

Commit

Permalink
Update based on discussion
Browse files Browse the repository at this point in the history
Signed-off-by: James Sturtevant <[email protected]>
  • Loading branch information
jsturtevant committed Jul 27, 2022
1 parent 55e3838 commit 1fa4600
Showing 1 changed file with 26 additions and 38 deletions.
64 changes: 26 additions & 38 deletions keps/sig-node/2371-cri-pod-container-stats/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
- [cAdvisor Metrics Endpoint](#cadvisor-metrics-endpoint)
- [CRI implementations](#cri-implementations)
- [cAdvisor](#cadvisor-1)
- [Windows](#windows)
- [Test Plan](#test-plan)
- [Prerequisite testing updates](#prerequisite-testing-updates)
- [Unit tests](#unit-tests)
Expand Down Expand Up @@ -274,23 +275,24 @@ Thus, this KEP largely the plan described [here](#plan), with some changes:
1. For the newly introduced CRI API fields, should there be Windows and Linux specific fields?
For the alpha implementation, [PR #102789](https://github.com/kubernetes/kubernetes/pull/102789) added a Linux specific field `LinuxPodSandboxStats`. It also added Windows Specific field called `WindowsPodSandboxStats` but left it blank.

Since this field has been added it is proposed to use the opportunity to create structs that are Specific to Windows. The motivation
behind this is Windows has differences in Memory stats that are specific to its OS and doesn't (in some cases cannot) fill fields such as `rss_bytes`.
The proposal for Widnows support is to add a field `PodSandboxStats` which will be used by both Linux and Windows. Windows and
Linux would both use this field filling in stats that make most sense for them. Windows is currently missing several stats but can reasonably fill in many of the missing fields (see the [Windows](#windows) section below). This would be similar approach that is use today for the current CRI `ListContainerStats` call and this would make logic in kubelet more straight forward.

A challenge with new `WindowsPodSandboxStats` which custom fields is that the current calls to CRI endpoint `ListContainerStats`
use the exiting `ContainerStats` object which is not Windows/Linux specific. In the case of the CRI call `ListPodSandboxStats`
it would currently return existing fields. There would be a miss match between the structs returned between these calls unless we update. Making changes to `ContainerStats` in backwards compatible way may be difficult.
The field `LinuxPodSandboxStats` would stay the same (with duplicated fields) for backwards compatibility. We will also leave `WindowsPodSandboxStats` with a comment that it is not currently used. This option allows us to add specific fields in the future, Linux and Windows sub fields could be added to `LinuxPodSandboxStats` and `WindowsPodSandboxStats` if there are use cases for specific fields.

The downside to this approach is that we have two fields (`LinuxPodSandboxStats` and `PodSandboxStats`) which are initially identical. We could mark the individual fields in the `LinuxPodSandboxStats` as depreciated and allow only new linux only fields be added to `LinuxPodSandboxStats`.

Alternatives include:

- Rename the field `LinuxPodSandboxStats` to `PodSandboxStats`. Windows and
Linux would both use this field filling in stats that make most sense for them. This would be similar
approach that is use today for the current CRI `ListContainerStats` call. This would remove the issue above of
miss-match between the `LinuxPodSandboxStats` and `ListContainerStats` and make logic in kubelet more straight forward.
- Leave the two fields `LinuxPodSandboxStats` and `WindowsPodSandboxStats` but use same fields. This might be the easiest but seem counter-intuitive and still adds complexity to the Kubelet implementation.

- Create stats structs that are specific for Windows. The motivation behind this is Windows has differences in Memory stats that are specific to its OS and
doesn't fill fields (in some cases cannot such as `rss_bytes`). A challenge with new `WindowsPodSandboxStats` with custom fields is that the current calls to
CRI endpoint `ListContainerStats` use the exiting `ContainerStats` object which is not Windows/Linux specific. In the case of the CRI call `ListPodSandboxStats`it would currently return existing fields. There would be a miss match between the structs returned between these calls unless we update.
Making changes to `ContainerStats` in backwards compatible way may be difficult. We also don't have solid use cases for the consumption of Windows specific
stats today, which makes the complexity the option introduces difficult to justify.
- Leave the two fields `LinuxPodSandboxStats` and `WindowsPodSandboxStats` but use same fields. This might be the easiest but is counter-intuitive and still adds complexity to the Kubelet implementation. We also end up with Linux specific fields in the `WindowsPodSandboxStats`.
- Create Windows specific structs for every metric (CPU, Network, Filesystem Usage).
This would allow for most flexibility but would add lots of over head to the way the logic is handled in Kubelet.
Considering the consumers of this API having some consistency is best. This also has all the downside of other alternatives such as mis match between `ListContainerStats` and `ListPodSandboxStats`.
Considering the consumers (metric server and kubelet itself) of this API having some consistency is best. This also has all the downside of other alternatives such as mis match between `ListContainerStats` and `ListPodSandboxStats`.

See more information in the [ContainerStats](#containerstats-additions) and [Windows](#windows) sections below which gives details on proposal and Windows stats differences.

Expand Down Expand Up @@ -436,9 +438,13 @@ message PodSandboxStats {
LinuxPodSandboxStats linux = 2;
// Stats from windows.
WindowsPodSandboxStats windows = 3;
// Generic Stats that apply across OS
PodSandboxStats = 4;
}
// LinuxPodSandboxStats provides the resource usage statistics for a pod sandbox on linux.
// This field is here for backwards compatibility and future Linux only stats.
// PodSandboxStats is the preferred struct to be used.
message LinuxPodSandboxStats {
// CPU usage gathered for the pod sandbox.
CpuUsage cpu = 1;
Expand All @@ -453,40 +459,22 @@ message LinuxPodSandboxStats {
}
// WindowsPodSandboxStats provides the resource usage statistics for a pod sandbox on windows
// Not currently used
message WindowsPodSandboxStats {
}
// PodSandboxStats provides the resource usage statistics for a Pod Sandbox
message PodSandboxStats {
// CPU usage gathered for the pod sandbox.
CpuUsage cpu = 1;
// Memory usage gathered for the pod sandbox.
WindowsMemoryUsage memory = 2;
MemoryUsage memory = 2;
// Network usage gathered for the pod sandbox
NetworkUsage network = 3;
// Stats pertaining to processes in the pod sandbox.
ProcessUsage process = 4;
// Stats of containers in the measured pod sandbox.
repeated WindowsContainerStats containers = 5;
}
message WindowsMemoryUsage {
// Timestamp in nanoseconds at which the information were collected. Must be > 0.
int64 timestamp = 1;
// The amount of working set memory in bytes.
UInt64Value working_set_bytes = 2;
// Available memory for use. This is defined as the memory limit - workingSetBytes.
UInt64Value available_bytes = 3;
// Cumulative number of page faults.
UInt64Value page_faults = 4;
// could add more windows specific fields in future (PrivateWorkingSetBytes, CommitPeakBytes, etc)
}
message WindowsContainerStats {
// Information of the container.
ContainerAttributes attributes = 1;
// CPU usage gathered from the container.
CpuUsage cpu = 2;
// Memory usage gathered from the container.
WindowsMemoryUsage memory = 3;
// Usage of the writable layer.
FilesystemUsage writable_layer = 4;
repeated ContainerStats containers = 5;
}
// NetworkUsage contains data about network resources.
Expand Down Expand Up @@ -644,7 +632,7 @@ These are not used currently but are be very specific to each implementation and
| tx\_bytes | uint67 | BytesSent | | same values can be used? |
| tx\_errors | uint68 | | | should use DroppedPacketsOutgoing |

Based on the above settings, we should stay conservative and expose a small set of working overlapping stats and add as needed. This is what is proposed in the [changes cri](#cri-implementation)
Based on the above settings, we should stay conservative and expose the existing set of working overlapping stats and add platform specific fields as needed. This is what is proposed in the [changes cri](#cri-implementation)

### Test Plan

Expand Down

0 comments on commit 1fa4600

Please sign in to comment.