-
Notifications
You must be signed in to change notification settings - Fork 383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
STOR-2040: CLI command to display bound pvc filesystem usage percentage #1854
base: master
Are you sure you want to change the base?
Conversation
@gmeghnag: This pull request references STOR-2040 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: gmeghnag The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Thank you for spending time on this nice and useful feature. But that requires a KEP first to sync on the feature and implementation. |
Hey @ardaguclu, what is a KEP? And how to create one? Thanks :) |
I think, you'd write an enhancement proposal under https://github.com/openshift/enhancements/tree/master/enhancements/oc and we'll discuss the design and align on it. Once it merges, we can return to this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @gmeghnag :) This is nice work, I just have a few suggestions.
urlParams := url.Values{} | ||
urlParams.Set("query", query) | ||
uri.RawQuery = urlParams.Encode() | ||
|
||
persistentVolumeClaimsBytes, err := getWithBearer(ctx, getRoute, "openshift-monitoring", "prometheus-k8s", uri, bearerToken, insecureTLS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get this error when using system:admin KUBECONFIG, but not for the other oc adm top
subcommands:
$ ./oc adm top pvc
error: failed to get persistentvolumeclaims from Prometheus: no token is currently in use for this session
Is it possible to avoid this error by using the metrics API interface instead of constructing a raw request w/ bearer token?
oc/vendor/k8s.io/kubectl/pkg/cmd/top/top_pod.go
Lines 223 to 248 in a0501f4
func getMetricsFromMetricsAPI(metricsClient metricsclientset.Interface, namespace, resourceName string, allNamespaces bool, labelSelector labels.Selector, fieldSelector fields.Selector) (*metricsapi.PodMetricsList, error) { | |
var err error | |
ns := metav1.NamespaceAll | |
if !allNamespaces { | |
ns = namespace | |
} | |
versionedMetrics := &metricsv1beta1api.PodMetricsList{} | |
if resourceName != "" { | |
m, err := metricsClient.MetricsV1beta1().PodMetricses(ns).Get(context.TODO(), resourceName, metav1.GetOptions{}) | |
if err != nil { | |
return nil, err | |
} | |
versionedMetrics.Items = []metricsv1beta1api.PodMetrics{*m} | |
} else { | |
versionedMetrics, err = metricsClient.MetricsV1beta1().PodMetricses(ns).List(context.TODO(), metav1.ListOptions{LabelSelector: labelSelector.String(), FieldSelector: fieldSelector.String()}) | |
if err != nil { | |
return nil, err | |
} | |
} | |
metrics := &metricsapi.PodMetricsList{} | |
err = metricsv1beta1api.Convert_v1beta1_PodMetricsList_To_metrics_PodMetricsList(versionedMetrics, metrics, nil) | |
if err != nil { | |
return nil, err | |
} | |
return metrics, nil | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU the metricsclientset API allows consumers to access only resource metrics (CPU and memory) for pods and nodes.
So if we strictly want system:admin
users authenticated using the KUBECONFIG
file to use that command we should evaluate:
- getting a token from a different place, maybe a secret containing it?! (I will try to check if any exists)
- using a different workflow to obtain this information rather than Prometheus API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found out that the token from the secret localhost-recovery-client-token
in openshift-kube-apiserver
namespace is a valid one to use:
oc get secret -n openshift-kube-apiserver localhost-recovery-client-token -o json | jq '.data.token' -r | base64 -d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make this modification before next week so we can try to merge this PR, as @tsmetana agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
oc/pkg/cli/admin/top/persistentvolumeclaims.go
Lines 150 to 165 in a3cf814
o.BearerToken = o.RESTConfig.BearerToken | |
if len(o.RESTConfig.BearerToken) == 0 { | |
klog.V(4).Info("no token is currently in use for this session") | |
klog.V(5).Info("attempting to retrieve token from secret \"localhost-recovery-client-token\" in namespace \"openshift-kube-apiserver\"") | |
secret, err := o.ClientSet.CoreV1().Secrets("openshift-kube-apiserver").Get(context.TODO(), "localhost-recovery-client-token", metav1.GetOptions{}) | |
if err != nil { | |
klog.V(4).Info(fmt.Errorf("error retrieving secret: %s", err.Error())) | |
return fmt.Errorf("no token is currently in use for this session") | |
} | |
localhostRecoveryToken, exist := secret.Data["token"] | |
if !exist { | |
klog.V(4).Info(fmt.Errorf("\"token\" key not found in secret \"localhost-recovery-client-token\" in namespace \"openshift-kube-apiserver\"")) | |
return fmt.Errorf("no token is currently in use for this session") | |
} | |
o.BearerToken = string(localhostRecoveryToken) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
} | ||
|
||
} | ||
headers := []string{"NAMESPACE", "NAME", "USAGE(%)"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have some additional fields? My suggestion would be NAMESPACE, NAME, VOLUME, CAPACITY, USED, USAGE(%).
VOLUME: name of the PV attached to the PVC so the user can easily see which volume it corresponds to.
CAPACITY: total capacity of the volume in multiple of bytes (same as oc get pv
, for example 10Gi
or 2Ti
).
USED: total used space of the volume in multiple of bytes (same format as capacity bytes).
This would be useful to mention in the enhancement that @ardaguclu requested, in case anyone has other suggestions (or objections).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For VOLUME, CAPACITY, USED
it would require code modification using a different approach, maybe we can do that in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about this one. The default outuput format should not change once this gets GA'd. It would efficiently become an API (a script parsing the output should not get broken by the update), so adding new columns at least to the default output might not be as simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, so the option we have in the future to implement these other column fields without changing the default output is to use -o wide
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This plan sounds fine to me, thanks for considering it.
// if more pvc are requested as args but one of them does not not exist | ||
if len(args) != 0 && len(promOutput.Data.Result) != len(args) { | ||
resultingPvc := make(map[string]bool) | ||
for _, _promOutputDataResult := range promOutput.Data.Result { | ||
pvcName, _ := _promOutputDataResult.Metric["persistentvolumeclaim"] | ||
resultingPvc[pvcName] = true | ||
} | ||
for _, arg := range args { | ||
_, pvcPresentInResult := resultingPvc[arg] | ||
if !pvcPresentInResult { | ||
return fmt.Errorf("persistentvolumeclaim %q not found in %s namespace.", arg, o.Namespace) | ||
} | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will immediately return an error if one PVC is not found. Could it instead return the valid ones + errors for the ones that do not exist?
For example:
$ oc get pvc -n testns claim1 foo bar
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
claim1 Bound pvc-ecf81046-51c4-4fa2-ada3-7797faa670b4 1Gi RWO gp3-csi <unset> 5h14m
Error from server (NotFound): persistentvolumeclaims "foo" not found
Error from server (NotFound): persistentvolumeclaims "bar" not found
You could add resultingPvc[pvcName] = true
to the promOutput.Data.Result
loop below so you only have to loop through it once, then check for missing PVC's to print error messages at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I still think this would be a useful change to make, but I don't consider it a blocking issue for this PR.
/retest |
/test e2e-agnostic-ovn-cmd |
/retest |
DO NOT MERGE, I have to fix first https://issues.redhat.com/browse/OCPBUGS-44011 |
/retest |
persistentVolumeClaimsBytes, err := getWithBearer(ctx, getRoute, "openshift-monitoring", "prometheus-k8s", uri, bearerToken, insecureTLS) | ||
if err != nil { | ||
klog.V(4).Info(fmt.Errorf("failed to get persistentvolumeclaims from Prometheus: %w", err)) | ||
return persistentVolumeClaimsBytes, fmt.Errorf("metrics not available yet") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This came from the pre-merge testing by @ropatil010: The "metrics not available yet" might be quite misleading message, i.e. in the case the user lacks permissions to get to the metrics. Wouldn't it be easier to just return the the original error here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- "Wouldn't it be easier to just return the the original error here as well?"
Yes, I agree and it was like this before, then I changed the behaviour to accommodate the request made here.
My mistake for not insisting on keeping the old behaviour, let me know whether to proceed with the change @tsmetana .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still believe that this is a cool feature and thanks for spending time on this. I think it requires major code refactorings though.
/label tide/merge-method-squash
@@ -0,0 +1,382 @@ | |||
package top |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you'll be the maintainer of this command, there should be a separate OWNERS file like this https://github.com/openshift/oc/blob/master/pkg/cli/admin/inspectalerts/OWNERS
@ardaguclu Thank you a lot for the code review! Before spending time answering your questions and proceeding with the code modifications I would like to ask @tsmetana if doing so will make this PR merged, because, from our last chat I understood that you (with the storage leads) chose to go upstream first with this functionality, and doing so will probably need a different approach since Prometheus is not provided by default in vanilla Kubernetes. In short, I don't want to spend time modifying the code if this PR will not be merged anyway. Thanks |
@gmeghnag labelling the feature as experimental is exactly what would make it merge even if we want to implement this also in upstream Thanks for the awesome work! |
@ardaguclu @tsmetana I've updated the PR, please let me know if anything else is needed! (: |
/retest-required |
@gmeghnag: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Thank you. I quickly skimmed over the current code but we need to iterate a couple of times to be ready for merge. I think, it would be safer to defer this to 4.19 as unfortunately I don't have time for 4.18. |
}, | ||
} | ||
|
||
cmd.Flags().BoolVarP(&o.allNamespaces, "all-namespaces", "A", o.allNamespaces, "If present, list the pvc usage across all namespaces. Namespace in current context is ignored even if specified with --namespace") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't -A
(aka --all-namespaces
) coming as built-in just like --namespace
and we don't have to define in here additionally?.
type persistentVolumeClaimInfo struct { | ||
Namespace string | ||
Name string | ||
UsagePercentage string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason this field is exported, although persistentVolumeClaimInfo
is not exported?
for _, promOutputDataResult := range promOutput.Data.Result { | ||
namespaceName := promOutputDataResult.Metric["namespace"] | ||
pvcName := promOutputDataResult.Metric["persistentvolumeclaim"] | ||
usagePercentage := promOutputDataResult.Value[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we know promOutputDataResult.Value
's length is greater than 1 and we don't panic?
The code makes use of a Prometheus query to implement the
oc adm top persistentvolumeclaims
and show usage statistics for bound persistentvolumeclaims, as follows:It supports the following flags: