Add health and availableNodes in the OpenSearchCluster status #655

Merged


Contributor

saketmht commented Oct 31, 2023

Feat #553

Changes -

  1. Add Health and AvailableNodes fields to the OpenSearchCluster status to indicate cluster health and the number of available nodes, respectively. They appear as the HEALTH and NODES columns in the output below.

Example Output

sakmahto@kind-managed1 ~ $ k get os
NAME         HEALTH   NODES   VERSION   PHASE     AGE
opensearch   green    8       2.3.0     RUNNING   14m
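
For readers unfamiliar with how such columns end up in `kubectl get` output: with kubebuilder, extra printer columns are usually declared as `+kubebuilder:printcolumn` markers on the CRD type. The following is a minimal sketch, not the code from this PR. The JSON tags, the JSONPath values, the `int32` type and the `statusJSON` helper are assumptions made for illustration.

```go
package main

import "encoding/json"

// OpenSearchClusterStatus is a trimmed-down, hypothetical version of the
// status struct. Only the two fields discussed in this PR are shown, and
// the JSON tags and the int32 type are assumptions.
type OpenSearchClusterStatus struct {
	Health         string `json:"health,omitempty"`
	AvailableNodes int32  `json:"availableNodes,omitempty"`
}

// The markers below are how kubebuilder generates additionalPrinterColumns.
// The JSONPath values are assumed to match the JSON tags above.

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="HEALTH",type="string",JSONPath=".status.health"
// +kubebuilder:printcolumn:name="NODES",type="integer",JSONPath=".status.availableNodes"
type OpenSearchCluster struct {
	Status OpenSearchClusterStatus `json:"status,omitempty"`
}

// statusJSON is a hypothetical helper that serializes the status, which
// makes it easy to check the paths the printer columns would read.
func statusJSON(s OpenSearchClusterStatus) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Serializing a status with `statusJSON` shows the keys that the assumed `.status.health` and `.status.availableNodes` paths would resolve against.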

Signed-off-by: saketmht <[email protected]>
Comment on lines 557 to 583
func (r *ClusterReconciler) UpdateClusterHealth() error {
	health := util.GetClusterHealth(r.ctx, r.Client, r.instance)

	err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
		if err := r.Get(r.ctx, client.ObjectKeyFromObject(r.instance), r.instance); err != nil {
			return err
		}
		r.instance.Status.Health = health
		return r.Status().Update(r.ctx, r.instance)
	})

	return err
}

func (r *ClusterReconciler) UpdateAvailableNodes() error {
	availableNodes := util.GetAvailableOpenSearchNodes(r.ctx, r.Client, r.instance)

	err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
		if err := r.Get(r.ctx, client.ObjectKeyFromObject(r.instance), r.instance); err != nil {
			return err
		}
		r.instance.Status.AvailableNodes = availableNodes
		return r.Status().Update(r.ctx, r.instance)
	})

	return err
}
Collaborator

I'd prefer to have both status fields updated in one go, to reduce the number of API interactions.
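
To illustrate the suggestion, here is a rough sketch of a single combined update: both fields are set inside one fetch-modify-write cycle, so each attempt costs one Get and one Update instead of two of each. It uses only the standard library so it can run on its own. `StatusStore`, `ErrConflict`, `retryOnConflict`, `fakeStore` and the `int32` type are hypothetical stand-ins for the controller-runtime client, the Kubernetes conflict error and `retry.RetryOnConflict`, not the operator's actual code.

```go
package main

import "errors"

// ErrConflict is a hypothetical stand-in for a Kubernetes conflict error.
var ErrConflict = errors.New("conflict")

// ClusterStatus holds the two fields from this PR (int32 is an assumption).
type ClusterStatus struct {
	Health         string
	AvailableNodes int32
}

// StatusStore is a hypothetical stand-in for the API client.
type StatusStore interface {
	Get() (ClusterStatus, error)
	Update(ClusterStatus) error
}

// retryOnConflict is a simplified stand-in for retry.RetryOnConflict:
// it reruns fn only while fn keeps failing with ErrConflict.
func retryOnConflict(attempts int, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = fn()
		if !errors.Is(err, ErrConflict) {
			return err
		}
	}
	return err
}

// UpdateHealthAndNodes writes both status fields in one update call.
func UpdateHealthAndNodes(store StatusStore, health string, nodes int32) error {
	return retryOnConflict(5, func() error {
		status, err := store.Get() // refetch so a retry sees the latest version
		if err != nil {
			return err
		}
		status.Health = health
		status.AvailableNodes = nodes
		return store.Update(status)
	})
}

// fakeStore is an in-memory StatusStore used only to exercise the sketch.
// It fails the first conflictsLeft updates and counts the calls it receives.
type fakeStore struct {
	status        ClusterStatus
	conflictsLeft int
	gets, updates int
}

func (f *fakeStore) Get() (ClusterStatus, error) {
	f.gets++
	return f.status, nil
}

func (f *fakeStore) Update(s ClusterStatus) error {
	f.updates++
	if f.conflictsLeft > 0 {
		f.conflictsLeft--
		return ErrConflict
	}
	f.status = s
	return nil
}
```

In the real reconciler the same shape would presumably use `r.Get` and `r.Status().Update` inside `retry.RetryOnConflict`, as in the two functions under review.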

osClient, err := CreateClientForCluster(ctx, k8sClient, cluster, nil)

if err != nil {
	return opsterv1.OpenSearchUnknownHealth
Collaborator

The error should at least be logged, for debugging purposes. The same goes for the other places.

Contributor Author

I didn't want to log it because, when the cluster is first coming up, creating the client always fails and the logs fill up with error messages. The same applies in other places: while the StatefulSet (sts) is still being created, the lookup returns a few not-found errors.

Collaborator

Maybe log it at debug level; then it won't normally appear, since the default log level is info. And if someone has a problem, we can ask for debug logs.

Comment on lines 137 to 139
// Update the cluster health and available nodes in the status
result.CombineErr(r.UpdateClusterHealth())
result.CombineErr(r.UpdateAvailableNodes())
Collaborator

I think this should happen as the last step in the reconcile function, after everything else.

Signed-off-by: saketmht <[email protected]>
Contributor Author

saketmht commented Nov 2, 2023

@swoehrl-mw Updated based on comments

swoehrl-mw merged commit f4fdf27 into opensearch-project:main on Nov 2, 2023
6 checks passed
Successfully merging this pull request may close these issues.

Add additionalPrinterColumns to CRD to get cluster health & availablenodes