Update prometheus-federator to use dependencies from k8s 1.32.X+ #141
base: main
Conversation
Force-pushed from 3e56f44 to dc30f01.
cmd/prometheus-federator/main.go (outdated)
if os.Getenv("MANAGE_CRD_UPDATES") == "true" {
	updateCRDs = true
}
managedCRDs := common.ManagedCRDsFromRuntime(f.RuntimeOptions)
I simplified the management of CRDs here by reading from the runtime configuration file.
- We manage (i.e. update) helm-locker CRDs if and only if we are running an embedded helm-locker.
- We manage (i.e. update) helm-controller CRDs if and only if we are running an embedded helm-controller.
IMO this was an oversight by the original maintainers of the operator: all CRDs were blindly redeployed even when we weren't running the matching embedded controller.
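To make that rule concrete, here is a minimal sketch of what deriving the managed CRD set from the runtime options could look like. The RuntimeOptions field names and the CRD names are assumptions for illustration; the actual common.ManagedCRDsFromRuntime in this PR may differ.

```go
// Minimal sketch of the "only manage CRDs for controllers we embed" rule;
// field names and CRD names below are assumed, not taken from this PR.
package common

// RuntimeOptions is assumed to record which embedded controllers are enabled.
type RuntimeOptions struct {
	DisableEmbeddedHelmLocker     bool
	DisableEmbeddedHelmController bool
}

// ManagedCRDsFromRuntime returns only the CRDs whose embedded controller we
// actually run, so we never redeploy CRDs owned by an external controller.
func ManagedCRDsFromRuntime(opts RuntimeOptions) []string {
	var crds []string
	if !opts.DisableEmbeddedHelmLocker {
		// embedded helm-locker is running, so its CRDs are ours to update
		crds = append(crds, "helmreleases.helm.cattle.io")
	}
	if !opts.DisableEmbeddedHelmController {
		// embedded helm-controller is running, so its CRDs are ours to update
		crds = append(crds, "helmcharts.helm.cattle.io", "helmchartconfigs.helm.cattle.io")
	}
	return crds
}
```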
We can extend the tracker abstraction to customize the CRD objects before they are created, but I didn't see a need for that yet. It gets messy if there ever need to be breaking changes to which fields indicate whether a CRD is managed by us or by someone else.
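For illustration only, a pre-create hook on such a tracker might look roughly like the sketch below; these names are hypothetical and not the tracker API in this repository.

```go
// Illustrative sketch: a pre-create customization hook on a CRD tracker.
package crdtracker

import (
	apiextv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
)

// CustomizeFunc mutates a CRD before the tracker creates or updates it,
// e.g. to stamp a label marking it as managed by this operator.
type CustomizeFunc func(crd *apiextv1.CustomResourceDefinition)

// Tracker holds the hooks applied before CRDs are written to the cluster.
type Tracker struct {
	customizers []CustomizeFunc
}

// WithCustomizer registers a hook that runs against every CRD before apply.
func (t *Tracker) WithCustomizer(f CustomizeFunc) *Tracker {
	t.customizers = append(t.customizers, f)
	return t
}

// prepare returns a deep copy of the CRD with all customizations applied.
func (t *Tracker) prepare(crd *apiextv1.CustomResourceDefinition) *apiextv1.CustomResourceDefinition {
	out := crd.DeepCopy()
	for _, f := range t.customizers {
		f(out)
	}
	return out
}
```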
My only major concern with the summary provided is: if I'm a user who
- wants to enable PromFed's helm-controller, but
- does NOT want PromFed to manage the CRDs,
is that still possible? Or are we drawing a line in the sand with this PR, so that if you use PromFed's helm-controller, PromFed will always manage the CRDs? As a result, k3s/rke2 users should always disable the embedded helm-controller?
I understand that doing this makes our compatibility story much simpler. So that's fine if that's the case; we just need to make sure that's clear in the release notes (and probably in the values.yaml descriptions).
I suspect it's probably rare that customers run multiple helm-controllers out of genuine performance need. So maybe it's not a big deal to eliminate the option of using the PromFed helm-controller in an environment where another one exists?
And even then, maybe the correct solution for customers who really need two helm-controllers external to PromFed is to just run two external ones of the same version. That's much simpler than having to consider a huge compatibility matrix of which versions of PromFed work with which versions of helm-controller (and thereby which versions of k3s/rke2).
I added back the CRD flags you introduced in the charts, and they work the same as they did before, for compatibility reasons.
The main difference is that we now discourage the use of those flags, since helm-controller should be disabled in rke2/k3s and we don't manage CRDs for things we don't run.
The managed-by annotation doesn't work the way you'd think it does: whichever helm controller picks up the object first controls it, and the mechanism was never fleshed out beyond hardening the resources it has already picked up.
I believe that annotation was a stepping stone towards allowing concurrent helm-controllers / helm-lockers / helm-project-operators, but you can set up some tests to see that it won't behave consistently.
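For readers following along, here is a rough sketch of the "first controller to pick it up wins" behaviour being described. The annotation key and the claiming logic are made up for illustration; they are not helm-controller's actual code.

```go
// Sketch of first-writer-wins ownership via a managed-by style annotation.
package reconcile

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

const managedByAnnotation = "example.cattle.io/managed-by" // hypothetical key

// shouldProcess claims an unowned object for this controller, or skips it if
// another controller already claimed it. Because ownership is never
// re-arbitrated afterwards, concurrent controllers won't behave consistently.
func shouldProcess(obj metav1.Object, controllerName string) bool {
	ann := obj.GetAnnotations()
	if ann == nil {
		ann = map[string]string{}
	}
	owner, claimed := ann[managedByAnnotation]
	if !claimed {
		ann[managedByAnnotation] = controllerName // first writer wins
		obj.SetAnnotations(ann)
		return true
	}
	return owner == controllerName
}
```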
🤔 Yeah, I have studied the code and tested it many times, so I'm confused about the disconnect. Maybe it would help if we pair on this tomorrow and I can show how it works in my lab. As for proving this via testing, I didn't think that was in scope for this discussion. I'm open to us trying to cover it in e2e testing... but it also feels tangential to my current point.
Ohh - I may see where I'm confused now. The groundwork is there, but the locks may be blocking that functionality. I'll sync up with @jbiers on this to see what we can sort out.
We really don't want to make it a non-singleton, since rancher-monitoring is a singleton and its companion app prometheus-federator is also a singleton.
Users have been using it in this manner for a while, and the edge cases where we would want a non-singleton are only clear to me if we use the underlying helm-project-operator to support multiple charts. IMO that is not likely to happen at all, since it isn't flexible and the tenancy around projects (and specifically the helm-project-operator) is hacky at best.
Force-pushed from 0c2cdc4 to 2babdd2.
slightly rework crd management
Signed-off-by: Alexandre Lamarre <[email protected]>
run go mod tidy
Signed-off-by: Alexandre Lamarre <[email protected]>
Force-pushed from ccd3fcd to 50a263a.
func Data(filename string) []byte {
	data, err := fs.ReadFile(dataFs, path.Join("build", filename))
	if err != nil {
		panic(err)
	}
	return data
}
Intentionally making this public in a public package so we can import a specific version of prometheus-federator and see the charts vendored within it.
It doesn't impact the immutability of the charts built into the binary, but it allows alternate visibility into which chart is built into which commit of prometheus-federator without having to deploy an image.
I'm open to reverting this improvement since it isn't strictly necessary, but I thought it might be a useful change long term.
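As a sketch of the visibility this enables: a small program that imports a pinned version of prometheus-federator and dumps a vendored chart without deploying an image. The import path and chart filename below are assumptions for illustration.

```go
// Hypothetical usage of the now-public chart.Data.
package main

import (
	"os"

	"github.com/rancher/prometheus-federator/pkg/chart" // assumed import path
)

func main() {
	// Write the embedded chart bytes to stdout so they can be inspected or
	// diffed against a published release; the filename here is a guess.
	data := chart.Data("project-monitoring.tgz.base64")
	os.Stdout.Write(data)
}
```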
See pkg/chart/data.go, but I think the overhead of ~12KB files, with hopefully small diffs aside from major version bumps of project-monitoring, should not impact the usual git operations too much, based on my experience...
Especially since we plan on moving chart hosting out to a separate repository, I think the net impact on the size/number of objects managed by git will still be a reduction.
Ehh. I'll have to digest this for a bit. My instinct is that I'll probably revert this with the chart-split PR anyway. But maybe I can find a mental model where this makes sense.
Big picture, the expectation we should have with rancher-project-monitoring is that any time rancher-monitoring changes, rancher-project-monitoring will automatically be changing too. And if we ship rancher-monitoring, we should also be shipping PromFed+RPM in the same cycle. This way we directly eliminate the lag between the two, which causes us to use a mixture of images and generally skews CVE numbers.
When shipping an image or application logic, it is more useful to see and stage the diffs and manually update them, or open an automation PR with the diffs visible.
Having the go:embed files git-ignored is bad practice IMO.
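For context on what "the go:embed files" refers to, here is a sketch of the kind of declaration assumed to back dataFs in pkg/chart/data.go; the exact embed pattern is an assumption, not copied from this PR.

```go
// Sketch of an embed declaration for the built chart archives.
package chart

import "embed"

// Because these files are compiled into the binary at build time, they must
// be committed to git rather than gitignored; otherwise the diffs describing
// what actually ships are invisible in review.
//
//go:embed build
var dataFs embed.FS
```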
Also, I don't think automation will ever get to the point where rebasing monitoring will rebase project-monitoring out of the box, since the resources tend to change a lot between 5-10 major versions.
(The point of this is not to get in the way of any chart build or automation improvements but to have clear tracking of what gets built into the applications)
# FIXME: causes a git diff because we add git tagged versions to the helm-chart metadata. I think we can omit this
# from builds since the helm-project-operator is a sub-chart and end-users shouldn't look to
# this metadata for clarifying information since the version of the prometheus-federator version chart is the one
# indicating changes from now on
# edit-charts "./build/charts/${BUILD_TARGET}/Chart.yaml" "${HELM_CHART_VERSION}" "${HELM_IMAGE_TAG}"
Not 100% sure about this, but I believe it is better to edit the version values of the embedded helm charts before building the images (in a separate commit/PR) for clarity.
The prometheus-federator chart itself can have sed/perl auto-inject versions at build time, though; that I don't mind.
# FIXME: causes a git diff because we add git tagged versions to the helm-chart metadata. I think we can omit this
# from builds since the helm-project-operator is a sub-chart and end-users shouldn't look to
# this metadata for clarifying information since the version of the prometheus-federator version chart is the one
# indicating changes from now on
# edit-charts "build/charts/${CHART}/Chart.yaml" "${HELM_CHART_VERSION}" "${HELM_IMAGE_TAG}"
Not 100% sure about this, but I believe it is better to edit the version values of the embedded helm charts before building the images (in a separate commit/PR) for clarity.
The prometheus-federator chart itself can have sed/perl auto-inject versions at build time, though; that I don't mind.
Force-pushed from 95f774f to ec2aadc.
github.com/onsi/ginkgo/v2 v2.22.2
github.com/onsi/gomega v1.36.2
github.com/rancher/lasso v0.0.0-20241202185148-04649f379358
github.com/rancher/wrangler/v3 v3.2.0-rc.1
Oh right, this needs to use rc.2, and the go version thing will go back to .0.
@@ -190,6 +189,9 @@ func Register(ctx context.Context, systemNamespace string, cfg clientcmd.ClientC
chart.Register(ctx,
	systemNamespace,
	opts.ControllerName,
	// this has to be cluster-admin for k3s reasons
	"cluster-admin",
	"6443",
Also, bumping the embedded helm-controller requires a reference to the API server port. Since the embedded helm-controller normally runs within k3s, it has access to the apiserver port on localhost. I doubt it works as intended when it's run embedded here, and I didn't have the bandwidth to test that.
github.com/caarlos0/env/v11 v11.1.0
github.com/google/uuid v1.6.0
github.com/hashicorp/go-multierror v1.1.1
github.com/k3s-io/helm-controller v0.16.6-0.20241210112214-b40937ee695b
This should probably be v0.16.5, as that matches the version used by the stable release of k8s 1.32 for k3s/rke2.
BTW - I just realized that since this is targeting 1.32, we should ideally wait to ship this along with Rancher 2.11, which officially supports it. And we're using RC libraries intended for Rancher 2.11, which feels wrong to include this soon. Because we don't do branching right now, we can't merge this into main and expect it not to affect the February release cycle. So what if you changed this to target 1.31 and Rancher 2.10 based libraries? That way it doesn't have to be something we only do in 2.11. Otherwise we have to somehow implement a branch strategy and do partial reverts before the February release cycle.
@mallardduck The intention was always to bump dependencies as aggressively as possible and branch off the main branch, downgrading dependencies selectively.
@alexandreLamarre - I guess that wasn't super clear. Either way works in the end, I guess. However, after merging this we need to be careful about what tag is used to test main (i.e. what the tag for main becomes after merging this). I think my misunderstanding there may also mean my opinion on change ordering will be different here. Specifically, I think we do need to sort out my effort to split the charts and image code first, since having that complete will make us more prepared for the branching once it's implemented here.
No description provided.