Handle env section in deployment yaml #34
Hey, this is obviously a big PR so it's taking me a little while to wrap my head around.
I've added a few comments and requests for changes, mainly in the testing suite so far.
My overriding thought when reading this is that the work done here on optional fields doesn't add any value and rather over-complicates the code. If a field is optional, we report an error when it doesn't exist, but do we actually care? If the field isn't there and we just set it to an empty string, the hash would change once the field exists again, which has the desired effect. Do you agree, or have I missed something?
The benefit I see to knowing if fields are optional is that we wouldn't throw an error if a configmap/secret doesn't exist but all the fields we want from within it are optional, which we are not currently handling (and I'm not suggesting we do within this PR; I'll create an issue to follow up on that).
I've had some thoughts on simplifying the logic if we agree we don't need to care about optional fields.
If we remove the requirement to know if fields are optional or not, this whole PR could be massively simplified. getChildNamesByType could be rewritten to return two map[string]map[string]struct{} where the first map is keyed on the name of the configmap/secret and the second holds the fields from within the configmap/secret that are being requested.
Then this set of fields can be passed into getObjectMap[WithKeys]? and the filtering of the fields could be done there instead, returning the object with a possible subset of the expected data fields (this also means we would only perform a Get on each object once).
Then no changes are required to the hashing algorithm, and I think the extra types you've added would no longer be needed either.
Let me know what you think.
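To make the filtering suggestion above a little more concrete, here is a minimal sketch of selecting a subset of fields from an object's data. It is not code from the PR: filterData, the sample data, and the choice to treat a nil key set as "keep everything" are all assumptions made for illustration.

```go
package main

import "fmt"

// filterData returns only the entries of data whose keys appear in wanted.
// A nil wanted set is treated here as "keep everything" (an assumption made
// for this sketch, to cover references to the whole object).
func filterData(data map[string]string, wanted map[string]struct{}) map[string]string {
	if wanted == nil {
		return data
	}
	out := make(map[string]string, len(wanted))
	for key := range wanted {
		// Keys that are requested but absent are simply skipped.
		if value, ok := data[key]; ok {
			out[key] = value
		}
	}
	return out
}

func main() {
	// Hypothetical ConfigMap data and the keys an env section asks for.
	data := map[string]string{"url": "http://example", "timeout": "30s", "unused": "x"}
	wanted := map[string]struct{}{"url": {}, "missing": {}}

	fmt.Println(filterData(data, wanted)) // only the "url" entry survives
}
```

With this shape, a change to "unused" would not alter what is handed to the hash, which is the behaviour the PR is after for single-field env references.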
I've made the specific code changes suggested and committed them. For the discussion on the overall architecture of the change, I've tried to respond with my thoughts to each point below. It's worth saying that for this PR, I tried to maintain the existing architecture that discovery and collection of ConfigMaps/Secrets happens at the point at which we scan the deployment (we "get" the whole ConfigMap/Secret and keep it in memory), and processing that collected data is done separately at a later stage (i.e. where the code calculates the hash value). This is why I record keys (and whether they're optional) in a new type, and then change the hashing code to handle the extra metadata. The fact that both
It's worth saying that I don't think the code reports an error when an optional field does not exist, in all cases. I think it only reports an error when an optional field does not exist because the entire ConfigMap/Secret does not exist.
I suspect that might be trivial to add; Just make
That is close to what I originally had before adding the handling of optional values (with the caveat that the hashing code was also changed to handle both whole objects and specified keys within objects).
If the hash code is going to maintain a simple interface and only accept an array of one type of thing, I think that hashing code has to change either way: the current code expects to find an I'm thinking that either the hashing code changes to handle key-specific entries, or the ConfigMaps/Secrets discovery code changes to create "spoof" Objects. I chose the first as it seemed more explicit on what it was doing. Coming back to this after a couple of days, I suspect/agree that you're right that the That then still gives us an array of The hash code would still need to change to accommodate specific keys. Does my proposed simplification make sense? I think it's the same as yours but with the hash code changing slightly (is it a key, get the key and stringify it, if not stringify the whole thing).
I've thought about this a bit more overnight and spotted a problem with the proposal to simplify, and to only record an optional field when it is present. Consider the following case:
When I call Given that I think that then goes back to your original idea that: if we record the data at the point of scanning, and for missing optional fields we set the data to an empty string then we can handle that case. But I still think that route involves changing the hash function, and either modifying or build-a-fake-ing an
I keep re-reading this and thinking about the best approach and I think I'm coming around to where we have got with this already, or at least some middle ground between our two initial suggestions. Do you think this would work:
When parsing the deployment,
In this idea, we solve the problem of optional config objects that don't exist; all config objects that exist and are referenced (optionally or not) will get owner refs and the owner ref code won't be changed.
I suggest we don't throw an error if required fields don't exist; in fact, my suggestion doesn't track which fields are optional at all. The reason is that whether a field is required or not does not affect how we calculate a hash: if the field becomes present at a later time then the hash will change, and Wave's responsibility is still fulfilled. It is not Wave's responsibility to track whether fields are optional or not; it is the deployment controller's. Equally, if a field is non-optional and doesn't exist, then the deployment will go into a bad state anyway, so whether Wave is adding a hash or not doesn't particularly matter. To summarise my point: required fields not being present won't prevent the hash changing when the configmaps or secrets change, so why does Wave care?
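As a rough illustration of the argument above, the sketch below hashes a fixed set of requested keys and treats an absent key as an empty string, so the hash moves as soon as the key appears. This is not Wave's hashing code; hashKeys and the sample data are hypothetical. One caveat of this simplification is that an absent key and a key set to the empty string hash identically.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashKeys hashes the requested keys of data in sorted order. A key that is
// absent is read as the empty string, so required and optional fields are
// treated identically.
func hashKeys(data map[string]string, keys map[string]struct{}) string {
	names := make([]string, 0, len(keys))
	for name := range keys {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for a stable hash

	h := sha256.New()
	for _, name := range names {
		// Indexing a missing key yields "", which is the behaviour we want here.
		fmt.Fprintf(h, "%q=%q;", name, data[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	keys := map[string]struct{}{"feature": {}}

	before := hashKeys(map[string]string{}, keys)                // key not there yet
	after := hashKeys(map[string]string{"feature": "on"}, keys) // key created later

	fmt.Println(before != after) // true
}
```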
Yes, this started to dawn on me halfway through reading point 3 :) From an initial read through, I think your plan above works. The one thing I'm not clear on is the function interfaces at all points in the chain, as there is a description above for the interface for I think that I still need the list returned by I was trying to think how to have both Would something like this be okay? In
// This is the main interface for what comes out of children.go,
// such that owner_references.go and hash.go act on a list of these
type configObject struct {
	k8sObject Object
	required  bool
	allKeys   bool
	keys      map[string]struct{}
}
In
// Interface for results from getConfigMap()/getSecret()
type getResult struct {
	err      error
	obj      Object
	required bool
	allKeys  bool
	keys     map[string]struct{}
}
// Interface for results from getChildNamesByType()
type configMetadata struct {
	required bool
	allKeys  bool
	keys     map[string]struct{}
}
func (h *Handler) getCurrentChildren(obj *appsv1.Deployment) ([]configObject, error) {
	// getChildNamesByType() to get two maps
	// loop through each map, calling getConfigMap()/getSecret() on a thread
	// Wait for each "get" call and add result to a list of configObject
	// return list of configObject
}
func getChildNamesByType(obj *appsv1.Deployment) (map[string]configMetadata, map[string]configMetadata) {
	// Walk through deployment creating two consolidated maps (keyed on ConfigMap/Secret name)
	// Each map lists the metadata for that ConfigMap/Secret in an instance of configMetadata.
}
// There is only one of these, so no "WithKeys" version any more.
func (h *Handler) getConfigMap(namespace, name string, required, allKeys bool, keys map[string]struct{}) getResult {
	...
}
// There is only one of these, so no "WithKeys" version any more.
func (h *Handler) getSecret(namespace, name string, required, allKeys bool, keys map[string]struct{}) getResult {
	...
}
I'd like to give this a day in the back of my brain, in case I spot anything else, but if not then are you okay if I start coding to the interfaces in this comment?
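For the "call each get on a thread, wait, and build a list" step described in the getCurrentChildren comments above, a minimal sketch might look like the following. It is an illustration only: getResult is reduced to a name and an error, and collect, errNotFound, and the sample names are hypothetical rather than taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errNotFound is a hypothetical sentinel standing in for a "not found" API error.
var errNotFound = errors.New("not found")

// getResult is a reduced version of the struct proposed above: the object is
// represented by its name only.
type getResult struct {
	err  error
	name string
}

// collect runs one get per name concurrently, waits for every result, and
// returns the names that were found plus a combined error for any failures.
func collect(names []string, get func(name string) getResult) ([]string, error) {
	results := make(chan getResult, len(names)) // buffered so no goroutine blocks
	for _, name := range names {
		go func(n string) { results <- get(n) }(name)
	}

	var found []string
	var errs []error
	for range names { // exactly one result arrives per name
		r := <-results
		if r.err != nil {
			errs = append(errs, r.err)
			continue
		}
		found = append(found, r.name)
	}
	sort.Strings(found) // results arrive in arbitrary order

	if len(errs) > 0 {
		return found, fmt.Errorf("%d error(s) encountered when getting children: %v", len(errs), errs)
	}
	return found, nil
}

func main() {
	get := func(name string) getResult {
		if name == "config-optional" {
			return getResult{err: fmt.Errorf("ConfigMap %q: %w", name, errNotFound)}
		}
		return getResult{name: name}
	}
	found, err := collect([]string{"app-config", "config-optional"}, get)
	fmt.Println(found, err)
}
```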
Yep, the above looks pretty good to me, just a couple of comments on it:
Yep agreed, the passing around will be similar to how it is currently implemented. Any reason not to pass the
I really dislike the
Thanks for working with me on this design btw, I appreciate your efforts
No worries; if the result is better code then I'm all for it! Rolling in your latest comment, I'll go with this: In
type configObject struct {
	object   Object
	required bool
	allKeys  bool
	keys     map[string]struct{}
}
In
type getResult struct {
	err      error
	obj      Object
	required bool
	allKeys  bool
	keys     map[string]struct{}
}
// Interface for results from getChildNamesByType()
type configMetadata struct {
	required bool
	allKeys  bool
	keys     map[string]struct{}
}
func (h *Handler) getCurrentChildren(obj *appsv1.Deployment) ([]configObject, error) {
	// getChildNamesByType() to get two maps
	// loop through each map, calling getConfigMap()/getSecret() on a thread
	// Wait for each "get" call and add result to a list of configObject
	// return list of configObject
}
func getChildNamesByType(obj *appsv1.Deployment) (map[string]configMetadata, map[string]configMetadata) {
	// Walk through deployment creating two consolidated maps (keyed on ConfigMap/Secret name)
	// Each map lists the metadata for that ConfigMap/Secret in an instance of configMetadata.
}
func (h *Handler) getConfigMap(namespace, name string, metadata configMetadata) getResult {
	...
}
func (h *Handler) getSecret(namespace, name string, metadata configMetadata) getResult {
	...
}
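As a sketch of the "consolidated map keyed on ConfigMap/Secret name" idea in getChildNamesByType above, the following merges several references to the same object into one configMetadata value. The reference type, the consolidate function, and the rule that one non-optional reference makes the object required are assumptions for illustration, not the PR's implementation.

```go
package main

import "fmt"

// configMetadata follows the struct agreed in the comment above.
type configMetadata struct {
	required bool
	allKeys  bool
	keys     map[string]struct{}
}

// reference is a hypothetical, flattened view of one reference found while
// walking a deployment. An empty key means the whole object is referenced.
type reference struct {
	name     string
	key      string
	optional bool
}

// consolidate merges every reference to the same object into a single entry.
func consolidate(refs []reference) map[string]configMetadata {
	out := make(map[string]configMetadata)
	for _, ref := range refs {
		meta, seen := out[ref.name]
		if !seen {
			meta.keys = make(map[string]struct{})
		}
		// Assumption: one non-optional reference makes the object required.
		meta.required = meta.required || !ref.optional
		if ref.key == "" {
			meta.allKeys = true
		} else {
			meta.keys[ref.key] = struct{}{}
		}
		out[ref.name] = meta // map values are copies, so write the update back
	}
	return out
}

func main() {
	refs := []reference{
		{name: "app-config", key: "url", optional: true},
		{name: "app-config", key: "port"},
		{name: "app-secret"}, // whole object
	}
	for name, meta := range consolidate(refs) {
		fmt.Println(name, meta.required, meta.allKeys, len(meta.keys))
	}
}
```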
Sounds great, thanks 👍
All done, I think. I had to use pointers to I've built the image locally and run it through a few scenarios as well, testing in minikube, and it seems to work. I've not done any load testing though.
This is looking a lot better and in general I'm pretty happy with this now.
I did however have an idea to remove the pointers to configMetadata that might make things a bit simpler. I've also requested that you break things into different methods in a couple of places to reduce nesting and make the parent methods easier to read. Hope that's ok; let me know if you have any questions.
Code updated in line with review. I've done my usual manual testing in minikube and the one case that doesn't work is where a field in a ConfigMap is optional, and the whole map does not exist (so we cannot attach an owner). If I then create that ConfigMap, Wave does not trigger calculation of the hash, since the new ConfigMap has no owner. This feels like an edge case, and I'm not sure how to "fix" it without a complete rearchitecture of how Wave works...
Just a couple of superfluous casts to deal with and then I'm happy with this
Looking good! Thanks for all of your work on this @andyedwardsdfdl
Changes made:
Object, a flag for whether this entry is only "single-field" env entries, and a list of referenced keys (map of key-name to a ConfigField)
env section, returning two new maps (one for ConfigMaps and one for Secrets), of maps (the second key being the key-name in the ConfigMap/Secret being accessed). This code also handles the optional field (see below for more fun on optional).
handler.go as the general flow hasn't changed, but I added some extra tests. Note while adding tests, I think I found a bug. There are several tests that check that the hash has changed, but the test code does this by checking that it does not find the original hash in an annotation, and I think it currently looks in the wrong annotation (and as such, the test would never fail). For example, https://github.com/pusher/wave/blob/master/pkg/core/handler_test.go#L229 calls m.Eventually(deployment, timeout).ShouldNot(utils.WithAnnotations(HaveKeyWithValue(ConfigHashAnnotation, originalHash))), but I think it should call m.Eventually(deployment, timeout).ShouldNot(utils.WithPodTemplateAnnotations(HaveKeyWithValue(ConfigHashAnnotation, originalHash))). Please correct me if I've missed something!
Testing done:
New unit tests added and run
Docker image built and tested in a minikube environment.
env section, only that pod is redeployed when only that value changes.
env section, only that pod is redeployed when only that value changes.
env section, only that pod is redeployed when that value is first added or when it changes.
env section, only that pod is redeployed when that value is first added or when it changes.
volume and envFrom is maintained: any change to referenced ConfigMaps/Secrets triggers a redeploy of only those pods that reference them.
env section that contains two references to the same field in a ConfigMap/Secret but with different names successfully redeploys when that value changes (yep, we've got this in our system - two different squads with different names for the same configuration parameter!)
I've not done any soak-testing or any heavy load testing so I hopefully haven't introduced any memory leaks.
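On the suspected test bug mentioned under "Changes made", the sketch below shows why a "should not have the original hash" assertion can never fail if it looks at the wrong annotation map. It assumes, as the description suggests, that the hash is written to the pod template annotations; the struct types, lacksValue, and the annotation key are simplified placeholders, not the real Kubernetes or Wave types.

```go
package main

import "fmt"

// Hypothetical, stripped-down stand-ins for the Deployment fields involved.
type podTemplate struct {
	Annotations map[string]string // spec.template.metadata.annotations
}

type deployment struct {
	Annotations map[string]string // metadata.annotations
	Template    podTemplate
}

// configHashAnnotation is a placeholder key for this sketch.
const configHashAnnotation = "example.com/config-hash"

// lacksValue reports whether annotations does not hold key=value, which is
// what a ShouldNot(HaveKeyWithValue(...)) style assertion checks.
func lacksValue(annotations map[string]string, key, value string) bool {
	got, ok := annotations[key]
	return !ok || got != value
}

func main() {
	originalHash := "abc123"
	// Simulate a regression: the hash was NOT updated, so the pod template
	// still carries the original value.
	d := deployment{
		Template: podTemplate{Annotations: map[string]string{configHashAnnotation: originalHash}},
	}

	// Checking the deployment's own annotations passes regardless.
	fmt.Println(lacksValue(d.Annotations, configHashAnnotation, originalHash)) // true
	// Checking the pod template annotations catches the stale hash.
	fmt.Println(lacksValue(d.Template.Annotations, configHashAnnotation, originalHash)) // false
}
```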
What is not in this PR:
I have not added anything that handles the optional flag on envFrom. In that case, you can say whether the entire ConfigMap is allowed to be missing or not, but technically I was only working on the env section :)
There is an edge case that I have not solved that relates to the optional flag on entries in the env section, partly because I'm not sure how. Imagine that the following are all true:
env section
In this case, when I deploy, we attempt to look up the field and get an error (e.g. E0114 17:54:50.356820 1 :0] kubebuilder/controller "msg"="Reconciler error" "error"="error fetching current children: error(s) encountered when geting children: ConfigMap \"config-optional\" not found" "Controller"="deployment-controller" "Request"={"Namespace":"default","Name":"charlie"}). But k8s thinks this is a valid deployment file, so honours the deployment and the pod starts up anyway, just without the missing config being defined.
If I then create that ConfigMap, Wave takes some time to notice. The problem is that it is relying on k8s backoff to "re-kick" the Wave handler after some timeout, at which point it successfully discovers the ConfigMap/Secret, and generates the hash. Ideally, it would notice instantly, as with any other config change, but since the ConfigMap/Secret did not exist we never got to register "this" deployment as an owner. Overall, it seems to work and "sort itself out in the end", but I'm just nervous about seeing errors in the log alongside an apparently working system.