Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Late initialization overrides desired value when it is zero-value #105

Closed
muvaf opened this issue Jan 6, 2020 · 13 comments
Closed

Late initialization overrides desired value when it is zero-value #105

muvaf opened this issue Jan 6, 2020 · 13 comments
Assignees
Labels
bug Something isn't working stale wontfix This will not be worked on

Comments

@muvaf
Copy link
Member

muvaf commented Jan 6, 2020

What happened?

We have implemented late-initialization pattern across the resources. However, it relies upon the field's value being nil or zero-value. Example late-init functions from stack-gcp:

// LateInitializeInt64 implements late initialization for int64 type.
func LateInitializeInt64(i *int64, from int64) *int64 {
	if i != nil || from == 0 {
		return i
	}
	return &from
}

// LateInitializeBool implements late initialization for bool type.
func LateInitializeBool(b *bool, from bool) *bool {
	if b != nil || !from {
		return b
	}
	return &from
}

This implementation is buggy because it overrides when the user actually wants the zero-value for that field. For example, the value of field foo *bool is true in both spec and cloud provider and the user wants to make it false. In that case LateInitializeBool will override that value with true because it will assume that zero-value is a case where user doesn't have an opinion about that field and the value from cloud provider will be used. This seems easy to fix for pointer types by converting the function into:

// LateInitializeBool implements late initialization for bool type.
func LateInitializeBool(b *bool, from bool) *bool {
	if b != nil {
		return b
	}
	return &from
}

But there is a catch. We follow Kubernetes pattern for optional fields in our API design, which means that optional fields have the following jsontag:

// +optional
foo *string `json:"foo,omitempty"`

The tag omitempty actually causes the value to be nil in the Go code if assigned value is zero-value, in this case false. So, it comes as nil even if user wanted it to be false. This works for Kubernetes late-inited fields like pod.spec.nodeName because those fields will be eventually filled but that's not the case for us. The Go zero-value for a field could actually be the final value that the user wants for that field.

I'd like us to discuss not using omitempty tag at all, so that zero-values come to the controller as is instead of being converted to nil. In terms of api-server validation, nothing changes as // +optional tag is used to mark the fields as required or not in the CRD. Though not using omitempty is divergence from the optional/required design pattern that Kubernetes suggests.

@muvaf muvaf added the bug Something isn't working label Jan 6, 2020
@muvaf
Copy link
Member Author

muvaf commented Jan 6, 2020

/cc @negz @hasheddan

@hasheddan
Copy link
Member

This is tough because I agree that the solution here is just not using omitempty at all, but that would be a pretty big divergence. However, I feel that the way it is used even in Kubernetes native types is a bit of an anti-pattern. If I understand correctly, our code would not have to change except for literally removing the omitempty tags because we still want to handle nil optional values in the same manner, but right now we are being given nil values in some cases where we do not intend them (i.e. providing a zero-values explicitly to a pointer type). Is this your understanding as well?

@negz
Copy link
Member

negz commented Jan 25, 2020

It doesn't sound like omitting omitempty goes against conventions:

In most cases, optional fields should also have the omitempty struct tag (the omitempty option specifies that the field should be omitted from the json encoding if the field has an empty value). However, If you want to have different logic for an optional field which is not provided vs. provided with empty values, do not use omitempty (e.g. kubernetes/kubernetes#34641).

@hasheddan
Copy link
Member

hasheddan commented Jan 28, 2020

@muvaf @negz do we feel good about removing omitempty across the board? If so, I would like to go ahead and move on this while implementing more v1beta1 resources.

@negz
Copy link
Member

negz commented Jan 29, 2020

@hasheddan Are you confident it won't have any negative side-effects/behaviour changes? If so I'm okay with it.

@muvaf
Copy link
Member Author

muvaf commented Jan 29, 2020

I think I'll need to do some experiments to make sure we're able to capture all three: nil value, zero value, actual value. Or @hasheddan could also play with it and report here.

@hasheddan
Copy link
Member

@muvaf yep I'll pull some of our real-world examples and show how the change would impact 👍

@hasheddan hasheddan self-assigned this Jan 29, 2020
@negz
Copy link
Member

negz commented Feb 7, 2020

kubernetes-sigs/cluster-api#707

Some more prior art here. Having swapped back in the context of this issue I think we're on the right path. I'm not sure if we want to completely remove omitempty everywhere, but we should remove it for fields for which we want to distinguish between:

  1. Spec author has no opinion, propagate the default value back from the cloud.
  2. Spec author wants the field to be the zero value.
  3. Spec author wants the field to be a non-zero value.

So we'd represent these cases (using an *int field as example) as:

  1. Spec author omits the field, which gets serialized as a nil pointer.
  2. Spec author writes 0, which gets serialised as a pointer to int 0.
  3. Spec author writes 8, which gets serialised as a pointer to int 8.

@negz
Copy link
Member

negz commented Feb 7, 2020

https://crossplane.slack.com/archives/CKXQHM7U3/p1581113468101800
https://play.golang.org/p/NnWsj9BaTeJ
https://play.golang.org/p/oxQgQxtWK62

Cross-linking some discussion from Slack. I actually now think our late init code is working as expected a little broken when it comes to late-initializing an unspecified spec field from a cloud provider's zero value. 🤔 I believe:

func LateInitializeBool(b *bool, from bool) *bool {
	if b != nil || !from {
		return b
	}
	return &from
}

In the above case b is the user provided value and from is the value we get back from the cloud provider. If we walk through some cases, assuming the following struct:

type Example struct {
    Field *bool `json:"field,omitempty"`
}
  1. The user writes field: false. This is unmarshalled as Field: *false. Therefore b != nil, so we return b, thus keeping the user provided value without ever considering what from (the cloud provider value) was set to.
  2. The user writes field: true. Same as above - it's not nil so we keep it.
  3. The user omits field from their spec. This is unmarshalled as Field: nil, so b == nil. We deem the user to have no opinion about the value of this field and proceed to late initialisation from the cloud provider value.
    1. If the provider value was true, we late initialize Field: *true.
    2. If the provider value was false, we return b, which is nil.

So actually in the case of *bool fields it seems like the alternative logic @muvaf proposed would actually be better:

func LateInitializeBool(b *bool, from bool) *bool {
	if b != nil {
		return b
	}
	return &from
}

With this logic if the cloud provider specified false and the user had no opinion (omitted the field) we'd late initialize the field to *false, rather than nil.

@negz
Copy link
Member

negz commented Feb 7, 2020

I think the two external issues relating to omitempty that were linked in this issue were red herrings, because they deal with fields that are slices (not pointer to slices).

@mogensen
Copy link

mogensen commented Jan 4, 2022

Another option would be to add a flag to the CRDs that disallow late init stuff?
We may need to validate that all late init fields has a value if this new flag is set.

This will allow us to have complete control using the Crossplane CRDs.

@stale
Copy link

stale bot commented Aug 13, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Aug 13, 2022
Copy link

github-actions bot commented Sep 5, 2024

Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as stale because it has had no activity in the last 90 days. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

@github-actions github-actions bot added the stale label Sep 5, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working stale wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

4 participants