Skip to content
This repository has been archived by the owner on May 6, 2022. It is now read-only.

Use generation instead of checksum for Instances and InstanceCredentials #1151

Merged

Conversation

staebler
Copy link
Contributor

With #1145, this completes the work for replacing checksums with generations as detailed in #1095.

Once this and #1145 are merged, we can delete the pkg/apis/servicecatalog/checksum directory.

There is some commented-out code for dealing with incrementing the generation on spec changes. We currently do not support any spec changes for Instances or InstanceCredentials. However, we will have some spec changes very soon for supporting originating identity headers.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 22, 2017
@@ -555,6 +553,11 @@ func setServiceInstanceCredentialConditionInternal(toUpdate *v1alpha1.ServiceIns

glog.V(5).Infof("Setting ServiceInstanceCredential '%v/%v' condition %q to %v", toUpdate.Namespace, toUpdate.Name, conditionType, status)

// Set status.ReconciledGeneration if updating ready condition to true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the controller should set this at the site where it determines that the work is done. I realize you were following my example from #1145, but that is not a good example :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that I am following what you are suggesting here. I moved the code to set the ReconciledGeneration out of the setXXXCondition functions and into the place where the setXXXCondition functions were being called. I hope that is what you were looking for.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 28, 2017
@pmorie
Copy link
Contributor

pmorie commented Aug 29, 2017

@staebler needs rebase

@staebler staebler force-pushed the 1095-generation-in-checksum-out branch from f0a5559 to 7b17f40 Compare August 29, 2017 19:23
@pmorie
Copy link
Contributor

pmorie commented Aug 29, 2017

This PR implements parity with what we currently have in the controller using checksum. As a follow-up, we will need to do #1157

Copy link
Contributor

@pmorie pmorie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, some nits.

@@ -320,6 +318,11 @@ func (c *controller) reconcileServiceInstanceCredential(binding *v1alpha1.Servic
c.recorder.Event(binding, api.EventTypeWarning, errorInjectingBindResultReason, s)
return err
}

// Create/Update for InstanceCredential has completed successful, so set Status.ReconciledGeneration to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you wrap this comment at 80 columns?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update is not applicable here - how about 'The bind operation completed successfully"

@@ -889,9 +888,6 @@ func TestReconcileServiceInstanceCredentialDeleteFailedServiceInstanceCredential
binding.ObjectMeta.DeletionTimestamp = &metav1.Time{}
binding.ObjectMeta.Finalizers = []string{v1alpha1.FinalizerServiceCatalog}

checksum := checksum.ServiceInstanceCredentialSpecChecksum(binding.Spec)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to set generation and reconciled generation here, or are they both zero? I would prefer to explicitly set them both to one.

//
// We only do this if the deletion timestamp is nil, because the deletion
// timestamp changes the object's state in a way that we must reconcile,
// but does not affect the checksum.
// but does not affect the generation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you establish whether the generation/userinfo are bumped during a graceful-delete?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generation is not bumped during a delete. PrepareForUpdate is not called.

@@ -718,6 +721,10 @@ func (c *controller) pollServiceInstance(serviceClass *v1alpha1.ServiceClass, se
c.recorder.Event(instance, api.EventTypeNormal, successDeprovisionReason, successDeprovisionMessage)
glog.V(5).Infof("Successfully deprovisioned ServiceInstance %v/%v of ServiceClass %v at ServiceBroker %v", instance.Namespace, instance.Name, serviceClass.Name, brokerName)
} else {
// Create/Update for InstanceCredential has completed successful, so set Status.ReconciledGeneration to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrap comments at 80 chars

@@ -19,6 +19,7 @@ package instance
// this was copied from where else and edited to fit our objects

import (
//apiequality "k8s.io/apimachinery/pkg/api/equality"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little surprised that one of the many nag-tools we run on this repo didn't barf on this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment to explain why this is commented out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have uncommented the code that compares the old and new spec, even though it will always evaluate to true.

Copy link
Contributor

@pmorie pmorie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, 2 nits and looks like 1 unintentional change was committed.

@@ -319,8 +319,8 @@ func (c *controller) reconcileServiceInstanceCredential(binding *v1alpha1.Servic
return err
}

// Create/Update for InstanceCredential has completed successful, so set Status.ReconciledGeneration to the
// Generation used.
// The bind operation completed successful, so set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: successfully

@@ -545,8 +545,8 @@ func (c *controller) reconcileServiceInstance(instance *v1alpha1.ServiceInstance
} else {
glog.V(5).Infof("Successfully provisioned ServiceInstance %v/%v of ServiceClass %v at ServiceBroker %v: response: %+v", instance.Namespace, instance.Name, serviceClass.Name, brokerName, response)

// Create/Update for Instance has completed successful, so set Status.ReconciledGeneration to the
// Generation used.
// Create/Update for Instance has completed successful, so set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

successfully

@@ -114,6 +116,8 @@ func (instanceRESTStrategy) AllowUnconditionalUpdate() bool {
}

func (instanceRESTStrategy) PrepareForUpdate(ctx genericapirequest.Context, new, old runtime.Object) {
fmt.Printf("Instance PrepareForUpdate")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentional?

@pmorie pmorie added the LGTM1 label Aug 30, 2017
// reconciled against the broker.
Checksum *string
// ReconciledGeneration is the generation of the instance that was last
// successfully reconciled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will it be clear to everyone that "reconciled" means "the controller is done doing its job regardless of whether it was success or not" ? Or will some people think it means "only successfully processed"? I kind of think we may want to make it clear for newbies:

// ReconciledGeneration is the 'Generation' of the instanceSpec that was last
// processed by the controller - regardless of whether the processing was
// success or not.

Copy link
Contributor

@arschles arschles Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be clear. If someone is in the codebase, they should understand what reconciliation is. Particularly in this case, that it means that processing may not have succeeded from the end-user's point of view

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like @duglin's comment here. Status is for users, not just programmers. Ultimately we'll be generating docs from these comments, so we should make sure things are clear to users. I imagine this will also be more clear once we have a status field to show the current state that the broker knows about.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have been more clear, I think "successfully reconciled" will be ambiguous.

// reconciled against the broker.
Checksum *string
// ReconciledGeneration is the generation of the ServiceInstanceCredential that was last
// successfully reconciled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

// reconciled against the broker.
Checksum *string `json:"checksum,omitempty"`
// ReconciledGeneration is the generation of the instance that was last
// successfully reconciled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Checksum *string `json:"checksum,omitempty"`
// ReconciledGeneration is the generation of the ServiceInstanceCredential that was last
// successfully reconciled.
ReconciledGeneration int64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto**2

bindingChecksum := checksum.ServiceInstanceCredentialSpecChecksum(binding.Spec)
if bindingChecksum == *binding.Status.Checksum {
// but does not affect the generation.
if binding.Status.ReconciledGeneration != 0 && binding.DeletionTimestamp == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it matter if ReconciledGeneration is zero? Seems like we should only care about the deletion timestamp, no?

@@ -169,7 +168,7 @@ func (c *controller) reconcileServiceInstanceDelete(instance *v1alpha1.ServiceIn
//
// TODO: the above logic changes slightly once we handle orphan
// mitigation.
if !instance.Status.AsyncOpInProgress && instance.Status.Checksum == nil {
if !instance.Status.AsyncOpInProgress && instance.Status.ReconciledGeneration == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it being set to zero matter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ReconciledGeneration is 0, then this is a ServiceInstance that has not yet been provisioned. Because it has not been provisioned, there is no need to send a DeprovisionRequest to the broker.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, what if it was provisioned but we didn't get the response. I'm pretty sure we'll be in orphan mitigation stuff, but when we delete the object does it stop orphan mitigation or let it finish? Also, who added the finalizerToken and do we need to remove it? Just want to make sure we're not leaving stuff around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The finalizer token is added by the instance REST strategy in PrepareForCreate. As far as I can glean, deleting the finalizer indicates that service-catalog no longer has any use for the resource. If the controller does not delete the finalizer, than the resource will remain longer than it should as the registry will think that service-catalog is still using the resource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the existing code, the checksum is only set when the status of the resource is updated with a Ready condition of true. That happens only when the Controller receives a success response from the Broker. So, even under the existing code, there is the potential gap when the Broker accepted the ProvisionRequest but the Controller did not receive the success response. I agree that that is a possibility and deserves an issue. Perhaps that is something that will be covered by orphan mitigation. I do not think that it is within the scope of this issue, however.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: Ready condition - It is not my understanding that the ReconciledGeneration will get set if there was a communication error talking to the Broker. The ReconciledGeneration is only set when (1) the Broker responded with a success, (2) the Broker responded with a failure indicating that the request will not succeed, or (3) the retry duration has expired.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if we don't want to delete the finalizer because more stuff is going on then shouldn't the check be for Generation != ReconciledGeneration, not ReconciledGeneration !=0 ? It just seems to me that the zero vs non-zero value here isn't the important thing, but perhaps I'm still just confused :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once DeletionTimestamp is not nil, the Controller does not care at all about the comparison between Generation and ReconciledGeneration. The comparison between the two is only relevant in the context of Provision and UpdateInstance. We are not postponing the deletion of the finalizer because there are more Provision or UpdateInstance tasks to do. We are postponing the deletion of the finalizer because there is a Deprovision task to do. In the case where ReconciledGeneration is zero, there is no Deprovision task to do because the Provision task was never completed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The finalizer token is added by the instance REST strategy in PrepareForCreate

What if users add their own finalizers though? It's common for Kubernetes core, don't we support this in Service Catalog? Probably the more correct would be removing only "our" finalizers.

In the case where ReconciledGeneration is zero, there is no Deprovision task to do because the Provision task was never completed.

There could be a cleanup (orphan mitigation) stuff to do (as @duglin already pointed out), and I would at least leave a more clear TODO comment for this there, if it is out of scope of this PR.

P.S. I understand that these issues are already present in master, and you're just replacing checksum with generation. So these questions are out of scope of this PR, and not blocking it from being merged, agreed.

Copy link
Contributor Author

@staebler staebler Sep 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if users add their own finalizers though? It's common for Kubernetes core, don't we support this in Service Catalog? Probably the more correct would be removing only "our" finalizers.

The Controller is only removing the finalizer that Service Catalog added. It is leaving the other finalizers untouched on the resource.

if instance.Status.Checksum != nil && instance.DeletionTimestamp == nil {
instanceChecksum := checksum.ServiceInstanceSpecChecksum(instance.Spec)
if instanceChecksum == *instance.Status.Checksum {
if instance.Status.ReconciledGeneration != 0 && instance.DeletionTimestamp == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, not sure why we're checking ReconciledGeneration here.

glog.V(4).Infof(
"Not processing event for ServiceInstance %v/%v because checksum showed there is no work to do",
"Not processing event for ServiceInstance %v/%v because reconciled generation showed there is no work to do",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on each of the cases below where we stop processing and return err do we need to set ReconciledGeneration to Generation because we're stopped processing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Controller does not give up on processing the resource when we return err. The Controller will retry the reconciliation later. We only stop processing the resource when we return nil. There is no logic currently in the code to handle when we stop retrying. That will need to be done as part of the work for Limits of Work [1].

[1] #1157

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with leaving it for another PR, but it seems like there are errors that should cause a retry and errors that will never work and we know will never work. Not sure we should be trying in those latter cases.

Copy link
Contributor

@arschles arschles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM. I had a few nits and that's it.

@@ -812,10 +810,9 @@ func TestReconcileServiceInstanceDeleteAsynchronous(t *testing.T) {
instance := getTestServiceInstance()
instance.ObjectMeta.DeletionTimestamp = &metav1.Time{}
instance.ObjectMeta.Finalizers = []string{v1alpha1.FinalizerServiceCatalog}
// we only invoke the broker client to deprovision if we have a checksum set
// we only invoke the broker client to deprovision if we have a recondiled generation set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo. s/recondiled/reconciled

@@ -880,10 +877,9 @@ func TestReconcileServiceInstanceDeleteFailedInstance(t *testing.T) {
instance.ObjectMeta.DeletionTimestamp = &metav1.Time{}
instance.ObjectMeta.Finalizers = []string{v1alpha1.FinalizerServiceCatalog}

// we only invoke the broker client to deprovision if we have a checksum set
// we only invoke the broker client to deprovision if we have a recondiled generation set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/recondiled/reconciled

@@ -19,6 +19,7 @@ package binding
// this was copied from where else and edited to fit our objects

import (
//apiequality "k8s.io/apimachinery/pkg/api/equality"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you document why this is commented out?

@@ -19,6 +19,7 @@ package instance
// this was copied from where else and edited to fit our objects

import (
//apiequality "k8s.io/apimachinery/pkg/api/equality"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment to explain why this is commented out?

// proper validation of allowed changes needs to be implemented in
// ValidateUpdate. Also, the check for whether the generation needs
// to be updated needs to be un-commented.
newServiceInstanceCredential.Spec = oldServiceInstanceCredential.Spec
Copy link
Contributor

@nilebox nilebox Aug 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we throw some error if the spec has changed instead of silently ignoring any changes to the spec?
This behavior can be confusing for users, even though it's temporary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment here: #1151 (comment)

Copy link
Contributor

@nilebox nilebox Aug 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we can return errors in the ValidateUpdate method though (below).

	// ValidateUpdate is invoked after default fields in the object have been
	// filled in before the object is persisted.  This method should not mutate
	// the object.
	ValidateUpdate(ctx genericapirequest.Context, obj, old runtime.Object) field.ErrorList

Both new and old objects are passed there, so we can compare specs and return error if they are not identical.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elsewhere in Kubernetes, we ignore changes to fields that cannot be changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1182

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough

@pmorie
Copy link
Contributor

pmorie commented Aug 31, 2017 via email

@@ -122,14 +123,24 @@ func (instanceRESTStrategy) PrepareForUpdate(ctx genericapirequest.Context, new,
glog.Fatal("received a non-instance object to update from")
}

// Do not allow any updates to the Status field while updating the Spec
newServiceInstance.Status = oldServiceInstance.Status
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this added due to this PR or was it just an oversight that it wasn't here before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was moved a few lines up from where it was before: no functionality was changed. It was moved originally so that the line preventing the spec change immediately preceded the commented-out lines that were unnecessary because any spec changes are discarded. Since that code is no longer commented-out, then this code movement is not as important any more, but it is still not harmful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I missed that it was moved - sorry

@duglin
Copy link
Contributor

duglin commented Sep 1, 2017

LGTM

@duglin duglin added the LGTM2 label Sep 1, 2017
@duglin
Copy link
Contributor

duglin commented Sep 1, 2017

Wouldn't mind one more set of eyes on this one

@pmorie pmorie merged commit 28c0ae7 into kubernetes-retired:master Sep 5, 2017
@pmorie pmorie added this to the 0.0.18 milestone Sep 7, 2017
@staebler staebler deleted the 1095-generation-in-checksum-out branch September 7, 2017 20:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. LGTM1 LGTM2 size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants