ClusterPool-Inventory enhancement Status.Inventory #1687
Conversation
I still feel this is a bit of a stretch, storing status in `cluster.status.inventory`. Let's list a couple of the problems that culminated in this design:
- OpenStack PSI installations do not succeed on the first attempt but require repeated attempts to succeed.
  - I think we should mark ClusterDeploymentCustomization as `Invalid` only when the JSON patch process fails. If the cluster installation fails after patching, we can maintain a field called `LastFailedApplyTime` (needs a better name) on the customization, and if the last install with this customization recently failed, the clusterpool controller should move on to the next customization. This way we are not stuck failing on a single customization for a long time. The `LastFailedApplyTime` should be cleared when we update the customization.
- It might happen that for a specific clusterpool with its custom install config template, the customization will never work on the cloud - in short, a bad patch.
  - This should be a no-op on the customization side, and the user should be able to figure out the issue from the ClusterDeployment condition. This is practically an issue with the install config. The clusterpool controller will ignore the customization until the cool-off period is reached (calculated from `LastFailedApplyTime`) and then try again (a sketch of this skip logic follows this comment).
- What if the inventory resource is missing?
  - The ClusterDeployment should use the default install config, and a status condition should be added to the clusterpool resource.
@2uasimojo Any inputs from you here?
- For the first ClusterDeploymentCustomization that is available, it will use the patches in the `spec.installConfigPatches` field to update the default install config generated by clusterpool. The patches will be applied in the order listed in the inventory. It will also update the status with reference to the cluster deployment that is using it and update the `Available` condition to false.
- Set the `spec.clusterPoolReference.ClusterDeploymentCustomizationRef` field in the ClusterDeployment with a reference to ClusterDeploymentCustomization CR.
- For each reference it will do a GET to fetch ClusterDeploymentCustomization and check:
  - if the entry status is `Available` and `ClusterDeployment` reference is nil.
This statement looks ambiguous. Can we have something that mentions these fields are on clusterpool.status?
The status in `clusterpool.status.inventory` is checked before doing a GET to fetch the ClusterDeploymentCustomization.
Add a field to `ClusterPool.Stauts` called `Inventory`.

Inventory resources might be used by multiple ClusterPools and the results can vary. For example, the resource might be broken due to a mismatch with the ClusterPool specification or with the cloud environment. Inventory resource status is tracked in the ClusterPool's `Status.Inventory.<resource name>` property, which contains the following:
- `Version` - Inventory resoruce last seen version.
Suggested change:
- `Version` - Inventory resource last seen version.
When ClusterDeployment is deleted, ClusterPool controller will validate that `Status.Inventory` is correct and update inventory resource `Available` condition to `true` and `spec.clusterDeploymentRef` to `nil`.

### `ClusterPool.Status.Inventory`
Add a field to `ClusterPool.Stauts` called `Inventory`.
Suggested change:
Add a field to `ClusterPool.Status` called `Inventory`.
- For the first ClusterDeploymentCustomization that is available, it will use the patches in the `spec.installConfigPatches` field to update the default install config generated by clusterpool. The patches will be applied in the order listed in the inventory.
- It will also update the status of all related resources:
  - The entry status in `ClusterPool.Status.Inventory` field with status `Reserved` and reference to `ClusterDeployment`.
  - Created ClusterDeployment `spec.ClusterPoolReference.ClusterDeploymentCustomizationRef` with reference to the inventory resource.
Suggested change:
- Created ClusterDeployment will have the field `spec.ClusterPoolReference.ClusterDeploymentCustomizationRef` populated with a reference to the inventory resource.
When ClusterDeployment is deleted, ClusterPool controller will validate that `Status.Inventory` is correct and update inventory resource `Available` condition to `true` and `spec.clusterDeploymentRef` to `nil`.
We also need to mention that `clusterpool.status.inventory` will be updated with the right status.
I think that having a custom resource for cluster deployment customization is the right thing to do. The status needs to be stored somewhere, so the only alternative I can think of is having another custom resource and controller that will handle this process.
Is the entire inventory missing, or just a specific resource? If an inventory is defined, then the install config must be customized, because the user expects that the default install config is not good without customization.
My initial thought is that, while I like the ideas behind these changes, I would like to see them adjusted to conform better to a) k8s best practices, and b) design principles elsewhere in the hive project. For example:
I'm starting from the end because it helps me build my argument:
Yes, I forgot to mention this in the enhancement; will do.
Sounds good to me
That's what I think I did in the implementation (hash calculation of the spec), see #1672. But the version is monitored by the ClusterPool. Here I'm struggling with the request to move most of the monitoring to the CDC. The version must be in the ClusterPool, because a user can delete and recreate the same CDC; in that case the attempts will reset, but the version is technically the same.
I liked @akhil-rane's suggestions on using
This option is not user-friendly, as one would be in the awkward position of trying to catch a window to update the CDC, or taking on the overhead of disabling it in all ClusterPools first.
"this makes it tough to correlate and debug" - I agree but I believe that UX is more important and I think that this option corrolates better with "Operator" idea of reconciling.
I'm struggling with the idea of
I partially agree on this point. I don't like duplication of data like the state of
Agree
@abraverm Do you intend to open a new PR with the design changes we discussed in Slack?
Yes, this PR's design is fundamentally different from the one discussed in Slack.
x-ref: https://issues.redhat.com/browse/HIVE-1367