-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hcp_vault_cluster resource changes for adding vault plugins #575
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple tiny comments plus a question about setVaultClusterResourceData
. Rest of the changes make sense to me and testing looks 👍
@@ -307,6 +308,8 @@ func updateVaultPublicEndpointObservabilityDataAndMVU(t *testing.T, in *inputT) | |||
resource.TestCheckResourceAttrSet(in.VaultClusterResourceName, "audit_log_config.0.datadog_api_key"), | |||
resource.TestCheckResourceAttr(in.VaultClusterResourceName, "audit_log_config.0.datadog_region", "us1"), | |||
resource.TestCheckResourceAttr(in.VaultClusterResourceName, "major_version_upgrade_config.0.upgrade_type", "MANUAL"), | |||
resource.TestCheckResourceAttr(in.VaultClusterResourceName, "vault_plugin.0.plugin_name", "venafi-pki-backend"), | |||
resource.TestCheckResourceAttr(in.VaultClusterResourceName, "vault_plugin.0.plugin_type", "SECRET"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed we're only asserting on the vault_plugin
field on the second update. Could we add more assertions for the other updates and this one?
Update 1: config 1 (one plugin added)
Update 2: config 2 (one new plugin added. idempotent plugin/add api called twice)
Update 3: config 1 (one plugin removed. )
Like on update 1 that the first plugin is added, update 2 the second plugin is added (I think rn it's only asserting on the venafi SECRET
one and not the oracle DATABASE
), and update 3 that resource.TestCheckNoResourceAttr
on the vault_plugin
that was removed (similar to how we do it for metrics/audit logs when they are removed) + the one that wasn't removed is still set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so that test case isn't included yet because there is only 1 customer managed plugin to work with. I did some hacky things to test 2 of them. Once more plugins are added, we can update the tests.
hafsa had the same confusion, sorry if this wasn't clear in pr description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh gotcha. Suggestion: Could we maybe have update 3 use a config with no plugins at all and then assert resource.TestCheckNoResourceAttr
so then acceptance tests run by anyone will test removal without needing the extra setup you did?
Description: "The type of the plugin - Valid options for plugin type - 'SECRET', 'AUTH', 'DATABASE'", | ||
Type: schema.TypeString, | ||
Required: true, | ||
ValidateDiagFunc: validateVaultPluginType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding comment to explain why vault plugin type is validated on both create & update & plugin name is only validated on update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also have a relative question in another comment on how to do plugin name validation on create
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can definitely update comments to make this more clear
answered question in other thread : )
return diag.FromErr(err) | ||
} | ||
|
||
newPluginConfig, diagErr := getPluginConfig(d, plugins.Plugins) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a little unclear in start of why we are passing listed cluster plugins here to ensure that new plugin is part of cluster plugin. But then realized that listed cluster plugin can have all valid plugins which may be not registered by the cluster too (the response isRegistered will tell you which are part of the cluster yet). So this way you ensure that the new plugin is at least a right plugin name.
If my understanding is correct then can we highlight this point in a comment here + also add a comment on top of ListPlugins of its response having both registered/non-registered thus all plugins.
Thinking how we can have a similar plugin name validation on create time too. Does vault service has an API to expose the plugin name to UI for opting in/out on create? if yes, can we use that to validate here to make both flows similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I can add comment, that would be very helpful
Vault service only has plugin APIs on the cluster level, so a cluster resource must exist on vault-service end first. (which is why we cant do the validation).
The options I see are:
- Hard-coded validation. (personally not a fan , but can be ok if we aren't adding plugins frequently)
- new vault-service API to serve plugins from feature flag for the LD rule of that organization.
- What we are doing now where the terraform will fail when the plugin fails to add. (I believe the cluster will still be running and it should be doable for the user to update their terraform resource)
- Making new
vault_plugin
resource that is not nested inhcp_vault_cluster
, so it can fail independently of cluster creation and it requires anhcp_vault_cluster
resource
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- new vault-service API to serve plugins from feature flag for the LD rule of that organization.
I thought this already existed for HCP Portal but realizing may be portal is reading the values directly from FF?
- What we are doing now where the terraform will fail when the plugin fails to add. (I believe the cluster will still be running and it should be doable for the user to update their terraform resource)
Hmm does this cause a concern in terraform state where cluster exists though the plugins given in input of create did not apply as they failed due to name being invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like 3 (current) is probably alright for now given that's the behavior on the UI - like you have to have created the cluster first before you can add plugins, and if that flow fails, your cluster is still running. In the HCP TFP these two steps are just chained through a single create call.
it should be doable for the user to update their terraform resource
Yup, users should be able to update the state using terraform refresh
if it fails. (Not sure if you're aware, but if you wanted, you can test this and any real creates/updates in prod with your changes in local binary following this guide.)
I think long-term 2 or 4 would make more sense, 2 being an easier change and 4 being a more difficult one + maybe separating out other resources for consistency too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For 4 -> There was a FR in talks in another channel where the user was showing the desire to manage the state of configuration like observability, separate than cluster resource in terraform. So it may be that in some long term future we actually do 4 option for all major configurations.
For now, agree that this flow seems fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- new vault-service API to serve plugins from feature flag for the LD rule of that organization.
I thought this already existed for HCP Portal but realizing may be portal is reading the values directly from FF?
The HCP portal API does use the FF, but the API also requires a valid cluster_id (which is why it can't be called before create).
going to stick with #3 for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing all comments - approving since
- we all seem to agree that the late create validation is acceptable for now and we can revisit down the line
- I just have one remaining test suggestion that I think is worth doing to make sure removal is tested on every acceptance test without special setup, but imo not blocking
closing due to making new terraform |
🛠️ Description
The terraform provider for HCP vault will be updated so new optional stanzas will be added to the 'hcp_vault_cluster' terraform resource, so plugins can be specified to be added when using terraform to provision HCP Vault clusters.
A new 'vault_plugin' resource will be created, which can be specified as a subresource which can be optionally added. This will allow for a list of vault_plugin resources to be specified to indicate the plugins a user wants to install on their cluster.
There will be validation for valid plugin_type. Validation is not done for plugin_name. This is because valid plugin names are stored in Launch darkly, which would require a cloud-sdk dependency in the provider, which I feel like would be discouraged as this is a public repo.
🏗️ Acceptance tests
Output from acceptance testing:
During the acceptance test, I used two different plugin configs.
config1:
config 2:
Update 1: config 1 (one plugin added)
Update 2: config 2 (one new plugin added. idempotent plugin/add api called twice)
Update 3: config 1 (one plugin removed. )
The datadog API endpoints support that the plugins were being added/deleted at the correct intervals. I also confirmed on the database that the plugin_registration table matched the expected state.
Because only 1 customer managed plugin is available right now, new tests must be added in future when more plugins are available.