Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hcp_vault_cluster resource changes for adding vault plugins #575

Closed
wants to merge 7 commits into from

Conversation

hashiblaum
Copy link
Member

@hashiblaum hashiblaum commented Aug 7, 2023

🛠️ Description

The terraform provider for HCP vault will be updated so new optional stanzas will be added to the 'hcp_vault_cluster' terraform resource, so plugins can be specified to be added when using terraform to provision HCP Vault clusters.

A new 'vault_plugin' resource will be created, which can be specified as a subresource which can be optionally added. This will allow for a list of vault_plugin resources to be specified to indicate the plugins a user wants to install on their cluster.

There will be validation for valid plugin_type. Validation is not done for plugin_name. This is because valid plugin names are stored in Launch darkly, which would require a cloud-sdk dependency in the provider, which I feel like would be discouraged as this is a public repo.

🏗️ Acceptance tests

  • Are there any feature flags that are required to use this functionality?
  • Have you added an acceptance test for the functionality being added?
  • Have you run the acceptance tests on this branch?

Output from acceptance testing:

During the acceptance test, I used two different plugin configs.

config1:

vault_plugin {
		plugin_type = "SECRET"
		plugin_name = "venafi-pki-backend"
	}

config 2:

	vault_plugin {
		plugin_type = "SECRET"
		plugin_name = "venafi-pki-backend"
	}
	vault_plugin {
		plugin_type = "DATABASE"
		plugin_name = "vault-plugin-database-oracle"
	}

Update 1: config 1 (one plugin added)
Update 2: config 2 (one new plugin added. idempotent plugin/add api called twice)
Update 3: config 1 (one plugin removed. )

The datadog API endpoints support that the plugins were being added/deleted at the correct intervals. I also confirmed on the database that the plugin_registration table matched the expected state.
Screenshot 2023-08-07 at 3 10 15 PM

Because only 1 customer managed plugin is available right now, new tests must be added in future when more plugins are available.

make testacc TESTARGS='-run=TestAccVaultClusterAWS'
==> Checking that code complies with gofmt requirements...
golangci-lint run --config ./golangci-config.yml 
TF_ACC=1 go test ./internal/... -v -run=TestAccVaultClusterAWS -timeout 360m -parallel=10
testing: warning: no tests to run
PASS
ok      github.com/hashicorp/terraform-provider-hcp/internal/clients    0.333s [no tests to run]
testing: warning: no tests to run
PASS
ok      github.com/hashicorp/terraform-provider-hcp/internal/consul     0.298s [no tests to run]
testing: warning: no tests to run
PASS
ok      github.com/hashicorp/terraform-provider-hcp/internal/input      0.288s [no tests to run]
=== RUN   TestAccVaultClusterAWS
=== PAUSE TestAccVaultClusterAWS
=== CONT  TestAccVaultClusterAWS
--- PASS: TestAccVaultClusterAWS (3173.02s)
PASS
ok      github.com/hashicorp/terraform-provider-hcp/internal/provider   3173.386s

...

@hashiblaum hashiblaum requested review from a team, tcrayford and himran92 August 7, 2023 19:13
@hashicorp-cla
Copy link

hashicorp-cla commented Aug 7, 2023

CLA assistant check
All committers have signed the CLA.

@hashiblaum hashiblaum requested a review from helenfufu August 7, 2023 19:14
Copy link
Contributor

@helenfufu helenfufu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple tiny comments plus a question about setVaultClusterResourceData. Rest of the changes make sense to me and testing looks 👍

docs/resources/vault_cluster.md Outdated Show resolved Hide resolved
internal/provider/resource_vault_cluster.go Outdated Show resolved Hide resolved
internal/provider/resource_vault_cluster.go Outdated Show resolved Hide resolved
@@ -307,6 +308,8 @@ func updateVaultPublicEndpointObservabilityDataAndMVU(t *testing.T, in *inputT)
resource.TestCheckResourceAttrSet(in.VaultClusterResourceName, "audit_log_config.0.datadog_api_key"),
resource.TestCheckResourceAttr(in.VaultClusterResourceName, "audit_log_config.0.datadog_region", "us1"),
resource.TestCheckResourceAttr(in.VaultClusterResourceName, "major_version_upgrade_config.0.upgrade_type", "MANUAL"),
resource.TestCheckResourceAttr(in.VaultClusterResourceName, "vault_plugin.0.plugin_name", "venafi-pki-backend"),
resource.TestCheckResourceAttr(in.VaultClusterResourceName, "vault_plugin.0.plugin_type", "SECRET"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed we're only asserting on the vault_plugin field on the second update. Could we add more assertions for the other updates and this one?

Update 1: config 1 (one plugin added)
Update 2: config 2 (one new plugin added. idempotent plugin/add api called twice)
Update 3: config 1 (one plugin removed. )

Like on update 1 that the first plugin is added, update 2 the second plugin is added (I think rn it's only asserting on the venafi SECRET one and not the oracle DATABASE), and update 3 that resource.TestCheckNoResourceAttr on the vault_plugin that was removed (similar to how we do it for metrics/audit logs when they are removed) + the one that wasn't removed is still set.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so that test case isn't included yet because there is only 1 customer managed plugin to work with. I did some hacky things to test 2 of them. Once more plugins are added, we can update the tests.

hafsa had the same confusion, sorry if this wasn't clear in pr description

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh gotcha. Suggestion: Could we maybe have update 3 use a config with no plugins at all and then assert resource.TestCheckNoResourceAttr so then acceptance tests run by anyone will test removal without needing the extra setup you did?

docs/data-sources/vault_cluster.md Show resolved Hide resolved
docs/resources/vault_cluster.md Show resolved Hide resolved
Description: "The type of the plugin - Valid options for plugin type - 'SECRET', 'AUTH', 'DATABASE'",
Type: schema.TypeString,
Required: true,
ValidateDiagFunc: validateVaultPluginType,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding comment to explain why vault plugin type is validated on both create & update & plugin name is only validated on update

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also have a relative question in another comment on how to do plugin name validation on create :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can definitely update comments to make this more clear

answered question in other thread : )

return diag.FromErr(err)
}

newPluginConfig, diagErr := getPluginConfig(d, plugins.Plugins)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little unclear in start of why we are passing listed cluster plugins here to ensure that new plugin is part of cluster plugin. But then realized that listed cluster plugin can have all valid plugins which may be not registered by the cluster too (the response isRegistered will tell you which are part of the cluster yet). So this way you ensure that the new plugin is at least a right plugin name.
If my understanding is correct then can we highlight this point in a comment here + also add a comment on top of ListPlugins of its response having both registered/non-registered thus all plugins.

Thinking how we can have a similar plugin name validation on create time too. Does vault service has an API to expose the plugin name to UI for opting in/out on create? if yes, can we use that to validate here to make both flows similar?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I can add comment, that would be very helpful

Vault service only has plugin APIs on the cluster level, so a cluster resource must exist on vault-service end first. (which is why we cant do the validation).

The options I see are:

  1. Hard-coded validation. (personally not a fan , but can be ok if we aren't adding plugins frequently)
  2. new vault-service API to serve plugins from feature flag for the LD rule of that organization.
  3. What we are doing now where the terraform will fail when the plugin fails to add. (I believe the cluster will still be running and it should be doable for the user to update their terraform resource)
  4. Making new vault_plugin resource that is not nested in hcp_vault_cluster, so it can fail independently of cluster creation and it requires an hcp_vault_cluster resource

Copy link

@himran92 himran92 Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. new vault-service API to serve plugins from feature flag for the LD rule of that organization.

I thought this already existed for HCP Portal but realizing may be portal is reading the values directly from FF?

  1. What we are doing now where the terraform will fail when the plugin fails to add. (I believe the cluster will still be running and it should be doable for the user to update their terraform resource)

Hmm does this cause a concern in terraform state where cluster exists though the plugins given in input of create did not apply as they failed due to name being invalid.

Copy link
Contributor

@helenfufu helenfufu Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like 3 (current) is probably alright for now given that's the behavior on the UI - like you have to have created the cluster first before you can add plugins, and if that flow fails, your cluster is still running. In the HCP TFP these two steps are just chained through a single create call.

it should be doable for the user to update their terraform resource

Yup, users should be able to update the state using terraform refresh if it fails. (Not sure if you're aware, but if you wanted, you can test this and any real creates/updates in prod with your changes in local binary following this guide.)

I think long-term 2 or 4 would make more sense, 2 being an easier change and 4 being a more difficult one + maybe separating out other resources for consistency too.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 4 -> There was a FR in talks in another channel where the user was showing the desire to manage the state of configuration like observability, separate than cluster resource in terraform. So it may be that in some long term future we actually do 4 option for all major configurations.

For now, agree that this flow seems fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. new vault-service API to serve plugins from feature flag for the LD rule of that organization.

I thought this already existed for HCP Portal but realizing may be portal is reading the values directly from FF?

The HCP portal API does use the FF, but the API also requires a valid cluster_id (which is why it can't be called before create).

going to stick with #3 for now.

Copy link
Contributor

@helenfufu helenfufu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing all comments - approving since

  • we all seem to agree that the late create validation is acceptable for now and we can revisit down the line
  • I just have one remaining test suggestion that I think is worth doing to make sure removal is tested on every acceptance test without special setup, but imo not blocking

@hashiblaum hashiblaum closed this Aug 21, 2023
@hashiblaum
Copy link
Member Author

closing due to making new terraform hcp_vault_cluster resource in #579

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants