`azurerm_machine_learning_compute_cluster` - Add support for update identity #26404
This feature is a must-have for me. Today, when a new user joins an AAD group, I have that user create a user managed identity, which I then assign to a list of compute clusters. That forces a recreation of each compute cluster, and if a compute cluster was in use, it interrupts the Job/Schedule and recreates it. So we have to wait until the end of the day, when people aren't using the compute clusters, to deploy a new user on our platform. (The user managed identity is also used for other resources, such as Databricks.)
Thanks @xuzhang3. Would you mind looking at the comments left in-line?
	return err
}

workspace, err := mlWorkspacesClient.Get(ctx, *workspaceID)
Why do we need to get the workspace here? Can't we retrieve all the info on the Compute Cluster by calling `client.ComputeGet`?
The SKU used by the compute cluster resource is the SKU of the workspace, and the compute GET API does not return the SKU; it returns nil instead.
That sounds like a bug we should raise on the Rest API Spec, can you please open one?
future, err := client.ComputeCreateOrUpdate(ctx, *id, computeClusterParameters)
if err != nil {
	return fmt.Errorf("creating %s: %+v", id, err)
}
if err := future.Poller.PollUntilDone(ctx); err != nil {
	return fmt.Errorf("waiting for creation of %s: %+v", id, err)
}
Suggested change:
- future, err := client.ComputeCreateOrUpdate(ctx, *id, computeClusterParameters)
- if err != nil {
- 	return fmt.Errorf("creating %s: %+v", id, err)
- }
- if err := future.Poller.PollUntilDone(ctx); err != nil {
- 	return fmt.Errorf("waiting for creation of %s: %+v", id, err)
- }
+ if err := client.ComputeCreateOrUpdateThenPoll(ctx, *id, computeClusterParameters); err != nil {
+ 	return fmt.Errorf("updating %s: %+v", id, err)
+ }
vmPriority := machinelearningcomputes.VMPriority(d.Get("vm_priority").(string))
computeClusterAmlComputeProperties := machinelearningcomputes.AmlComputeProperties{
	VMSize:                 utils.String(d.Get("vm_size").(string)),
	VMPriority:             &vmPriority,
	ScaleSettings:          expandScaleSettings(d.Get("scale_settings").([]interface{})),
	UserAccountCredentials: expandUserAccountCredentials(d.Get("ssh").([]interface{})),
	EnableNodePublicIP:     pointer.To(d.Get("node_public_ip_enabled").(bool)),
}

computeClusterAmlComputeProperties.RemoteLoginPortPublicAccess = pointer.To(machinelearningcomputes.RemoteLoginPortPublicAccessDisabled)
if d.Get("ssh_public_access_enabled").(bool) {
	computeClusterAmlComputeProperties.RemoteLoginPortPublicAccess = pointer.To(machinelearningcomputes.RemoteLoginPortPublicAccessEnabled)
}

if subnetId, ok := d.GetOk("subnet_resource_id"); ok && subnetId.(string) != "" {
	computeClusterAmlComputeProperties.Subnet = &machinelearningcomputes.ResourceId{Id: subnetId.(string)}
}

// NOTE: The 'AmlCompute' 'ComputeLocation' field should always point
// to configuration files 'location' field...
computeClusterProperties := machinelearningcomputes.AmlCompute{
	Properties:       &computeClusterAmlComputeProperties,
	ComputeLocation:  utils.String(d.Get("location").(string)),
	Description:      utils.String(d.Get("description").(string)),
	DisableLocalAuth: utils.Bool(!d.Get("local_auth_enabled").(bool)),
}

// NOTE: The 'ComputeResource' 'Location' field should always point
// to the workspace's 'location'...
computeClusterParameters := machinelearningcomputes.ComputeResource{
	Properties: computeClusterProperties,
	Identity:   identity,
	Location:   workspaceModel.Location,
	Tags:       tags.Expand(d.Get("tags").(map[string]interface{})),
	Sku: &machinelearningcomputes.Sku{
		Name: workspaceModel.Sku.Name,
		Tier: pointer.To(machinelearningcomputes.SkuTier(*workspaceModel.Sku.Tier)),
	},
}
Shouldn't we still be able to retrieve the existing Compute Cluster and patch the SKU from the workspace into the model, instead of having to set everything from the config as we do in the create?
Or can we try updating without the SKU? If this works, we don't need to get the workspace in update.
Sure, try it
Tried it: the update works without the SKU. The API doc also states that the SKU is the SKU of the workspace (https://learn.microsoft.com/en-us/rest/api/azureml/compute/create-or-update?view=rest-azureml-2024-04-01&tabs=HTTP#request-body). So we can either ignore it or get it from the workspace. Do we need to remove it in update?
If you can update successfully without sending anything for the SKU (and it doesn't change the SKU) then it's fine to omit getting it from the workspace.
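To make "not sending anything for the SKU" concrete, here is a small sketch of how a nil pointer field tagged `omitempty` disappears from the JSON request body. The `ComputeResource` and `Sku` types below are simplified, hypothetical shapes written for this example; whether the real SDK model carries the same tags is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sku is a simplified, hypothetical shape, not the SDK's real model.
type Sku struct {
	Name string `json:"name"`
}

// ComputeResource is a simplified, hypothetical payload type.
type ComputeResource struct {
	Location string            `json:"location,omitempty"`
	Sku      *Sku              `json:"sku,omitempty"`
	Tags     map[string]string `json:"tags,omitempty"`
}

// marshalPayload returns the JSON body that would be sent for the given resource.
func marshalPayload(r ComputeResource) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", fmt.Errorf("marshalling payload: %w", err)
	}
	return string(b), nil
}

func main() {
	withSku, _ := marshalPayload(ComputeResource{Location: "westeurope", Sku: &Sku{Name: "Basic"}})
	withoutSku, _ := marshalPayload(ComputeResource{Location: "westeurope"})
	fmt.Println(withSku)    // {"location":"westeurope","sku":{"name":"Basic"}}
	fmt.Println(withoutSku) // {"location":"westeurope"}
}
```

When the field is left nil, no `sku` key reaches the API at all, so the service is free to keep whatever SKU it already associates with the workspace.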
This PR is expected to fix #25883
Please take a look at the comment left in-line. Furthermore, the test isn't properly testing the update of `identity`.
The first update triggers a `ForceNew` on the resource because the property `local_auth_enabled` has changed. Can you please make sure the test configuration for this update test does not change any `ForceNew` properties?
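One way to reason about the problem described above is to compare two test steps and list which changed properties are marked `ForceNew`. The sketch below does that with plain maps; the `schema` map is a hand-written illustration limited to the two properties named in this thread, and `changedForceNew` is a hypothetical helper, not part of the provider or its test framework.

```go
package main

import (
	"fmt"
	"sort"
)

// changedForceNew returns the sorted names of ForceNew properties whose
// values differ between two test steps. schema maps property name -> ForceNew.
func changedForceNew(schema map[string]bool, before, after map[string]string) []string {
	var out []string
	for name, isForceNew := range schema {
		if isForceNew && before[name] != after[name] {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Illustrative schema subset: only what the review thread states.
	schema := map[string]bool{
		"local_auth_enabled": true,  // changing this recreates the resource
		"identity":           false, // updatable in place after this PR
	}
	step1 := map[string]string{"local_auth_enabled": "false", "identity": "SystemAssigned"}

	// A second step that also flips local_auth_enabled recreates the cluster,
	// so the identity update path is never exercised.
	badStep2 := map[string]string{"local_auth_enabled": "true", "identity": "UserAssigned"}
	fmt.Println(changedForceNew(schema, step1, badStep2))

	// A second step that only changes identity exercises the in-place update.
	goodStep2 := map[string]string{"local_auth_enabled": "false", "identity": "UserAssigned"}
	fmt.Println(len(changedForceNew(schema, step1, goodStep2)))
}
```

An update test is only meaningful for `identity` when the list returned for consecutive steps is empty.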
compute, err := client.ComputeGet(ctx, *id)
if err != nil {
	return fmt.Errorf("retrieving %s: %+v", *id, err)
}

computeModel := compute.Model
if computeModel == nil {
	return fmt.Errorf("retrieving %s: `model` was nil", *id)
}

identity, err := expandIdentity(d.Get("identity").([]interface{}))
if err != nil {
	return fmt.Errorf("expanding `identity`: %+v", err)
}

vmPriority := machinelearningcomputes.VMPriority(d.Get("vm_priority").(string))
computeClusterAmlComputeProperties := machinelearningcomputes.AmlComputeProperties{
	VMSize:                 utils.String(d.Get("vm_size").(string)),
	VMPriority:             &vmPriority,
	ScaleSettings:          expandScaleSettings(d.Get("scale_settings").([]interface{})),
	UserAccountCredentials: expandUserAccountCredentials(d.Get("ssh").([]interface{})),
	EnableNodePublicIP:     pointer.To(d.Get("node_public_ip_enabled").(bool)),
}

computeClusterAmlComputeProperties.RemoteLoginPortPublicAccess = pointer.To(machinelearningcomputes.RemoteLoginPortPublicAccessDisabled)
if d.Get("ssh_public_access_enabled").(bool) {
	computeClusterAmlComputeProperties.RemoteLoginPortPublicAccess = pointer.To(machinelearningcomputes.RemoteLoginPortPublicAccessEnabled)
}

if subnetId, ok := d.GetOk("subnet_resource_id"); ok && subnetId.(string) != "" {
	computeClusterAmlComputeProperties.Subnet = &machinelearningcomputes.ResourceId{Id: subnetId.(string)}
}

computeClusterProperties := machinelearningcomputes.AmlCompute{
	Properties:       &computeClusterAmlComputeProperties,
	ComputeLocation:  utils.String(d.Get("location").(string)),
	Description:      utils.String(d.Get("description").(string)),
	DisableLocalAuth: utils.Bool(!d.Get("local_auth_enabled").(bool)),
}

computeClusterParameters := machinelearningcomputes.ComputeResource{
	Properties: computeClusterProperties,
	Identity:   identity,
	Location:   computeModel.Location,
	Tags:       tags.Expand(d.Get("tags").(map[string]interface{})),
}

if err := client.ComputeCreateOrUpdateThenPoll(ctx, *id, computeClusterParameters); err != nil {
	return fmt.Errorf("updating %s: %+v", id, err)
}
@xuzhang3 my original comment hasn't been resolved here.
Like we do in many other updates, we get the existing resource, and patch in the changes to the existing resource's model and use that as the payload for the CreateOrUpdate call. This is explained in our guide for adding new resources in the Contributor Docs.
This whole block should be simplified to
existing, err := client.ComputeGet(ctx, *id)
if err != nil {
	return fmt.Errorf("retrieving %s: %+v", *id, err)
}
payload := existing.Model
if payload == nil {
	return fmt.Errorf("retrieving %s: `model` was nil", *id)
}
identity, err := expandIdentity(d.Get("identity").([]interface{}))
if err != nil {
	return fmt.Errorf("expanding `identity`: %+v", err)
}
payload.Identity = identity
if err := client.ComputeCreateOrUpdateThenPoll(ctx, *id, *payload); err != nil {
	return fmt.Errorf("updating %s: %+v", id, err)
}
Updated as requested, and all the tests passed.
@xuzhang3 my review comment hasn't been addressed properly: #26404 (review)
Test case updated.
Thanks @xuzhang3 LGTM 👍
Bump Terraform `azurerm` provider version
Update Terraform lock file
changes detected: "hashicorp/azurerm" updated from "3.111.0" to "3.112.0" in file ".terraform.lock.hcl"

3.112.0
Changelog retrieved from: https://github.com/hashicorp/terraform-provider-azurerm/releases/tag/v3.112.0

FEATURES:

* New Data Source: `azurerm_elastic_san_volume_snapshot` ([#26439](https://github.com/hashicorp/terraform-provider-azurerm/issues/26439))
* New Resource: `azurerm_dev_center_dev_box_definition` ([#26307](https://github.com/hashicorp/terraform-provider-azurerm/issues/26307))
* New Resource: `azurerm_dev_center_environment_type` ([#26291](https://github.com/hashicorp/terraform-provider-azurerm/issues/26291))
* New Resource: `azurerm_virtual_machine_restore_point` ([#26526](https://github.com/hashicorp/terraform-provider-azurerm/issues/26526))
* New Resource: `azurerm_virtual_machine_restore_point_collection` ([#26526](https://github.com/hashicorp/terraform-provider-azurerm/issues/26526))

ENHANCEMENTS:

* dependencies: updating to `v0.20240710.1114656` of `github.com/hashicorp/go-azure-sdk` ([#26588](https://github.com/hashicorp/terraform-provider-azurerm/issues/26588))
* dependencies: updating to `v0.70.0` of `go-azure-helpers` ([#26601](https://github.com/hashicorp/terraform-provider-azurerm/issues/26601))
* `containerservice`: updating the Fleet resources to use API Version `2024-04-01` ([#26588](https://github.com/hashicorp/terraform-provider-azurerm/issues/26588))
* Data Source: `azurerm_network_service_tags` - extend validation for `service` to allow `AzureFrontDoor.Backend`, `AzureFrontDoor.Frontend`, and `AzureFrontDoor.FirstParty` ([#26429](https://github.com/hashicorp/terraform-provider-azurerm/issues/26429))
* `azurerm_api_management_identity_provider_aad` - support for the `client_library` property ([#26093](https://github.com/hashicorp/terraform-provider-azurerm/issues/26093))
* `azurerm_api_management_identity_provider_aadb2c` - support for the `client_library` property ([#26093](https://github.com/hashicorp/terraform-provider-azurerm/issues/26093))
* `azurerm_dev_test_virtual_network` - support for the `shared_public_ip_address` property ([#26299](https://github.com/hashicorp/terraform-provider-azurerm/issues/26299))
* `azurerm_kubernetes_cluster` - support for the `certificate_authority` block under the `service_mesh_profile` block ([#26543](https://github.com/hashicorp/terraform-provider-azurerm/issues/26543))
* `azurerm_linux_web_app` - support the value `8.3` for the `php_version` property ([#26194](https://github.com/hashicorp/terraform-provider-azurerm/issues/26194))
* `azurerm_machine_learning_compute_cluster` - the `identity` property can now be updated ([#26404](https://github.com/hashicorp/terraform-provider-azurerm/issues/26404))
* `azurerm_web_application_firewall_policy` - support for the `JSChallenge` value for `managed_rules.managed_rule_set.rule_group_override.rule_action` ([#26561](https://github.com/hashicorp/terraform-provider-azurerm/issues/26561))

BUG FIXES:

* Data Source: `azurerm_communication_service` - `primary_connection_string`, `primary_key`, `secondary_connection_string` and `secondary_key` are marked as Sensitive ([#26560](https://github.com/hashicorp/terraform-provider-azurerm/issues/26560))
* `azurerm_app_configuration_feature` - fix issue when updating the resource without an existing `targeting_filter` ([#26506](https://github.com/hashicorp/terraform-provider-azurerm/issues/26506))
* `azurerm_backup_policy_vm` - split create and update function to fix lifecycle - ignore ([#26591](https://github.com/hashicorp/terraform-provider-azurerm/issues/26591))
* `azurerm_backup_protected_vm` - split create and update function to fix lifecycle - ignore ([#26583](https://github.com/hashicorp/terraform-provider-azurerm/issues/26583))
* `azurerm_communication_service` - the `primary_connection_string`, `primary_key`, `secondary_connection_string`, and `secondary_key` properties are now sensitive ([#26560](https://github.com/hashicorp/terraform-provider-azurerm/issues/26560))
* `azurerm_mysql_flexible_server_configuration` - add locks to prevent conflicts when deleting the resource ([#26289](https://github.com/hashicorp/terraform-provider-azurerm/issues/26289))
* `azurerm_nginx_deployment` - changing the `frontend_public.ip_address`, `frontend_private.ip_address`, `frontend_private.allocation_method`, and `frontend_private.subnet_id` now creates a new resource ([#26298](https://github.com/hashicorp/terraform-provider-azurerm/issues/26298))
* `azurerm_palo_alto_local_rulestack_rule` - correctly read the `protocol` property on read when the `protocol_ports` property is configured ([#26510](https://github.com/hashicorp/terraform-provider-azurerm/issues/26510))
* `azurerm_servicebus_namespace` - parse the identity returned by the API insensitively before setting into state ([#26540](https://github.com/hashicorp/terraform-provider-azurerm/issues/26540))

DEPRECATIONS:

* `azurerm_servicebus_queue` - `enable_batched_operations`, `enable_express` and `enable_partitioning` are superseded by `batched_operations_enabled`, `express_enabled` and `partitioning_enabled` ([#26479](https://github.com/hashicorp/terraform-provider-azurerm/issues/26479))
* `azurerm_servicebus_subscription` - `enable_batched_operations` has been superseded by `batched_operations_enabled` ([#26479](https://github.com/hashicorp/terraform-provider-azurerm/issues/26479))
* `azurerm_servicebus_topic` - `enable_batched_operations`, `enable_express` and `enable_partitioning` are superseded by `batched_operations_enabled`, `express_enabled` and `partitioning_enabled` ([#26479](https://github.com/hashicorp/terraform-provider-azurerm/issues/26479))

Jenkins pipeline link: https://infra.ci.jenkins.io/job/updatecli/job/azure/job/main/319/
Created automatically by Updatecli (https://www.updatecli.io/).
Co-authored-by: Jenkins Infra Bot (updatecli)
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
`identity` can be updated by the current API without recreating the MLW compute cluster.