Creating multiple azurerm_kusto_script resources at once causes [Conflict] errors on cluster #16471

Closed
1 task done
mojanas opened this issue Apr 20, 2022 · 6 comments · Fixed by #16649 or #16690

Comments

@mojanas
Contributor

mojanas commented Apr 20, 2022

Is there an existing issue for this?

  • I have searched the existing issues

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

1.0.8

AzureRM Provider Version

2.96.0

Affected Resource(s)/Data Source(s)

azurerm_kusto_script

Terraform Configuration Files

terraform {
  required_version = ">= 0.14"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.96.0"
    }
  }
}

provider "azurerm" {
  features {}
}

module "insights" {
  source = "./insights"
  insights = {
    clusters = {
      "extint" : {
        sku            = "Standard_E16as_v4+3TB_PS"
        min_instances  = 2
        max_instances  = 20
        engine_version = "V3"
        databases = {
          for name in local.prod_sharded_db_names : name => {
            hot_cache_period   = "P7D"
            soft_delete_period = "P30D"
            principals         = []
            setup_scripts = {
              "adx_sharded_databases_setup_commands" : {
                script_commands = local.external_database_setup_commands
              }
            }
          }
        }
      }
    }
  }
}

resource "azurerm_resource_group" "datalayer_adx_rg" {
  name     = "adxclusters"
  location = var.region_location

  tags = { "rg_scope" : "region" }
}

resource "azurerm_kusto_cluster" "adx_clusters_insights" {
  for_each = var.insights.clusters

  name                   = each.key
  resource_group_name    = azurerm_resource_group.datalayer_adx_rg.name
  location               = azurerm_resource_group.datalayer_adx_rg.location
  engine                 = each.value.engine_version
  enable_disk_encryption = true
  enable_purge           = true
  enable_auto_stop       = false

  sku {
    name     = each.value.sku
    capacity = each.value.min_instances
  }

  dynamic "optimized_auto_scale" {
    for_each = each.value.max_instances > each.value.min_instances ? [1] : []
    content {
      minimum_instances = each.value.min_instances
      maximum_instances = each.value.max_instances
    }
  }
  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      sku,
      optimized_auto_scale
    ]
  }
}

resource "azurerm_kusto_database" "adx_databases" {
  for_each = { for d in local.databases : "${d.cluster_index}.${d.database_index}" => d }

  name                = each.value.database_index
  resource_group_name = azurerm_kusto_cluster.adx_clusters_insights[each.value.cluster_index].resource_group_name
  location            = azurerm_kusto_cluster.adx_clusters_insights[each.value.cluster_index].location
  cluster_name        = azurerm_kusto_cluster.adx_clusters_insights[each.value.cluster_index].name

  hot_cache_period   = each.value.hot_cache_period
  soft_delete_period = each.value.soft_delete_period

  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      size,
      hot_cache_period,
      soft_delete_period
    ]
  }
}

resource "azurerm_storage_account" "adx_storage" {
  name                     = "adx"
  resource_group_name      = azurerm_resource_group.datalayer_adx_rg.name
  location                 = azurerm_resource_group.datalayer_adx_rg.location
  account_tier             = "Standard"
  account_kind             = "StorageV2"
  account_replication_type = "LRS"
  allow_blob_public_access = false
}

# Storage container for holding kusto script blob files
resource "azurerm_storage_container" "adx_scripts_container" {
  name                  = "adx-scripts"
  storage_account_name  = azurerm_storage_account.adx_storage.name
  container_access_type = "private"
}

data "azurerm_storage_account_blob_container_sas" "adx_scripts_container_sas" {
  connection_string = azurerm_storage_account.adx_storage.primary_connection_string
  container_name    = azurerm_storage_container.adx_scripts_container.name
  https_only        = true

  start  = formatdate("YYYY-MM-DD", timestamp())
  expiry = formatdate("YYYY-MM-DD", timeadd(timestamp(), "36h"))

  permissions {
    read   = true
    add    = false
    create = false
    write  = true
    delete = false
    list   = true
  }
}

resource "azurerm_storage_blob" "adx_script_blobs" {
  for_each = { for b in local.adx_scripts : "${b.cluster_index}.${b.database_index}.${b.script_index}" => b }

  name                   = "${each.value.database_index}_${each.value.script_index}.txt"
  storage_account_name   = azurerm_storage_account.adx_storage.name
  storage_container_name = azurerm_storage_container.adx_scripts_container.name
  type                   = "Block"
  source_content         = each.value.script_content

  # 2022-03-14: Kusto team recommended adding an explicit dependency due to a potential race condition
  depends_on = [
    azurerm_storage_container.adx_scripts_container
  ]
}

resource "azurerm_kusto_script" "adx_setup_scripts" {
  for_each                           = { for b in local.adx_scripts : "${b.cluster_index}.${b.database_index}.${b.script_index}" => b }
  name                               = each.value.script_index
  database_id                        = azurerm_kusto_database.adx_databases["${each.value.cluster_index}.${each.value.database_index}"].id
  url                                = azurerm_storage_blob.adx_script_blobs["${each.value.cluster_index}.${each.value.database_index}.${each.value.script_index}"].id
  sas_token                          = data.azurerm_storage_account_blob_container_sas.adx_scripts_container_sas.sas
  continue_on_errors_enabled         = true
  force_an_update_when_value_changed = "2022-04-05"

  depends_on = [
    azurerm_kusto_cluster.adx_clusters_insights
  ]

  lifecycle {
    ignore_changes = [
      sas_token
    ]
  }
}

locals {
  external_database_setup_commands = "${local.eventsAllTableSetupCommand}\n ${local.eventsAllPartitioningPolicyCommand}"

  prod_sharded_db_names = [
    "PFSharded001",
    "PFSharded002",
    "PFSharded003",
    "PFSharded004",
    "PFSharded005",
    "PFSharded006",
    "PFSharded007",
    "PFSharded008",
    "PFSharded009",
    "PFSharded010"
  ]

  databases = distinct(flatten([
    for c_index, cluster in var.insights.clusters : [
      for d_index, database in cluster.databases : {
        cluster_index      = c_index,
        database_index     = d_index,
        hot_cache_period   = database.hot_cache_period
        soft_delete_period = database.soft_delete_period
      }
    ]
  ]))

  adx_scripts = distinct(flatten([
    for c_index, cluster in var.insights.clusters : [
      for d_index, database in cluster.databases : [
        for s_index, script in database.setup_scripts : {
          cluster_index  = c_index
          database_index = d_index
          script_index   = s_index
          script_content = script.script_commands
        }
      ]
    ]
  ]))

  eventsAllTableSetupCommand = ".create-merge table ['events.all'] (SchemaVersion: string, FullName_Namespace: string, FullName_Name: string, Entity_Id: string, Entity_Type: string, EntityLineage_title: string, EventData: dynamic, EventId: string, Timestamp: datetime, EntityLineage_title_player_account: string, EntityLineage_master_player_account: string, EntityLineage_namespace: string, ExperimentVariants: dynamic)"

  eventsAllPartitioningPolicyCommand = <<-EOT
    .alter table ['events.all'] policy partitioning {
        "PartitionKeys": [
        {
            "ColumnName": "EntityLineage_title",
            "Kind": "Hash",
            "Properties": {
                "Function": "XxHash64",
                "MaxPartitionCount": 256,
                "Seed": 1,
                "PartitionAssignmentMode": "Uniform"
            }
        },
        {
            "ColumnName": "Timestamp",
            "Kind": "UniformRange",
            "Properties": {
                "Reference": "1970-01-01T00:00:00",
                "RangeSize": "1.00:00:00",
                "OverrideCreationTime": false
            }
        }
        ], "EffectiveDateTime": "2021-01-01T00:00:00.0Z"
    }
  EOT
}

Debug Output/Panic Output

2022-04-20T10:01:27.334-0700 [DEBUG] provider.terraform-provider-azurerm_v2.96.0_x5.exe: AzureRM Request:
GET /subscriptions/sid/providers/Microsoft.Kusto/locations/East%20US/operationResults/b219287e-e96c-4956-8953-9311c43984f0?api-version=2021-08-27 HTTP/1.1
Host: management.azure.com
User-Agent: Go/go1.17.5 (amd64-windows) go-autorest/v14.2.1 Azure-SDK-For-Go/v61.4.0 kusto/2021-08-27 HashiCorp Terraform/1.0.8 (+https://www.terraform.io) Terraform Plugin SDK/2.10.1 terraform-provider-azurerm/2.96.0 pid-222c6c49-1b0a-5959-a213-6608f9eb8820
X-Ms-Correlation-Request-Id: 72d53c44-9a30-55af-7d8d-6f79a4c111ed
Accept-Encoding: gzip: timestamp=2022-04-20T10:01:27.334-0700
2022-04-20T10:01:27.654-0700 [DEBUG] provider.terraform-provider-azurerm_v2.96.0_x5.exe: AzureRM Response for https://management.azure.com/subscriptions/sid/providers/Microsoft.Kusto/locations/East%20US/operationResults/b219287e-e96c-4956-8953-9311c43984f0?api-version=2021-08-27:
HTTP/2.0 200 OK
Cache-Control: no-cache
Content-Type: application/json; charset=utf-8
Date: Wed, 20 Apr 2022 17:01:27 GMT
Expires: -1
Pragma: no-cache
Server: Microsoft-HTTPAPI/2.0
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Ms-Correlation-Request-Id: 72d53c44-9a30-55af-7d8d-6f79a4c111ed
X-Ms-Operation-Root-Activity-Id: aadd5697-46b0-47d7-807c-8c5b059d5317
X-Ms-Ratelimit-Remaining-Subscription-Resource-Requests: 290
X-Ms-Request-Id: 437865b4-14cb-48ff-8c24-29774588beb6
X-Ms-Routing-Request-Id: WESTUS:20220420T170127Z:97594494-5b1b-4922-b104-12e1c946e140

{"id":"/subscriptions/sid/providers/Microsoft.Kusto/locations/East US/operationresults/b219287e-e96c-4956-8953-9311c43984f0","name":"b219287e-e96c-4956-8953-9311c43984f0","status":"Succeeded","startTime":"2022-04-20T17:00:56.9075354Z","endTime":"2022-04-20T17:01:00.3141761Z","percentComplete":1.0,"properties":{"operationKind":"DatabaseScriptCreateOrUpdate","provisioningState":"Succeeded","operationState":"Completed"}}: timestamp=2022-04-20T10:01:27.654-0700
2022-04-20T10:01:27.654-0700 [DEBUG] provider.terraform-provider-azurerm_v2.96.0_x5.exe: AzureRM Request:
GET /subscriptions/sid/resourceGroups/adxclusters/providers/Microsoft.Kusto/clusters/extint/databases/PFSharded005/scripts/adx_sharded_databases_setup_commands?api-version=2021-08-27 HTTP/1.1
Host: management.azure.com
User-Agent: Go/go1.17.5 (amd64-windows) go-autorest/v14.2.1 Azure-SDK-For-Go/v61.4.0 kusto/2021-08-27 HashiCorp Terraform/1.0.8 (+https://www.terraform.io) Terraform Plugin SDK/2.10.1 terraform-provider-azurerm/2.96.0 pid-222c6c49-1b0a-5959-a213-6608f9eb8820
X-Ms-Correlation-Request-Id: 72d53c44-9a30-55af-7d8d-6f79a4c111ed
Accept-Encoding: gzip: timestamp=2022-04-20T10:01:27.654-0700
2022-04-20T10:01:28.155-0700 [DEBUG] provider.terraform-provider-azurerm_v2.96.0_x5.exe: AzureRM Response for https://management.azure.com/subscriptions/sid/resourceGroups/adxclusters/providers/Microsoft.Kusto/clusters/extint/databases/PFSharded005/scripts/adx_sharded_databases_setup_commands?api-version=2021-08-27:
HTTP/2.0 200 OK
Cache-Control: no-cache
Content-Type: application/json; charset=utf-8
Date: Wed, 20 Apr 2022 17:01:27 GMT
Expires: -1
Pragma: no-cache
Server: Microsoft-HTTPAPI/2.0
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Ms-Correlation-Request-Id: 72d53c44-9a30-55af-7d8d-6f79a4c111ed
X-Ms-Ratelimit-Remaining-Subscription-Resource-Requests: 290
X-Ms-Request-Id: a700e59c-895e-4fe4-a496-18943ba1fa6c
X-Ms-Routing-Request-Id: WESTUS:20220420T170128Z:139a204d-4e16-4456-9232-b26f86cdab91

{"id":"/subscriptions/sid/resourceGroups/adxclusters/providers/Microsoft.Kusto/Clusters/extint/Databases/PFSharded005/Scripts/adx_sharded_databases_setup_commands","name":"extint/PFSharded005/adx_sharded_databases_setup_commands","type":"Microsoft.Kusto/Clusters/Databases/Scripts","properties":{"continueOnErrors":true,"scriptUrl":"https://adx.blob.core.windows.net/adx-scripts/PFSharded005_adx_sharded_databases_setup_commands.txt","forceUpdateTag":"2022-04-05","provisioningState":"Succeeded"},"systemData":{"createdBy":"mojanas@.com","createdByType":"User","createdAt":"2022-04-20T17:00:56.5287787Z","lastModifiedBy":"mojanas@.com","lastModifiedByType":"User","lastModifiedAt":"2022-04-20T17:00:56.5287787Z"}}: timestamp=2022-04-20T10:01:28.155-0700

Expected Behaviour

Terraform creates the scripts sequentially without any errors.

Actual Behaviour

Terraform created one script successfully, then failed for the rest with Code="ServiceIsInMaintenance"

module.insights.azurerm_kusto_script.adx_setup_scripts["extint.PFSharded005.adx_sharded_databases_setup_commands"]: Creation complete after 32s [id=/subscriptions/sid/resourceGroups/adxclusters/providers/Microsoft.Kusto/Clusters/extint/Databases/PFSharded005/Scripts/adx_sharded_databases_setup_commands]
╷
│ Error: waiting for creation of "Script: (Name \"adx_sharded_databases_setup_commands\" / Database Name \"PFSharded009\" / Cluster Name \"extint\" / Resource Group \"adxclusters\")": Code="ServiceIsInMaintenance" Message="[Conflict] Cluster 'extint' is in process of maintenance for a short period. You may retry to invoke the operation in a few minutes."

│   on insights\adx-scripts.tf line 56, in resource "azurerm_kusto_script" "adx_setup_scripts":
│   56: resource "azurerm_kusto_script" "adx_setup_scripts" {



│ Error: waiting for creation of "Script: (Name \"adx_sharded_databases_setup_commands\" / Database Name \"PFSharded010\" / Cluster Name \"extint\" / Resource Group \"adxclusters\")": Code="ServiceIsInMaintenance" Message="[Conflict] Cluster 'extint│
│   with module.insights.azurerm_kusto_script.adx_setup_scripts["extint.PFSharded010.adx_sharded_databases_setup_commands"],
│   on insights\adx-scripts.tf line 56, in resource "azurerm_kusto_script" "adx_setup_scripts":
│   56: resource "azurerm_kusto_script" "adx_setup_scripts" {



│ Error: waiting for creation of "Script: (Name \"adx_sharded_databases_setup_commands\" / Database Name \"PFSharded007\" / Cluster Name \"extint\" / Resource Group \"adxclusters\")": Code="ServiceIsInMaintenance" Message="[Conflict] Cluster 'extint' is in process of maintenance for a short period. You may retry to invoke the operation in a few minutes."

│   with module.insights.azurerm_kusto_script.adx_setup_scripts["extint.PFSharded007.adx_sharded_databases_setup_commands"],
│   on insights\adx-scripts.tf line 56, in resource "azurerm_kusto_script" "adx_setup_scripts":
│   56: resource "azurerm_kusto_script" "adx_setup_scripts" {



│ Error: waiting for creation of "Script: (Name \"adx_sharded_databases_setup_commands\" / Database Name \"PFSharded003\" / Cluster Name \"extint\" / Resource Group \"adxclusters\")": Code="ServiceIsInMaintenance" Message="[Conflict] Cluster 'extint' is in process of maintenance for a short period. You may retry to invoke the operation in a few minutes."

│   with module.insights.azurerm_kusto_script.adx_setup_scripts["extint.PFSharded003.adx_sharded_databases_setup_commands"],
│   on insights\adx-scripts.tf line 56, in resource "azurerm_kusto_script" "adx_setup_scripts":
│   56: resource "azurerm_kusto_script" "adx_setup_scripts" {



│ Error: waiting for creation of "Script: (Name \"adx_sharded_databases_setup_commands\" / Database Name \"PFSharded004\" / Cluster Name \"extint\" / Resource Group \"adxclusters\")": Code="ServiceIsInMaintenance" Message="[Conflict] Cluster 'extint' is in process of maintenance for a short period. You may retry to invoke the operation in a few minutes."

│   with module.insights.azurerm_kusto_script.adx_setup_scripts["extint.PFSharded004.adx_sharded_databases_setup_commands"],
│   on insights\adx-scripts.tf line 56, in resource "azurerm_kusto_script" "adx_setup_scripts":
│   56: resource "azurerm_kusto_script" "adx_setup_scripts" {



│ Error: waiting for creation of "Script: (Name \"adx_sharded_databases_setup_commands\" / Database Name \"PFSharded006\" / Cluster Name \"extint\" / Resource Group \"adxclusters\")": Code="ServiceIsInMaintenance" Message="[Conflict] Cluster 'extint' is in process of maintenance for a short period. You may retry to invoke the operation in a few minutes."

│   with module.insights.azurerm_kusto_script.adx_setup_scripts["extint.PFSharded006.adx_sharded_databases_setup_commands"],
│   on insights\adx-scripts.tf line 56, in resource "azurerm_kusto_script" "adx_setup_scripts":
│   56: resource "azurerm_kusto_script" "adx_setup_scripts" {



│ Error: waiting for creation of "Script: (Name \"adx_sharded_databases_setup_commands\" / Database Name \"PFSharded008\" / Cluster Name \"extint\" / Resource Group \"adxclusters\")": Code="ServiceIsInMaintenance" Message="[Conflict] Cluster 'extint' is in process of maintenance for a short period. You may retry to invoke the operation in a few minutes."

│   with module.insights.azurerm_kusto_script.adx_setup_scripts["extint.PFSharded008.adx_sharded_databases_setup_commands"],
│   on insights\adx-scripts.tf line 56, in resource "azurerm_kusto_script" "adx_setup_scripts":
│   56: resource "azurerm_kusto_script" "adx_setup_scripts" {



│ Error: waiting for creation of "Script: (Name \"adx_sharded_databases_setup_commands\" / Database Name \"PFSharded002\" / Cluster Name \"extint\" / Resource Group \"adxclusters\")": Code="ServiceIsInMaintenance" Message="[Conflict] Cluster 'extint' is in process of maintenance for a short period. You may retry to invoke the operation in a few minutes."

│   with module.insights.azurerm_kusto_script.adx_setup_scripts["extint.PFSharded002.adx_sharded_databases_setup_commands"],
│   on insights\adx-scripts.tf line 56, in resource "azurerm_kusto_script" "adx_setup_scripts":
│   56: resource "azurerm_kusto_script" "adx_setup_scripts" {

Steps to Reproduce

Use the configuration above, or create a new configuration with five or more databases whose scripts are all defined in a single azurerm_kusto_script resource that loops over them with for_each.

Important Factoids

No response

References

No response

@sinbai
Contributor

sinbai commented Apr 28, 2022

@mojanas thank you for opening this issue. It seems that this is an Azure API concurrency issue. I have filed an issue to track this. Could you subscribe to it to track this issue?

@mojanas
Contributor Author

mojanas commented Apr 29, 2022

> @mojanas thank you for opening this issue. It seems that this is an Azure API concurrency issue. I have filed an issue to track this. Could you subscribe to it to track this issue?

Thanks, subscribed to that issue. The best workaround I've found so far is to create a separate resource for each script and add an explicit dependency on the previous one to force sequential creation. This workaround is quite verbose, so it would be ideal to have a fix for the concurrency issue. Perhaps the provider could wait some amount of time and try again when it gets the error code saying the cluster is in maintenance.

locals {
  numScripts = 2
}

resource "azurerm_kusto_script" "adx_setup_script_subset_index_zero" {
  for_each = { for b in local.adx_scripts : "${b.cluster_index}.${b.database_index}.${b.script_index}" => b if index(local.adx_scripts, b) % local.numScripts == 0 }

  name                               = "${each.value.database_index}_${each.value.script_index}"
  ...

  # Create an explicit dependency on the cluster
  depends_on = [
    azurerm_kusto_cluster.cluster
  ]
}

resource "azurerm_kusto_script" "adx_setup_script_subset_index_one" {
  for_each = { for b in local.adx_scripts : "${b.cluster_index}.${b.database_index}.${b.script_index}" => b if index(local.adx_scripts, b) % local.numScripts == 1 }

  name                               = "${each.value.database_index}_${each.value.script_index}"
  ...

  # Create an explicit dependency on the previous subset of scripts to avoid Conflict errors
  depends_on = [
    azurerm_kusto_script.adx_setup_script_subset_index_zero
  ]
}

... more scripts
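
The retry idea suggested in this comment could be sketched in Go roughly as follows. This is only an illustration and not the provider's actual code: the names `retryOnMaintenance`, `isMaintenanceConflict`, and `errMaintenance` are hypothetical, and detecting the condition by matching the "ServiceIsInMaintenance" text in the error message is an assumption made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// errMaintenance stands in for the error seen in the output above.
// (Hypothetical value used only by this sketch.)
var errMaintenance = errors.New(`Code="ServiceIsInMaintenance" Message="[Conflict] ..."`)

// isMaintenanceConflict reports whether err looks like the transient
// maintenance conflict. Matching on message text is an assumption.
func isMaintenanceConflict(err error) bool {
	return err != nil && strings.Contains(err.Error(), "ServiceIsInMaintenance")
}

// retryOnMaintenance calls op until it succeeds, fails with a different
// error, or maxAttempts is reached, sleeping for delay between attempts.
func retryOnMaintenance(maxAttempts int, delay time.Duration, op func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if !isMaintenanceConflict(err) {
			return err // nil on success, or a non-retryable error
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Simulated create call: conflicts twice, then succeeds.
	err := retryOnMaintenance(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errMaintenance
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

A retry like this only hides the conflict; it does not remove the underlying concurrency on the cluster, so apply times would still grow with the number of scripts.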

@github-actions

github-actions bot commented May 6, 2022

This functionality has been released in v3.5.0 of the Terraform Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@mbfrahry
Member

mbfrahry commented May 6, 2022

Hey @mojanas and @sinbai, I went ahead and threw a lock into kusto database/scripts so that those resources are managed sequentially rather than all at once, which is what was causing your issue. You shouldn't have to do anything unusual in your config file to benefit from that once you upgrade to the latest version of the provider.
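
For readers unfamiliar with the approach, a name-keyed lock of the kind described here can be sketched in Go as below. This is a minimal, self-contained illustration under stated assumptions, not the code from the linked fix: `keyedLocks`, `withLock`, and `maxConcurrent` are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// keyedLocks hands out one mutex per name, so operations that share a name
// run one at a time while operations on other names can run in parallel.
type keyedLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyedLocks() *keyedLocks {
	return &keyedLocks{locks: make(map[string]*sync.Mutex)}
}

// get returns the mutex for name, creating it on first use.
func (k *keyedLocks) get(name string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	if _, ok := k.locks[name]; !ok {
		k.locks[name] = &sync.Mutex{}
	}
	return k.locks[name]
}

// withLock runs fn while holding the mutex for name.
func (k *keyedLocks) withLock(name string, fn func()) {
	m := k.get(name)
	m.Lock()
	defer m.Unlock()
	fn()
}

// maxConcurrent starts n goroutines that all take the lock for name and
// returns the highest number seen inside the critical section at once.
func maxConcurrent(k *keyedLocks, name string, n int) int {
	var wg sync.WaitGroup
	var counter sync.Mutex
	active, peak := 0, 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			k.withLock(name, func() {
				counter.Lock()
				active++
				if active > peak {
					peak = active
				}
				counter.Unlock()

				time.Sleep(time.Millisecond) // simulated API call

				counter.Lock()
				active--
				counter.Unlock()
			})
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	// Ten simulated script creations sharing one lock name never overlap.
	fmt.Println("peak concurrency:", maxConcurrent(newKeyedLocks(), "extint", 10))
}
```

Because every goroutine takes the same named lock, the reported peak is 1. Which name the lock is keyed on determines which operations are actually serialized.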

@mojanas
Contributor Author

mojanas commented May 6, 2022

Hey @mbfrahry thanks for looking into this! I updated to v3.5.0 and am still seeing the same error. Looking at your PR fix, I believe we actually want to lock the cluster instead of the databases. Though the kusto scripts are created with respect to a particular database, they actually modify the cluster configuration.

I have a draft PR here, feel free to add updates to it: #16690
First time using Go so I haven't run the tests or set up the environment yet.
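
To illustrate why the choice of lock key matters here, the following Go sketch derives both a database-scoped and a cluster-scoped key from script resource IDs shaped like the ones in the debug output. It is a hypothetical example, not the code in the draft PR: `segmentAfter`, `databaseLockKey`, and `clusterLockKey` are names invented for this sketch, and the PFSharded009 ID is constructed by analogy with the PFSharded005 ID shown in the logs.

```go
package main

import (
	"fmt"
	"strings"
)

// segmentAfter returns the path segment that follows key in a resource ID.
// The comparison ignores case because the IDs in the logs above use both
// "clusters" and "Clusters".
func segmentAfter(id, key string) (string, bool) {
	parts := strings.Split(strings.Trim(id, "/"), "/")
	for i := 0; i+1 < len(parts); i++ {
		if strings.EqualFold(parts[i], key) {
			return parts[i+1], true
		}
	}
	return "", false
}

// clusterLockKey keys the lock on the cluster only, so every script on the
// same cluster shares one lock.
func clusterLockKey(id string) string {
	cluster, _ := segmentAfter(id, "clusters")
	return cluster
}

// databaseLockKey keys the lock on cluster and database, so scripts in
// different databases of the same cluster get different locks.
func databaseLockKey(id string) string {
	database, _ := segmentAfter(id, "databases")
	return clusterLockKey(id) + "/" + database
}

func main() {
	const prefix = "/subscriptions/sid/resourceGroups/adxclusters/providers/Microsoft.Kusto/Clusters/extint/Databases/"
	a := prefix + "PFSharded005/Scripts/adx_sharded_databases_setup_commands"
	b := prefix + "PFSharded009/Scripts/adx_sharded_databases_setup_commands"

	// Different database keys: the two scripts could still run concurrently.
	fmt.Println(databaseLockKey(a), databaseLockKey(b))
	// Same cluster key: the two scripts would be serialized.
	fmt.Println(clusterLockKey(a), clusterLockKey(b))
}
```

With one script per database, as in the configuration in this issue, a database-scoped key gives every script its own lock, which is consistent with the error persisting after the upgrade.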

@github-actions

github-actions bot commented Jun 6, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Jun 6, 2022