Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrading the module causes problem with null_resource (Missing map element) #426

Closed
kayman-mk opened this issue Jan 8, 2022 · 9 comments · Fixed by #441
Closed

Upgrading the module causes problem with null_resource (Missing map element) #426

kayman-mk opened this issue Jan 8, 2022 · 9 comments · Fixed by #441
Labels

Comments

@kayman-mk
Copy link
Collaborator

kayman-mk commented Jan 8, 2022

After updating to 4.37 and applying the plan, an error ocurred. Details below.

As far as I can see, it has nothing to do with the 4.37 upgrade as there were no changes to the null_resource. I think it is due to the fact that I didn't upgrade to 4.36 (and maybe the upgrade to 4.35 was also missing) and the problem was introduced in an earlier version.

However I do not fully understand what's going on here. Seems that the old resource had different keys in the trigger map (yes, I remember the change). And in consequence the null_resource is destroyed and executes the code which is not working as the new parameter is not in the map.
Seems that changing the null_resource ends with destroy/create this resource and then the trigger is fired. This was not intended.

In this case I fixed it with deleting the objects from the state file.

Plan:

  # module.gitlab_runner.module.gitlab_terraform_runner["subnet-12345"].null_resource.remove_runner must be replaced
-/+ resource "null_resource" "remove_runner" {
      ~ id       = "596161117442277782" -> (known after apply)
      ~ triggers = { # forces replacement
          + "runner_registration_token"               = (sensitive)
          - "script"                                  = ".terraform/modules/gitlab_runner.gitlab_terraform_runner/bin/remove-runner.sh" -> null
          - "secure_parameter_store_runner_token_key" = "prod-gitlab-ci-c-token" -> null
            # (2 unchanged elements hidden)
        }
    }

State:

$ terraform state show module.gitlab_runner.module.gitlab_terraform_runner[\"subnet-12345\"].null_resource.remove_runner
# module.gitlab_runner.module.gitlab_terraform_runner["subnet-12345"].null_resource.remove_runner:
resource "null_resource" "remove_runner" {
    id       = "596161117442277782"
    triggers = {
        "aws_region"                              = "eu-central-1"
        "runners_gitlab_url"                      = "https://gitlab.xxx.cloud"
        "script"                                  = ".terraform/modules/gitlab_runner.gitlab_terraform_runner/bin/remove-runner.sh"
        "secure_parameter_store_runner_token_key" = "prod-gitlab-ci-c-token"
    }
}

Result:

│ Error: Missing map element
│
│   on .terraform\modules\gitlab_runner.gitlab_terraform_runner\main.tf line 43, in resource "null_resource" "remove_runner":
│   43:     command    = "curl -sS --request DELETE \"${self.triggers.runners_gitlab_url}/api/v4/runners\" --form \"token=${self.triggers.runner_registration_token}\""
│     ├────────────────
│     │ self.triggers is map of string with 4 elements
│
│ This map does not have an element with the key "runner_registration_token".
@kayman-mk kayman-mk changed the title Upgrading the module causes problem with null_resource Upgrading the module causes problem with null_resource (Missing map element) Jan 8, 2022
@Hijus22
Copy link

Hijus22 commented Jan 26, 2022

It is the intended behavior of Terraform. A migration guide should be added as this breaks any automatic upgrade from any version < 4.36 which included the compatibility breaking commit to version >= 4.36

A fix that could be transparent to users without having to mess with the state is to have version 4.35.1 created as a transition to versions >= 4.36 which corrects this by having the new key in the map:

triggers = {
    script                                  = "${path.module}/bin/remove-runner.sh"
    aws_region                              = var.aws_region
    runners_gitlab_url                      = var.runners_gitlab_url
    secure_parameter_store_runner_token_key = local.secure_parameter_store_runner_token_key
    runner_registration_token               = data.aws_ssm_parameter.current_runner_registration_token.value
  }

@kayman-mk
Copy link
Collaborator Author

Question to @npalm: Do we really need the null_resource? If I understand it correctly it removes the Runner as soon as the module is destroyed. We could also document this in a special section of the documentation (what to do if the Runner is removed) and remove the null_resource. Any change to the resource as described above, will always remove the Runner which is not intended.

If I destroy the module, all resources are destroyed, but the Runner will still be shown in GItlab which shouldn't be a problem.

@Hijus22
Copy link

Hijus22 commented Jan 27, 2022

Well, it depends. If you did the registration manually, which is now deprecated as stated here, then its okay. But otherwise it is wrong because the apply and destroy should be atomic and the registration is done by the module.

@npalm
Copy link
Collaborator

npalm commented Jan 27, 2022

On my list, try to dig in next days

@vgropp
Copy link

vgropp commented Feb 9, 2022

@npalm any news on this? We ran into this problem as well. While trying to understand the problem, we did wonder whether https://github.com/npalm/terraform-aws-gitlab-runner/blob/6960f3bb794ca8279932f43cc98723c45a84ba14/main.tf#L32 should maybe depend on data.aws_ssm_parameter instead.

But to fix it for now, does manually modifying the terraform state maybe mitigate the problem? How did you resolve it @kayman-mk ?

@Hijus22
Copy link

Hijus22 commented Feb 9, 2022

@vgropp you need to remove the null resource completely. It doesn't affect anything really, when you do the terraform apply, the null resource will be created with the new config:

  1. terraform state list will show you everything in the state, you need to look for the "remove_runner" null resource.
  2. terraform state rm module.<runners-module-name>.null_resource.remove_runner.

@kayman-mk
Copy link
Collaborator Author

I thought about that. Getting rid of the whole resources as it might accidentially remove the runner when applying a new version of the module. The downside is that the runner is not removed if the module is destroyed. But to be honest: I will never remove this module. I am totally satisfied how everything is working.

@npalm Shall we get rid of the whole null_resource? The better option would be to deregister the runner on shutdown.

@npalm
Copy link
Collaborator

npalm commented Feb 10, 2022

Agree we should see to remove the null resources. The only time I create / destroy is for testing. That logic can be complete moved to the examples.

@semantic-releaser
Copy link
Contributor

🎉 This issue has been resolved in version 4.41.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants