Application Gateway request_routing_rule order change even with Azurerm 3.0.2 #16136

Open
aport1996 opened this issue Mar 29, 2022 · 59 comments

@aport1996

aport1996 commented Mar 29, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform v1.1.7
on windows_amd64

  • provider registry.terraform.io/hashicorp/azurerm v3.0.2

Affected Resource(s)

  • azurerm_application_gateway

Terraform Configuration Files

  dynamic "request_routing_rule" {
    for_each = var.application_gateway_request_routing_rule

    content {
      http_listener_name          = request_routing_rule.value["http_listener_name"]
      name                        = request_routing_rule.value["name"]
      redirect_configuration_name = request_routing_rule.value["redirect_configuration_name"]
      rule_type                   = request_routing_rule.value["rule_type"]
      backend_address_pool_name   = request_routing_rule.value["backend_address_pool_name"]
      backend_http_settings_name  = request_routing_rule.value["backend_http_settings_name"]
      url_path_map_name           = request_routing_rule.value["url_path_map_name"]
    }
  }

    {
      http_listener_name          = "HTTP-DEV-XXX-LISTENER"
      name                        = "XXX-DEV-HTTPS-REDIRECT-RULE"
      redirect_configuration_name = "XXX-DEV-HTTPS-REDIRECT"
      rule_type                   = "Basic"
      backend_address_pool_name   = null
      backend_http_settings_name  = null
      url_path_map_name           = null
    },

Debug Output

  - request_routing_rule {
      - http_listener_id            = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/httpListeners/HTTP-DEV-XXX-LISTENER" -> null
      - http_listener_name          = "HTTP-DEV-XXX-LISTENER" -> null
      - id                          = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/requestRoutingRules/XXX-DEV-HTTPS-REDIRECT-RULE" -> null   
      - name                        = "XXX-DEV-HTTPS-REDIRECT-RULE" -> null
      - priority                    = 0 -> null
      - redirect_configuration_id   = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/redirectConfigurations/XXX-DEV-HTTPS-REDIRECT" -> null     
      - redirect_configuration_name = "XXX-DEV-HTTPS-REDIRECT" -> null
      - rule_type                   = "Basic" -> null
    }

  + request_routing_rule {
      + http_listener_id            = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/httpListeners/HTTP-DEV-XXX-LISTENER"
      + http_listener_name          = "HTTP-DEV-XXX-LISTENER"
      + id                          = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/requestRoutingRules/XXX-DEV-HTTPS-REDIRECT-RULE"
      + name                        = "XXX-DEV-HTTPS-REDIRECT-RULE"
      + redirect_configuration_id   = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/applicationGateways/xxx/redirectConfigurations/XXX-DEV-HTTPS-REDIRECT"
      + redirect_configuration_name = "XXX-DEV-HTTPS-REDIRECT"
      + rule_type                   = "Basic"
    }

Panic Output

Expected Behaviour

No change detected. I also tried running terraform apply, and it does somehow "modify" the application gateway, but if I run terraform plan again I still have the same issue.

Actual Behaviour

Terraform tries to change the order of request_routing_rules only (and all of them, I only provided you one sample output since we have many of them on this app gateway). It keeps happening even after a terraform apply.

Steps to Reproduce

  1. Configure request_routing_rule using dynamic blocks as per the above code example in an application gateway
  2. terraform plan - you will see the attempted change
  3. terraform apply - terraform will apply the change even though there is no difference
  4. terraform plan - terraform still tries to do the same changes

Important Factoids

Some time ago when this issue was not known yet, I remember that I tried to create a new application gateway from scratch and the issue was not there, but after some months between various changes, the issue appeared again randomly and never went away. I don't know what is causing the issue but we have it on 3 different application gateways and can't get rid of it.

References

@aport1996
Author

aport1996 commented Mar 30, 2022

Just wanted to add that on another application gateway, the issue is on backend_http_settings, so I assume this is just randomly happening on all the blocks. Not sure why one or the other is specifically affected though in each separate app gateway. But for this specific case, there was actually a difference in the http settings and once I fixed that (in the code only by aligning to what was in the portal) and ran the plan again no infrastructure changes were detected.

I still believe that according to the new version of azurerm I should've seen only the change in one of the backend_http_settings and not the addition and removal of all of them.

@mbfrahry
Member

Hey @Nyxbiker, what was your configuration for application gateway and what did you have to do to your config to get it to line up with the portal? My first thought is that we're not generating the Hash for backend_http_settings correctly so it'd be useful to see which attributes you had to modify to prevent a plan from occurring

@johannespetereit

@mbfrahry thanks for your reply.
I think I'm grasping the issue, but I also think that many, many customers were waiting for the fix advertised for 3.0. I also realize that another attempt will probably not happen until the next major update of this provider, which is far, far away, and that is a bit frustrating.
I will ask for the original issue to be opened again. In my view, this has no way of moving forward: azurerm is a downstream API to Terraform, and the Azure API is an upstream API to azurerm. I totally understand that the Terraform team categorizes this as a minor optimization; the azurerm provider is in charge of handling the core logic (getting the current state and providing the planned state, while Terraform only supplies a diff).
In my experience it is not helpful to hope that an upstream API will change on account of a single downstream provider having difficulties getting their API to comply with the contract, and I don't think this paradigm will shift because of "community pressure" from a single provider.
We will therefore start looking into alternatives which are still in the "ARM-days" history of our repos.

@aport1996
Author

aport1996 commented Mar 31, 2022

Hey @Nyxbiker, what was your configuration for application gateway and what did you have to do to your config to get it to line up with the portal? My first thought is that we're not generating the Hash for backend_http_settings correctly so it'd be useful to see which attributes you had to modify to prevent a plan from occurring

Hi @mbfrahry, the config for backend_http_settings is also a dynamic block as below:

  dynamic "backend_http_settings" {
    for_each = var.application_gateway_backend_http_settings

    content {
      name                                = backend_http_settings.value["name"]
      host_name                           = backend_http_settings.value["host_name"]
      cookie_based_affinity               = backend_http_settings.value["cookie_based_affinity"]
      affinity_cookie_name                = backend_http_settings.value["affinity_cookie_name"]
      pick_host_name_from_backend_address = backend_http_settings.value["pick_host_name_from_backend_address"]
      port                                = backend_http_settings.value["port"]
      protocol                            = backend_http_settings.value["protocol"]
      probe_name                          = backend_http_settings.value["probe_name"]
      path                                = backend_http_settings.value["path"]
      trusted_root_certificate_names      = backend_http_settings.value["trusted_root_certificate_names"]
      request_timeout                     = backend_http_settings.value["request_timeout"]
    }
  }

And I basically just noticed that in the Azure Portal we had some settings with cookie affinity enabled, so I proceeded to align these properties in the code by changing "cookie_based_affinity" to "Enabled" and "affinity_cookie_name" to the cookie name that was set in the Portal.
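
A minimal sketch of what one aligned element of var.application_gateway_backend_http_settings might look like, assuming the variable is a list of objects with the keys read by the dynamic block above. Every value here (names, port, timeout, cookie name) is a hypothetical placeholder, not the real configuration:

```hcl
    {
      name                                = "XXX-DEV-HTTP-SETTINGS" # hypothetical name
      host_name                           = null
      # The two properties that were aligned with what the Portal showed:
      cookie_based_affinity               = "Enabled"
      affinity_cookie_name                = "XXX-AFFINITY-COOKIE" # hypothetical; use the name set in the Portal
      pick_host_name_from_backend_address = true
      port                                = 443    # placeholder
      protocol                            = "Https" # placeholder
      probe_name                          = null
      path                                = null
      trusted_root_certificate_names      = []
      request_timeout                     = 30 # placeholder, in seconds
    },
```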

I think that this specific issue is related to what you were discussing above with johannespetereit and owaisaamir though, and I agree with them that this is a huge issue because especially in big configurations (we have 51 request routing rules in one app gateway only) it becomes super difficult to figure out what has changed, that would be causing the huge Terraform output for one small property difference.

Coming back to the original issue, I noticed in the output that Terraform adds a property "- priority = 0 -> null" in the request_routing_rule that should be "removed".
So I thought I should maybe add "priority = 0" to the request routing rules, since Terraform might see it as a difference (although it's marked as an optional property in the docs) and that might cause the huge output. But I then get "Error: expected request_routing_rule.49.priority to be in the range (1 - 20000), got 0", so I couldn't test it.

I'm not sure if this is related to the issue that I'm currently experiencing though because I don't have this issue in 1 out of 3 application gateways (that are all using the same parent module with dynamic blocks).

Please note that the initial example is just for one sample routing_rule, but I have this removal and addition issue for all of the request_routing_rules in the affected application gateways. I checked if there were any differences between the code and the portal and I couldn't find any. Also, even after running terraform apply (that should just align everything that isn't) I still have the issue after running terraform plan again, which makes me think that this "priority" property that I see in the output might be causing the issue (but we don't have it set either in the code or the portal). I also tried to ignore the "priority" property in request_routing_rules to see if it would fix the issue, but I can't since lifecycle ignore does not support splat expressions etc.
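
For reference, ignore_changes does accept a whole top-level block name, so a much coarser variant is possible. This is only a sketch, assuming it is acceptable for Terraform to stop planning any change to the routing rules on this gateway (which also hides real drift and intended edits); the resource name "example" is hypothetical:

```hcl
resource "azurerm_application_gateway" "example" {
  # ... required arguments and all other blocks are omitted from this sketch ...

  lifecycle {
    # Assumption: ignoring the whole set is acceptable. Terraform will no longer
    # plan ANY change to request_routing_rule, including changes you intend.
    ignore_changes = [
      request_routing_rule,
    ]
  }
}
```

This trades the noisy plan for no plan at all on these blocks, so it does not address the underlying diff problem.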

@mahmoudghorbelMG

I implemented a provider to overcome such behaviors.
https://registry.terraform.io/providers/Citeo/azurermagw/0.3.0
I am not a Go expert, but I have done my best as a DevOps engineer :).
Currently, I use it in a dev environment and it works fine.
If you can test it and give feedback, I'll be grateful.

@eissko

eissko commented Apr 14, 2023

@mbfrahry, where can we follow the progress of, and perhaps help with, the application gateway refactoring you mentioned here: #19963 (comment)?

Thank you,
Peter

@cveld

cveld commented May 1, 2023

Is there any way to work around this behavior? I have two request_routing_rule blocks. In the state the priority is sorted ["20", "10"] but during the plan phase the plan reports ["10", "20"]. This causes a change in every plan run. What would be the property that I can use as a workaround? Assuming there is some magic sorting implemented in the azurerm provider.

@eissko

eissko commented May 1, 2023

Is there any way to work around this behavior? I have two request_routing_rule blocks. In the state the priority is sorted ["20", "10"] but during the plan phase the plan reports ["10", "20"]. This causes a change in every plan run. What would be the property that I can use as a workaround? Assuming there is some magic sorting implemented in the azurerm provider.

There is no straightforward workaround. You can try as mentioned:

@odeeka

odeeka commented Jul 18, 2023

I implemented a provider to overcome such behaviors. https://registry.terraform.io/providers/Citeo/azurermagw/0.3.0 I am not a Go expert, but I have done my best as a DevOps engineer :). Currently, I use it in a dev environment and it works fine. If you can test it and give feedback, I'll be grateful.

I tried to use it but got an API error about a 'location' attribute that doesn't exist in the provider.

azurermagw_binding_service.binding-service-resource: Creating...

│ Error: Unable to create the resource. ######## API response = 400
│ {
│ "error": {
│ "code": "LocationRequired",
│ "message": "The location property is required for this definition."
│ }
│ }

│ with azurermagw_binding_service.binding-service-resource,
│ on main.tf line 40, in resource "azurermagw_binding_service" "binding-service-resource":
│ 40: resource "azurermagw_binding_service" "binding-service-resource" {

│ Check the API response

@rbev

rbev commented Oct 12, 2023

Is there any sort of timeline on this being fixed? This is a pretty frustrating bug.

@rbev

rbev commented Oct 12, 2023

Looking at the plan output as JSON, I do see things like this:

{
  "before": {
    "request_routing_rule": [
      {
        "backend_address_pool_id": "",
        "backend_address_pool_name": "",
        "backend_http_settings_id": "",
        "backend_http_settings_name": "",
        "http_listener_id": "/subscriptions/REDACTED/resourceGroups/REDACTED/providers/Microsoft.Network/applicationGateways/REDACTED/httpListeners/global-default-https-redirect",
        "http_listener_name": "global-default-https-redirect",
        "id": "/subscriptions/REDACTED/resourceGroups/REDACTED/providers/Microsoft.Network/applicationGateways/REDACTED/requestRoutingRules/global-default-https-redirect",
        "name": "global-default-https-redirect",
        "priority": 22,
        "redirect_configuration_id": "/subscriptions/REDACTED/resourceGroups/REDACTED/providers/Microsoft.Network/applicationGateways/REDACTED/redirectConfigurations/global-default-https-redirect",
        "redirect_configuration_name": "global-default-https-redirect",
        "rewrite_rule_set_id": "",
        "rewrite_rule_set_name": "",
        "rule_type": "Basic",
        "url_path_map_id": "",
        "url_path_map_name": ""
      }
    ]
  },
  "after": {
    "request_routing_rule": [
      {
        "backend_address_pool_id": "",
        "backend_address_pool_name": null,
        "backend_http_settings_id": "",
        "backend_http_settings_name": null,
        "http_listener_id": "/subscriptions/REDACTED/resourceGroups/REDACTED/providers/Microsoft.Network/applicationGateways/REDACTED/httpListeners/global-default-https-redirect",
        "http_listener_name": "global-default-https-redirect",
        "id": "/subscriptions/REDACTED/resourceGroups/REDACTED/providers/Microsoft.Network/applicationGateways/REDACTED/requestRoutingRules/global-default-https-redirect",
        "name": "global-default-https-redirect",
        "priority": 22,
        "redirect_configuration_id": "/subscriptions/REDACTED/resourceGroups/REDACTED/providers/Microsoft.Network/applicationGateways/REDACTED/redirectConfigurations/global-default-https-redirect",
        "redirect_configuration_name": "global-default-https-redirect",
        "rewrite_rule_set_id": "",
        "rewrite_rule_set_name": null,
        "rule_type": "Basic",
        "url_path_map_id": "",
        "url_path_map_name": null
      }
    ]
  }
}

Is it just that the provider is using null for the omitted variables and Azure is sending back an empty string?
It can't be ordering, because this one is the first item in both lists and it always shows as a delete/create.

@samrobillard

I'm having the same issue but with http_listeners where it always recreates the listeners because the host_name changes to null for some reason.

@Nopesound

I don't know if this helps, but I had the same problem and solved it this way.
In my case, I opened the state file and replicated the exact order of both the configuration blocks and the properties they contain.
I reverted all the changes I had made to the configuration (of 4 rules only one had changed, which is important) and made only the one change that had been made through the portal, which was present in neither the state nor the Terraform configuration.
With that, the plan no longer showed the rules being deleted. I then reintroduced all my changes to the rule and reran the plan, and Terraform gave me the same scenario again: all the routing rules had to be deleted and then recreated.
After spending the afternoon banging our heads against this with a colleague, we concluded that since the routing rules are an array, modifying their order, or even the properties of a single rule, causes this behaviour: all elements are removed and recreated. At this point a question arises: is there a latency between writing a routing rule and its actual implementation?

@IopenDoor

I was able to fix it with azurerm 3.92

@andyr8939

I had this issue with the latest provider 3.100.4, and after way too long troubleshooting I found it was caused by incorrect backend settings on a path-based routing rule.

Basically, the routing rule had recently been changed from basic to path-based, so a url_path_map was added. But the backend settings had been left in the request_routing_rule section, where they are not used when you do PathBasedRouting. Instead, they move into the url_path_map and form part of the default.

As an example, I just commented out the 2 commented lines below and that solved my problem.

  request_routing_rule {
    name                       = "routerule-webapp-443"
    rule_type                  = "PathBasedRouting"
    http_listener_name         = "listener-webapp-443"
    # backend_address_pool_name  = "bepool-webapp-empty"
    # backend_http_settings_name = "behttp-webapp-443"
    priority                   = "340"
    url_path_map_name          = "urlmap-webapp-443"
  }

  url_path_map {
    name                               = "urlmap-webapp-443"
    default_backend_address_pool_name  = "bepool-webapp-empty"
    default_backend_http_settings_name = "behttp-webapp-443"

    path_rule {
      name                       = "frontend"
      backend_address_pool_name  = "bepool-webapp"
      backend_http_settings_name = "behttp-webapp-443"

      paths = [
        "/*",
      ]
    }
  }

@dsczltch

dsczltch commented Jul 18, 2024

@mbfrahry I can see this ticket has been assigned to you since last year.
Could you please ask the App Gateway team to improve the Azure Application Gateway API and fix this Terraform provider?
Our team is waiting for this fix; the service is difficult to update with the current behaviour and most of our downtime comes from this.

GitHub ticket #6896, opened in 2020, did not fix the root issue even though it's closed. :/

@chuncheungy

From the latest Terraform documentation for application_gateway:

The backend_address_pool, backend_http_settings, http_listener, private_link_configuration, request_routing_rule, redirect_configuration, probe, ssl_certificate, and frontend_port properties are Sets as the service API returns these lists of objects in a different order from how the provider sends them. As Sets are stored using a hash, if one value is added or removed from the Set, Terraform considers the entire list of objects changed and the plan shows that it is removing every value in the list and re-adding it with the new information. Though Terraform is showing all the values being removed and re-added, we are not actually removing anything unless the user specifies a removal in the config file.

Do we really have downtime when modifying a part of the rules even though in the Terraform plan it "removes and adds" the whole set of rules? I ran a curl loop to test accessibility and did not notice any errors during terraform apply.

@dsczltch

@chuncheungy Since the dry run is unreadable, engineers' misconfigurations are impossible to spot in it, and those misconfigurations are what produce downtime.
We have multiple app gateways across multiple applications and environments. Errors happen, and the current implementation of the App Gateway API and its Terraform provider prevents us from identifying this kind of error, which is the main objective of IaC and a dry run.

@tpcgold

tpcgold commented Aug 27, 2024

How is it possible that Microsoft still hasn't fixed this issue? (It still occurs with 4.0.1.)
It's pretty annoying when an App Gateway is deployed and the "PathBasedRouting" is not working!

E.g. when using it with AKS, this means the whole cluster needs to be destroyed in order to destroy and redeploy the gateway!

There is no way to change from "Basic" routing to "PathBasedRouting", as it's a deadlock situation if it doesn't deploy correctly (and whether it works seems to be totally random with the same script).

@radurobot

I'm running into the same issue where Terraform keeps detecting changes in the request_routing_rule blocks because of ordering differences between my config and what's returned from Azure. Would it be possible for Terraform to sort these rules internally so they match up with what we have in our code? That way, the plan wouldn't show changes when there aren't any actual differences.

@dsczltch

dsczltch commented Oct 8, 2024

Hi @mbfrahry, I can see this ticket has been assigned to you for more than 2y, could you please give us an ETA?
Could you also please indicate who is the owner of the azurerm app gateway terraform provider?

@Kapsztajn

@radurobot You can always assign priority manually and Terraform won't show changes.
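
A sketch of what assigning the priority explicitly could look like with the dynamic block from the original report. It assumes each element of var.application_gateway_request_routing_rule gets an extra, hypothetical "priority" key holding a value in the 1 to 20000 range named by the validation error earlier in the thread; the value 10 is a placeholder:

```hcl
  dynamic "request_routing_rule" {
    for_each = var.application_gateway_request_routing_rule

    content {
      name                        = request_routing_rule.value["name"]
      rule_type                   = request_routing_rule.value["rule_type"]
      http_listener_name          = request_routing_rule.value["http_listener_name"]
      redirect_configuration_name = request_routing_rule.value["redirect_configuration_name"]
      backend_address_pool_name   = request_routing_rule.value["backend_address_pool_name"]
      backend_http_settings_name  = request_routing_rule.value["backend_http_settings_name"]
      url_path_map_name           = request_routing_rule.value["url_path_map_name"]

      # Assumption: a "priority" key has been added to every element of the variable.
      priority = request_routing_rule.value["priority"]
    }
  }

  # One element of the variable, with the hypothetical extra key:
    {
      http_listener_name          = "HTTP-DEV-XXX-LISTENER"
      name                        = "XXX-DEV-HTTPS-REDIRECT-RULE"
      redirect_configuration_name = "XXX-DEV-HTTPS-REDIRECT"
      rule_type                   = "Basic"
      backend_address_pool_name   = null
      backend_http_settings_name  = null
      url_path_map_name           = null
      priority                    = 10 # placeholder value
    },
```

Whether this removes the diff in every case is not confirmed in this thread; other comments report the same symptom from unrelated attributes.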

@kewalaka

  • 4, or will be addressed in new major version of azurerm provider

Just checking, this is still an issue for us (attributes being re-arranged making for a messy plan, but the apply works out fine). Is this still on the cards for AzureRM 4.0 ?

@dsczltch

dsczltch commented Oct 25, 2024

I confirm this is a big issue for us.
Azure Application Gateway is the only service where the tf plan is unreadable.
Therefore, we have to do all our updates overnight because we cannot verify our changes in the dry run.
For this reason we are considering abandoning this service (even though we are happy with it in production).

@eissko

eissko commented Oct 25, 2024

We abandoned it a year and a half ago, and we are happy now.

@cveld

cveld commented Oct 27, 2024

@eissko what is your alternative?

@eissko

eissko commented Oct 28, 2024

@cveld A non-Microsoft solution running as a virtual appliance: F5/Volterra and its "Web App & API protection" service.

@robindv

robindv commented Oct 28, 2024

We encountered the same troubles and migrated the app gateway to Bicep. The "what-if" feature of Bicep makes it quite comparable to Terraform. The diffs are readable again, and that's what counts.

@marcindulak

We encountered the same troubles and migrated the app gateway to Bicep. The "what-if" feature of Bicep makes it quite comparable to Terraform. The diffs are readable again, and that's what counts.

Azure/arm-template-whatif#157
