[Redis] Internal Network API Bug #1347
Comments
Hey @sarangan12 Just an FYI that this looks similar to a previous bug with Virtual Network Gateways; I'm wondering if it's related? #1233 Thanks! |
This is a service-side issue. I am assigning it to @TimLovellSmith and @hrishi18pathak to look into. |
👋 hey @TimLovellSmith / @hrishi18pathak Is there any update on this issue, or a rough timeframe for when this is likely to be fixed? Thanks! |
@tombuildsstuff |
@TimLovellSmith thanks for the update - is there a timeframe for when a new API version will be released? Based on your description, I'm assuming that would be duplicating the API version and updating the behaviour, rather than including additional functionality - and as such should be relatively quick? Unfortunately the workaround specified above isn't ideal, since it opens us up to other issues where the deletion failing is valid (such as permissions, Locking or the HyperV networking/locking errors), and we'd have to keep track of which errors are retryable and which aren't - so I'd prefer to avoid that if at all possible. Thanks! |
@tombuildsstuff Sorry, still no definite timeframe on that change. I'm not familiar with the hyper-v networking/locking errors you mention, what are those like? For permissions, it would fail with a 401 or 403 status code, rather than 400. You can most definitely distinguish this particular scenario via the 'InUseSubnetCannotBeDeleted' status in the error response, which, as you say, is a lot better than retrying blindly - although I understand it's undesirable to do the work of writing such a special case. /cc @JonCole |
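As a rough illustration of the special case described above, the check could look something like the sketch below. It assumes the error comes back in the usual `{"error": {"code": ..., "message": ...}}` JSON envelope; the function name `is_retryable_subnet_delete` is hypothetical and is not part of any SDK.

```python
import json

# Error code named in the comment above; the JSON envelope shape is an assumption.
RETRYABLE_CODE = "InUseSubnetCannotBeDeleted"


def is_retryable_subnet_delete(status_code: int, body: str) -> bool:
    """Return True only for a 400 response carrying InUseSubnetCannotBeDeleted."""
    if status_code != 400:
        # 401/403 (permissions) and every other status are treated as real failures.
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Body is not JSON, so we cannot identify the special case.
        return False
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    if not isinstance(error, dict):
        return False
    return error.get("code") == RETRYABLE_CODE
```

Anything that does not match both the status code and the error code is reported as non-retryable, so permission or locking failures would still surface immediately.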
Suggest closing as 'probably won't fix' - note that, unfortunately, even if we change the behavior to be more transparent, it is still going to take just as long to delete the subnet, and it's not that hard to work around, so it is just not a very high priority to fix. |
Hi @TimLovellSmith, just to be clear: we will be able to use terraform to create the azure redis service, passing in its own subnet and static IP, we just won't be able to delete said subnet with terraform? I just want to confirm that at least this part is functional, and if so, is that in the azure provider version 1.0.1 or a future, to-be-released version? cc: @tombuildsstuff Thanks! |
Hi @bostonmoto I am not familiar with terraform, but I imagine this as most likely manifesting as a bug that deleting a subnet that holds a redis cache would appear to fail, and then need to be retried after a suitable delay, at which time it would succeed. |
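The "retry after a suitable delay" behaviour could be sketched roughly as follows. Everything here is illustrative: `delete_with_retry` and the `try_delete` callback are hypothetical names, and the attempt count and delay are placeholder values rather than anything recommended in this thread.

```python
import time
from typing import Callable


def delete_with_retry(
    try_delete: Callable[[], bool],
    attempts: int = 10,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call try_delete until it reports success or the attempts run out.

    try_delete is a caller-supplied function (hypothetical) that issues the
    subnet DELETE and returns True on success, False on a retryable failure.
    """
    for attempt in range(attempts):
        if try_delete():
            return True
        if attempt < attempts - 1:
            # Wait before the next attempt; no sleep after the final one.
            sleep(delay_seconds)
    return False
```

Passing `sleep` in as a parameter keeps the loop easy to exercise without actually waiting.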
Not a problem @TimLovellSmith I will wait for @tombuildsstuff to respond. Just trying to confirm that the creation part works with static IP and subnet. If delete doesn't work we'll just do that manually after as needed. |
@tombuildsstuff I came up with one other idea for allowing workarounds in code that could possibly avoid the need for a new api version: Could we address this by returning an Azure-AsyncOperation header as part of the DELETE 200 response? You could then poll the URL in that header to wait for the delete work to really be completed. We could probably roll this out faster than a new api version. But would it work for you? /cc @bostonmoto |
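To make the polling idea concrete, here is a minimal client-side sketch. It assumes the status URL reports one of `InProgress`, `Succeeded`, `Failed` or `Canceled`; `wait_for_async_operation` and the `get_status` callback are hypothetical, and treating a missing header as "already complete" is an assumption made only for this example.

```python
import time
from typing import Callable, Mapping

# Terminal states assumed for the operation status resource.
TERMINAL_STATES = {"Succeeded", "Failed", "Canceled"}


def wait_for_async_operation(
    headers: Mapping[str, str],
    get_status: Callable[[str], str],
    poll_seconds: float = 30.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the URL from the Azure-AsyncOperation header until a terminal state."""
    # Header names are case-insensitive, so compare in lower case.
    url = next(
        (v for k, v in headers.items() if k.lower() == "azure-asyncoperation"),
        None,
    )
    if url is None:
        # Assumption: no header means the DELETE completed synchronously.
        return "Succeeded"
    for _ in range(max_polls):
        status = get_status(url)  # caller-supplied GET on the status URL
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("operation did not reach a terminal state")
```

A client that ignores the header keeps today's fire-and-forget behaviour, which is the compatibility question raised in the replies.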
@TimLovellSmith I have a feeling that would break the Azure SDK - @marstr / @mcardosos / @jhendrixMSFT / @joshgav should be able to comment further.
@sarangan12 this functionality (internal subnets) currently isn't supported in Terraform since we're waiting for the API to be fixed - once it is we should be able to add support for this relatively quickly (there's a branch with support - but we can't ship it until this issue is resolved). Thanks! |
Defining the operation as long running operation would introduce breaking changes in the go SDK, yes. |
But maybe it's the correct thing to do? This came up a few days ago talking with @johanste |
I think it is the correct thing to do. But yes, breaking changes. |
Since the delete operation is already marked as a long running operation in swagger (from what I can tell), there wouldn't be a breaking change in the Go SDK. I don't know if any other management libraries would be tricked into doing something functionally incorrect by a 200 response with an Azure-AsyncOperation header. But if the operation takes a significant time to complete, then for usages where the client was fine with fire & forget (it doesn't need the resource to actually be deleted), we risk turning a call that previously was instantaneous (relatively speaking :)) into a 15 minute call (based on @TimLovellSmith's earlier comments in the thread), which would be sub-optimal. I believe that the long term solution would be to:
In addition, I suspect that 409 for the subnet delete call failure would be better understood by most clients. And possibly (not standard for a 409) include a retry-after header telling the client that it may have better luck in the not-too-distant future. |
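From the client's side, the 409 plus retry-after suggestion might be consumed along these lines. This is only a sketch of a proposal, not of current service behaviour: `retry_delay_for_conflict` is a hypothetical helper, the default delay is a placeholder, and only the integer-seconds form of the header is handled.

```python
from typing import Mapping, Optional


def retry_delay_for_conflict(
    status_code: int,
    headers: Mapping[str, str],
    default_seconds: float = 60.0,
) -> Optional[float]:
    """Return how long to wait before retrying, or None if not a 409."""
    if status_code != 409:
        return None
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(int(value.strip())))
            except ValueError:
                # e.g. an HTTP-date value; this sketch falls back to the default.
                break
    return default_seconds
```

A caller would sleep for the returned number of seconds and then retry the subnet delete, treating `None` as a failure that should not be retried.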
@mcardosos could we leave this open until the fix is deployed and verified? At this point the new API is still broken since the fix isn't deployed, based on my reading of this comment: #2581 (comment)
Thanks! |
@tombuildsstuff It is deployed. That comment was several days old. |
@mcardosos Can we close it again? |
@tombuildsstuff feel free to reopen if not fixed yet |
Hi @tombuildsstuff so we can now not only create azure redis with subnet via terraform but we can also delete it properly now. Is that confirmed? Thanks! |
@bostonmoto we still need to upgrade to the new version of the Azure SDK for Go, and then rebase the branch with support for this functionality, before we can confirm that - so it isn't confirmed at the moment, but should be soon. I'd suggest subscribing to this issue for updates: hashicorp/terraform-provider-azurerm#82 :) |
👋 howdy!
I've been trying to add support for Redis on the Internal Subnets to Terraform.
Creating a Redis instance on an Internal Subnet works great, however there's a bug in the API when deleting it:
which returns:
Querying for the Redis Cache at this point then returns a 404:
Thus, it appears safe to delete the Subnet:
However attempting to delete the Subnet at this point returns an error:
From what I can see this is a bug in the Redis API, which is exposing the Redis Cache as deleted when it's really still kicking around. Would it be possible to fix this bug in the API so that the Redis Cache is returned in a 'Deleting' state until it's deleted, like the other resources do?
Thanks!