"Invalid index key specified" for Cosmos DB collections after running TF plan without any code changes. #8144

Closed
hkailantzis opened this issue Aug 17, 2020 · 24 comments · Fixed by #14857

Comments

@hkailantzis

hkailantzis commented Aug 17, 2020

After successfully upgrading to azurerm provider 2.20.0 on 14.08.2020, the TF plan/apply pipeline ran successfully and showed no infra changes. Starting today, 17.08, every subsequent tf plan fails with the following error, even without any code changes:
"Invalid index key specified" for Cosmos DB collections.
See below for full error details. Any hints on why this might be happening? Thank you in advance.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform v0.12.26
provider.azurerm v2.20.0

Affected Resource(s)

  • azurerm_cosmosdb_mongo_collection

Terraform Configuration Files

# Copy-paste your Terraform configurations here - for large Terraform configs,
# please use a service like Dropbox and share a link to the ZIP file. For
# security, you can also encrypt the files using our GPG public key: https://keybase.io/hashicorp

Debug Output

Panic Output

Expected Behavior

TF plan should run successfully again, as it did on the last run.

Actual Behavior

TF plan fails for all of the cosmosDB collections with:

Error: Error reading Cosmos Mongo Collection "test" (Account: "cosmos-db-test", Database: "test"): documentdb.MongoDBResourcesClient#GetMongoDBCollection: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadRequest" Message="Invalid index key specified.\r\nActivityId: 76c8c64d-cb9c-4862-90aa-2004b805b5ea, Microsoft.Azure.Documents.Common/2.11.0"

Steps to Reproduce

  1. terraform plan

Important Factoids

References

  • #0000
@bbung

bbung commented Aug 17, 2020

Same issue with versions:
Terraform v0.12.17
provider.azurerm v2.23.0

@hkailantzis
Author

Hi @bbung. Did the problem also start for you after you did a provider upgrade > 2.20.0, or just after a certain date?

@bbung

bbung commented Aug 18, 2020

Hi @hkailantzis, it started just yesterday. I had not done any provider upgrade before that. I updated the azurerm provider from 2.20.0 to 2.23.0 only after the error occurred, to make sure I was using the latest.

@hkailantzis
Author

Thanks @bbung for the info. I'm thinking that maybe something changed on the Cosmos API/Azure side, related to indexes/collections, that the TF provider doesn't like or support. Not sure how to tackle this, and it is blocking our infrastructure deployment pipeline right now.

@bbung

bbung commented Aug 18, 2020

I actually have a newer CosmosDB account that is not failing. It is configured exactly like the failing ones; it was just created three weeks later. I had not noticed this before, so I think you are right. I've contacted Azure Support about this and will keep you posted if I get any new findings.

@akokshar

Terraform debug output gave me the AzureRM request, which I then replayed with the command az rest --method get --uri "https://management.azure.com/subscriptions/..../mongodbDatabases/test/collections/test?api-version=2020-04-01".
This command failed with:
BadRequest({"code":"BadRequest","message":"Invalid index key specified.\r\nActivityId: 6bd419d9-5c43-4476-99d2-a01fc5344862, Microsoft.Azure.Documents.Common/2.11.0"})
However, if I change the api-version in the request to '2015-04-08', the az rest command works fine.

I don't know how this can be worked around :/

@yupwei68
Contributor

Hi @hkailantzis, thanks for opening this issue. I have tried an example of azurerm_cosmosdb_mongo_collection with azurerm v2.23.0 and could create it successfully. A known issue, however, is that the service does not return "default_ttl_seconds", which the service team is currently fixing. As a result, terraform plan is not empty even after a successful terraform apply.
Sorry that I could not reproduce your error. Would you mind providing your HCL configuration?

@hkailantzis
Author

hkailantzis commented Aug 19, 2020

Hi @yupwei68, thanks for the reply. According to our findings described in the previous comments, there seem to be compatibility issues between the new Cosmos API version introduced in v2.20.0 and pre-existing Cosmos DB accounts/collections. Please check the previous comment from @akokshar, where the az rest command succeeds with API version 2015-04-08 but fails with 2020-04-01. We will open a ticket with Azure as well and see what they tell us about the newer Cosmos API changes. As for providing the TF configuration, the code is spread across multiple files and driven by a config map, so posting it would not make much sense. In addition, we can't roll back the azurerm provider: when we try to downgrade to 2.19.0, TF complains with Error: Resource instance managed by newer provider version.

@bbung

bbung commented Aug 20, 2020

Hi,

I've been in contact with Azure Support and they "disabled the fix now". I've tested it and it now works.

Maybe it would be a good idea to make the azurerm provider capable of setting the ARM Template Version, as the older version works fine. I will look into it and maybe create a feature request.

Cheers,
Björn

@hkailantzis
Author

Hi @bbung, thanks for the feedback! What exactly does 'disabled the fix now' mean? Is this per account and not global? We re-ran TF plan with v2.20.0 and we are still experiencing the same error...

@bbung

bbung commented Aug 20, 2020

Hey @hkailantzis,

that is what they said, and it works for me now. I do not know if it is somehow account-specific, sorry!

I would suggest raising a support request with MS. There should be a free plan option if you do not have a paid plan.

Cheers

@tombuildsstuff tombuildsstuff added the upstream/microsoft Indicates that there's an upstream issue blocking this issue/PR label Aug 20, 2020
@tombuildsstuff
Contributor

cc @JeffreyRichter - this is an example of a breaking behavioural change to the CosmosDB API

@JeffreyRichter

If I read this right, this seems more like an unintentional bug than an intentional breaking change. Am I reading this right?

@avdhut1990

Hi @bbung, @hkailantzis, would it be possible to share the details that need to be given to Azure support to resolve this issue?

Thanks.

@hkailantzis
Author

hi @avdhut1990, just sharing the TF error and the link to this issue would be enough for starters.

@hkailantzis
Author

hkailantzis commented Aug 24, 2020

OK, so we got a reply from Azure support regarding our current DB collections:
Please note that for Mongo version 3.6 accounts with API version 2020-04-01 and later, "_id" is a required index. An index with a single key, equal to _id should be present: "key": {"keys": ["_id"]} Please update your workload to ensure the collection contains this required index.
We're going to try this and see how it goes.
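
For illustration, the requirement quoted above might translate into Terraform configuration roughly as follows. This is a minimal sketch, not the reporter's actual configuration (which was not posted): the resource group name is hypothetical, the account, database and collection names are taken from the error message in this issue, and `unique = true` is an assumption.

```hcl
# Sketch only: "example-rg" is a hypothetical resource group name; the
# account, database and collection names mirror the error message above.
resource "azurerm_cosmosdb_mongo_collection" "test" {
  name                = "test"
  resource_group_name = "example-rg" # hypothetical
  account_name        = "cosmos-db-test"
  database_name       = "test"

  # The single-key "_id" index that Azure support describes as required
  # for Mongo 3.6 accounts with API version 2020-04-01 and later.
  index {
    keys   = ["_id"]
    unique = true # assumption for illustration
  }
}
```

Any other indexes the workload needs would go in additional `index` blocks alongside this one.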

@tombuildsstuff
Contributor

@JeffreyRichter I mean this comment sounds like a breaking behavioural change to me - there's no documentation to this effect that I've seen?

This is a good example of a Generic API housing multiple "specialized" products with incompatible payloads, which makes catching these breaking behavioural changes incredibly hard. Whilst technically this is a new API version (and so these changes are allowed), there's no context for whether a new API version is a breaking change or not, and no details of these behavioural changes (that I've seen in a consistent location) - compared to, say, Semver, where this is immediately clear?

@jackofallops
Member

> OK, so we got a reply from Azure support regarding our current DB collections:
> Please note that for Mongo version 3.6 accounts with API version 2020-04-01 and later, "_id" is a required index. An index with a single key, equal to _id should be present: "key": {"keys": ["_id"]} Please update your workload to ensure the collection contains this required index.
> We're going to try this and see how it goes.

Hi @hkailantzis - Did you manage to resolve the issue using the feedback from support? We're just discussing how to address this in the provider, since it's likely to represent a breaking change there to compensate for the apparent behavioural change in the API. Options include:

  • Don't fix, but add to the documentation that this key is now required to be included
  • Insert the new required key on the user's behalf if missing (possibly silently)
  • Make the index property Required and validate that the set includes the new required key

I think the last is the better option, despite representing a breaking change in the resource, as it keeps all the settings in the open rather than masking behaviour.

WDYT?

@hkailantzis
Author

hkailantzis commented Aug 27, 2020

Hi @jackofallops,

We tried to include it in our TF scripts and enforce it with tf apply -refresh=false, since tf plan was failing because it couldn't read the collections to begin with. However, this fix couldn't be applied to all collections because of issues with certain shard keys. In our case we managed to roll back to 2.19.0, so we're temporarily unblocked, and we have already planned to move away from Cosmos altogether :-). After that is done, we can upgrade to 2.20.0. In the end, we got a reply from Azure Support that a fix should be out in the next two days, but we don't have any more details.

My concern, however, is with future provider upgrades where the API is changing (e.g. postgres in v2.23.0, among others): something similar could happen with existing resources and breaking changes in the API...

@Oechiih

Oechiih commented Sep 2, 2020

@hkailantzis A colleague of mine experienced the exact same issue last night, so a fix has not yet been published. @jackofallops' suggestion makes sense to me.

@avdhut1990

Hi @hkailantzis , @jackofallops ,

We received the following RCA from MS:

Previously, our RP APIs for databases/collections of a Mongo 3.6 account had a few behavior discrepancies when compared to the corresponding data-plane APIs. We deployed a fix to eliminate these discrepancies so that the API fully adheres to Mongo 3.6 behavior. This fix caused 2 potential issues that some customers may have experienced:

  1. "Invalid index key specified" error.
    This is a bug on our side and unintentional. It affects some collections that had a _ts index. We fixed the bug so the customer should no longer experience it anymore.

  2. When specifying custom indexes on a Mongo 3.6 collection, an "_id" index is required. If custom indexes are specified but "_id" is not present, we will return a bad request error.
    Previously, we have allowed the "_id" index to be omitted for Mongo 3.6 accounts, but this is unintentional and causes discrepancy with data-plane API for index management. The fix we deployed now enforces "_id" index to be included.

They resolved the issue for our affected accounts from the backend, but suggested explicitly adding the "_id" key, which is needed only when using custom indexes.
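
The rule in point 2 of the RCA (custom indexes must include "_id") could also be checked on the configuration side before the request ever reaches Azure. The following is a rough sketch under several assumptions: the variable name `mongo_indexes`, its shape and the resource group name are hypothetical, and custom variable validation requires Terraform 0.13 or later, which is newer than the 0.12.x versions mentioned in this thread.

```hcl
# Hypothetical input variable; the name and shape are illustrative only.
variable "mongo_indexes" {
  description = "Custom indexes for the collection. Leave empty to use none."
  type = list(object({
    keys   = list(string)
    unique = bool
  }))
  default = []

  validation {
    # Accept an empty list, or a list that contains a single-key "_id" index.
    condition = length(var.mongo_indexes) == 0 || length([
      for idx in var.mongo_indexes : idx
      if length(idx.keys) == 1 && contains(idx.keys, "_id")
    ]) > 0
    error_message = "When custom indexes are specified, a single-key \"_id\" index must be included."
  }
}

resource "azurerm_cosmosdb_mongo_collection" "validated" {
  name                = "test"
  resource_group_name = "example-rg" # hypothetical
  account_name        = "cosmos-db-test"
  database_name       = "test"

  # One index block is generated per entry in var.mongo_indexes.
  dynamic "index" {
    for_each = var.mongo_indexes
    iterator = idx
    content {
      keys   = idx.value.keys
      unique = idx.value.unique
    }
  }
}
```

With this in place, a value such as `[{ keys = ["_id"], unique = true }, { keys = ["email"], unique = false }]` (where "email" is a made-up field) would pass validation, while a list containing only the "email" entry would be rejected at plan time.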

@jackofallops
Member

Thanks for the update @avdhut1990

We'll look to try and add some validation to the resource creation to help prevent this issue in future and update the docs to reflect the new stricter requirement on the service side.

@favoretti favoretti added good first issue and removed upstream/microsoft Indicates that there's an upstream issue blocking this issue/PR labels Aug 20, 2021
pivotal-marcela-campo added a commit to cloudfoundry/csb-brokerpak-azure that referenced this issue Nov 18, 2021
- explicitely enable mongo API through capabilities. Without this there is an
account recreation on update
- add required _id index. Some context for this change in hashicorp/terraform-provider-azurerm#8144 (comment)

[#180197227](https://www.pivotaltracker.com/story/show/180197227)
blgm pushed a commit to cloudfoundry/csb-brokerpak-azure that referenced this issue Nov 30, 2021
- explicitely enable mongo API through capabilities. Without this there is an
account recreation on update
- add required _id index. Some context for this change in hashicorp/terraform-provider-azurerm#8144 (comment)

[#180197227](https://www.pivotaltracker.com/story/show/180197227)
jimbo459 pushed a commit to cloudfoundry/csb-brokerpak-azure that referenced this issue Nov 30, 2021
* Fix issues with mongodb update

- explicitely enable mongo API through capabilities. Without this there is an
account recreation on update
- add required _id index. Some context for this change in hashicorp/terraform-provider-azurerm#8144 (comment)

[#180197227](https://www.pivotaltracker.com/story/show/180197227)

* test: mongodb update plan

[#180363944](https://www.pivotaltracker.com/story/show/180363944)

Co-authored-by: George Blue <[email protected]>
@github-actions github-actions bot added this to the v2.92.0 milestone Jan 10, 2022
@github-actions

This functionality has been released in v2.92.0 of the Terraform Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 13, 2022