[Bug]: Apply fails with error InvalidParameterValue: Serverless v2 maximum capacity 0.0 isn't valid. The maximum capacity must be at least 1.0. when capacity is not specified #40473

Closed
jackkates opened this issue Dec 6, 2024 · 17 comments · Fixed by #40511
Assignees
Labels
bug Addresses a defect in current functionality. regression Pertains to a degraded workflow resulting from an upstream patch or internal enhancement. service/rds Issues and PRs that pertain to the rds service.
Milestone

Comments

@jackkates

jackkates commented Dec 6, 2024

Terraform Core Version

1.0.8

AWS Provider Version

5.80.0

Affected Resource(s)

  • aws_rds_cluster

Expected Behavior

The apply should succeed even if config for ServerlessV2ScalingConfiguration is not specified.

Actual Behavior

The apply failed with an error saying that a serverless maximum capacity 0.0 isn't valid. However, we did not specify a serverless capacity. It appears an empty/null value is being converted into 0.0.

Relevant Error/Panic Output Snippet

│ Error: updating RDS Cluster (x): operation error RDS: ModifyDBCluster, https response error StatusCode: 400, RequestID: 224793e7-094b-48aa-a2c9-f986f28c6c1e, api error InvalidParameterValue: Serverless v2 maximum capacity 0.0 isn't valid. The maximum capacity must be at least 1.0.

│   with module.x.module.service_rds_cluster.aws_rds_cluster.postgresql_cluster,
│   on .terraform/modules/x.service_rds_cluster/main.tf line 13, in resource "aws_rds_cluster" "postgresql_cluster":
│   13: resource "aws_rds_cluster" "postgresql_cluster" {

Terraform Configuration Files

cannot provide

Steps to Reproduce

observed in our system, don't have exact repro steps

Debug Output

No response

Panic Output

No response

Important Factoids

The plan includes this change which I don't totally understand since as said before, our db is not serverless.
[image: screenshot of the plan output]

We saw the same change in the plan 1 week ago, but the apply was successful and only failed upon upgrade to v5.80.0

No response

References

#40230

potentially related to this recent PR since the failure started after upgrade to v5.80.0. Maybe the removal of the checks for 0 in that PR broke something or made it so you can't have null serverless settings?

Would you like to implement a fix?

None

@jackkates jackkates added the bug Addresses a defect in current functionality. label Dec 6, 2024
@github-actions github-actions bot added service/rds Issues and PRs that pertain to the rds service. needs-triage Waiting for first response or review from a maintainer. labels Dec 6, 2024
@jackkates
Author

cc @ohookins Hi, saw your PR reducing the limit to 0, any idea what might be going on here?

@ohookins
Contributor

ohookins commented Dec 6, 2024

It seems to correlate, although I don't see how it could be happening if you don't define that block at all:

    if d.HasChange("serverlessv2_scaling_configuration") {
        if v, ok := d.GetOk("serverlessv2_scaling_configuration"); ok && len(v.([]interface{})) > 0 && v.([]interface{})[0] != nil {
            input.ServerlessV2ScalingConfiguration = expandServerlessV2ScalingConfiguration(v.([]interface{})[0].(map[string]interface{}))
        }
    }

If the block isn't defined it shouldn't even try to do anything with it.

Can you share your config?

@pioneer2k

Maybe @jackkates made the same mistake that I did. I had a dynamic block for the serverless configuration that was deactivated for non-serverless databases. For the last few months I got the same "terraform plan" output, in which the capacity was to be changed to "null". Terraform Provider versions 5.79.0 and earlier accepted that and simply did nothing, so the next Terraform run showed me the same plan again.
To make it short: @jackkates should change his configuration so that the serverless configuration is always used, even for non-serverless databases, with a default like min=0.5 and max 1.0

@ewbankkit
Contributor

As this may strictly speaking be a regression (we aim to be 100% backwards compatible with existing configurations), could @pioneer2k or @jackkates add in a simple configuration that exhibits the problem. Thanks.

@pioneer2k

Here is my code block that I was using with version 5.79.0 and before.

dynamic "serverlessv2_scaling_configuration" {
      for_each = compact([var.serverless_v2_enabled])
      content {
          max_capacity  = var.serverless_v2_max_capacity
          min_capacity  = var.serverless_v2_min_capacity
      }
}

If var.serverless_v2_enabled was set, everything worked fine and max_capacity and min_capacity were set to valid values, even on non-serverless databases.
If var.serverless_v2_enabled was not set, terraform plan showed what you can see in the original issue (setting max_capacity and min_capacity to null), but that change was never applied, so terraform plan showed it on every run.
With version 5.80.0, setting max_capacity and min_capacity to null is no longer allowed (which is, strictly speaking, correct; the previous versions should have behaved the same way).

@jackkates
Author

jackkates commented Dec 6, 2024

Okay, I brushed up on Aurora Serverless v2 and now recall that they switched back to using EngineMode: provisioned for Serverless v2. So even a DB with all non-serverless instances could still have a ServerlessV2ScalingConfiguration. In fact, I see one on my database when I run aws rds describe-db-clusters on it.

How such a config got onto our database, I'm not sure (we don't have a dynamic block for this parameter). But it seems to be the same issue as described above: we saw that change in the plan, and now get an apply error when Terraform tries to remove(?) the config.

I was able to recover the API request Terraform made through CloudTrail and confirm that it sent a zero for both values:

    "requestParameters": {
        "dBClusterIdentifier": "x",
        "allowEngineModeChange": false,
        "allowMajorVersionUpgrade": false,
        "serverlessV2ScalingConfiguration": {
            "maxCapacity": 0,
            "minCapacity": 0
        },
        "applyImmediately": false,
        "dBInstanceParameterGroupName": "service-aurora-db-postgres14-tuned"
    },

It doesn't seem correct to zero the values out instead of removing them. I'm not sure whether you can actually remove the whole parameter; maybe you can call ModifyDBCluster with an empty object for that param?

@jackkates
Author

should change his configuration so that the serverless configuration is always used, even for non-serverless databases, with a default like min=0.5 and max 1.0

This sounds like it would solve it, although I think it is still a regression. I think that if you

  1. create an Aurora cluster with Terraform
  2. either through tf, or outside of tf, give it a ServerlessV2ScalingConfiguration parameter
  3. remove the ServerlessV2ScalingConfiguration from the Terraform config (or, if you applied it out of band, use exactly the same tf config)
  4. tf apply

then step 4 will succeed on v5.79.0 (it will just not send anything for the value of ServerlessV2ScalingConfiguration), but on v5.80.0 it will fail (sending an invalid value with a maximum capacity of 0).

I'll try to actually test that.

@jackkates
Author

jackkates commented Dec 6, 2024

I think that #40230 has a bug. I think the new scaling-to-0 capability only means that the minimum MinCapacity value is 0, while the minimum MaxCapacity value is still 1.0. Unfortunately the AWS docs are not clear on this matter, but the error message "Serverless v2 maximum capacity 0.0 isn't valid. The maximum capacity must be at least 1.0." definitely seems to suggest that.

The removal of the check for v != 0.0 seems to allow a null or empty value for MaxCapacity to be converted into an input of 0.0. I'm not very familiar with the code, so I'm not sure how the value comes out of tfMap or how the type conversion occurs on these lines: https://github.com/ohookins/terraform-provider-aws/blob/bbc3f5ae0fb2ae272b2d244dd73291c2cea8c3fd/internal/service/rds/cluster.go#L2139-L2141
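To make the suspected mechanism concrete, here is a minimal Go sketch. It is not the provider's actual code: scalingConfig, expandGuarded and expandUnguarded are hypothetical stand-ins, and it assumes that an unset block reaches the expander as a map whose numeric attributes hold the zero value.

```go
package main

import "fmt"

// scalingConfig is a hypothetical stand-in for the SDK's
// ServerlessV2ScalingConfiguration type. A nil field means
// "do not send this value".
type scalingConfig struct {
	MaxCapacity *float64
	MinCapacity *float64
}

// expandGuarded skips zero values, as the v != 0.0 check did.
func expandGuarded(tfMap map[string]interface{}) *scalingConfig {
	out := &scalingConfig{}
	if v, ok := tfMap["max_capacity"].(float64); ok && v != 0.0 {
		out.MaxCapacity = &v
	}
	if v, ok := tfMap["min_capacity"].(float64); ok && v != 0.0 {
		out.MinCapacity = &v
	}
	return out
}

// expandUnguarded copies any float64 that is present, including 0.0.
func expandUnguarded(tfMap map[string]interface{}) *scalingConfig {
	out := &scalingConfig{}
	if v, ok := tfMap["max_capacity"].(float64); ok {
		out.MaxCapacity = &v
	}
	if v, ok := tfMap["min_capacity"].(float64); ok {
		out.MinCapacity = &v
	}
	return out
}

func main() {
	// Assumption: unset attributes arrive as zero values, not as missing keys.
	unset := map[string]interface{}{"max_capacity": 0.0, "min_capacity": 0.0}

	// With the guard, MaxCapacity stays nil, so nothing would be sent.
	fmt.Println(expandGuarded(unset).MaxCapacity == nil)
	// Without the guard, 0.0 is passed through to the request.
	fmt.Println(*expandUnguarded(unset).MaxCapacity)
}
```

If that assumption holds, the guard is also what blocks a deliberate min_capacity of 0, which would explain why it was removed to support scaling to zero.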

@jackkates jackkates changed the title from "[Bug]: Apply fails with error InvalidParameterValue: Serverless v2 maximum capacity 0.0 isn't valid. The maximum capacity must be at least 1.0. on a non-serverless database" to "[Bug]: Apply fails with error InvalidParameterValue: Serverless v2 maximum capacity 0.0 isn't valid. The maximum capacity must be at least 1.0. when capacity is not specified" Dec 6, 2024
@jackkates
Author

jackkates commented Dec 6, 2024

I've updated the title/description to reflect that the issue isn't really that the database is non-serverless (a cluster can have a serverlessV2ScalingConfiguration set even if none of the instances are serverless). The issue seems to be that applying a config with no setting for serverlessV2ScalingConfiguration fails because the setting's absence is converted into 0.0, which is an invalid value.

This is kind of tricky because it seems like you may not actually be able to remove the setting on AWS. The docs say that if you don't set it, you can't add any serverless instances:

Until you specify the capacity range for your cluster, you can't add any Aurora Serverless v2 DB instances to the cluster using the AWS CLI or RDS API.

But they don't say anything about removing the setting. Therefore the tf provider should probably just not send any values; that seems like the only thing it can do. Sending 0.0, as started happening in v5.80.0, will just cause errors. (The user can manually add valid values, but it's still breaking a tf config that worked on the previous version.)

I think just restoring the restriction on MaxCapacity to be at least 0.5 would work: allow 0 for MinCapacity but not for MaxCapacity, to match AWS. (It seems that for MaxCapacity the actual valid minimum is 1.0, not 0.5, but that's a pre-existing bug and less of an issue, because it only affects someone who explicitly sets 0.5, unlike this issue.)
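The asymmetric rule proposed above could be sketched roughly as follows. validateCapacityRange is a hypothetical helper, not something from the provider. The 1.0 lower bound for the maximum is taken from the API error message quoted in this issue, and the 0 lower bound for the minimum from the scale-to-zero change. Upper bounds and step sizes are deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// validateCapacityRange is a hypothetical check that allows a minimum
// capacity of 0 but rejects a maximum capacity below 1.0.
func validateCapacityRange(minCapacity, maxCapacity float64) error {
	if minCapacity < 0 {
		return errors.New("min_capacity must not be negative")
	}
	if maxCapacity < 1.0 {
		return fmt.Errorf("max_capacity %.1f isn't valid: must be at least 1.0", maxCapacity)
	}
	return nil
}

func main() {
	// Minimum of 0 with a valid maximum is accepted.
	fmt.Println(validateCapacityRange(0, 1.0))
	// The 0/0 pair seen in the CloudTrail request is rejected before any API call.
	fmt.Println(validateCapacityRange(0, 0))
}
```

A check like this would only turn the API error into a plan-time error. It would not by itself stop the provider from sending zeros when the block is absent.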

@Chili-Man
Contributor

I asked AWS support about this issue and this is what they said:

Kindly know that this is an expected behavior where this parameter stays intact for a Serverless v2 cluster, even if all the serverless v2 instances have been removed and only provisioned instances are present, making the cluster a provisioned cluster now.

The following excerpt is mentioned in our AWS documentation [1]:

"Any capacity range that you previously specified for the cluster remains in place, even if all Aurora Serverless v2 DB instances are removed from the cluster. If you want to change the capacity range, you can modify the cluster, as explained in Setting the Aurora Serverless v2 capacity range for a cluster."

[1] Converting an Aurora Serverless v2 writer or reader to provisioned: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2-administration.html#aurora-serverless-v2-converting-to-provisioned

I tested the same in my test environment and got similar results. Unfortunately, there is no way to get around it for existing clusters. This has been raised with our internal teams and they have confirmed that this Serverless V2 configuration artifact cannot be removed even after changing all instances from serverless to provisioned. Please take note that the metadata of a cluster is generated upon creation. Hence, the issue is coming up.

The only way to update the metadata is to recreate the cluster. Please note that the auto-scaling metadata will have no impact on the current database cluster.

On behalf of AWS, I sincerely apologize for the inconvenience this design limitation might have caused you.

@justinretzolk justinretzolk added regression Pertains to a degraded workflow resulting from an upstream patch or internal enhancement. and removed needs-triage Waiting for first response or review from a maintainer. labels Dec 10, 2024
@terraform-aws-provider terraform-aws-provider bot added the prioritized Part of the maintainer teams immediate focus. To be addressed within the current quarter. label Dec 10, 2024
@YakDriver YakDriver self-assigned this Dec 10, 2024

@github-actions github-actions bot added this to the v5.81.0 milestone Dec 10, 2024
@github-actions github-actions bot removed the prioritized Part of the maintainer teams immediate focus. To be addressed within the current quarter. label Dec 12, 2024

This functionality has been released in v5.81.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@Fodoj

Fodoj commented Dec 16, 2024

There is another issue popping up in 5.81 now:

│ Error: updating RDS Cluster (xxxx): operation error RDS: ModifyDBCluster, https response error StatusCode: 400, RequestID: xxxxxxxx, api error InvalidParameterValue: Auto-pause isn't supported by engine version 16.1, please upgrade to a supported version first.

@pioneer2k

If you use the code mentioned in the first post of this issue, or directly set the minimum capacity of the serverlessv2 config to 0.0 (also called "auto-pause"), this is only allowed with specific engine versions of your database cluster. In your case you need to upgrade from 16.1 to 16.3. Please see the AWS docs: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2-auto-pause.html

@Fodoj

Fodoj commented Dec 16, 2024

@pioneer2k except that I don't even use Aurora Serverless. This instance used to be aurora serverless, but was later switched back to regular. It's a continuation of this bug - #32381 (I know that one is on AWS), but it's also something that was not happening pre 5.81.

@kelcya

kelcya commented Dec 17, 2024

I just ran into the same error that @Fodoj mentioned. Our instance was serverless and later converted back to instance based.

I am pinning the aws provider to 5.79.0 for now, not ideal.
