[BUG] Plan upgrades for Azure SQL Failover group fail or partially complete #20
Comments
We have created an issue in Pivotal Tracker to manage this: https://www.pivotaltracker.com/story/show/177493831 The labels on this GitHub issue will be updated when the story is started. |
Some more information: If I create a new FOG DB service instance and try to update the plan, it will only update the plan for the primary DB in Azure. If I then manually update the plan for the secondary DB to match, and then create a new subsume instance for the existing DB, a subsequent plan update works successfully and updates the plans for both DBs in Azure. |
I'm not sure how this could be accomplished in Terraform. During creation, the primary has to be created first, then the secondary is created referencing the primary. During update, that order would have to be reversed, which seems impossible. Also, be careful when updating across SKU families; such updates are destructive in some cases. We'll do some investigation and see how this can be improved. |
@erniebilling A similar mechanism is implemented in Terraform for Postgres here |
I don't see any way to make this work in Terraform. The problem only occurs when you try to upgrade across DTU families (Basic to Standard, for instance). If I try to make the primary database depend on the secondary (so the secondary updates first), Terraform fails with a circular dependency. The best option at this point is to only offer plans within the same family, so upgrades can function. The other option would be for the azurerm provider to be aware of what is going on and do things in the right order, but I'm not sure if that is possible. |
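To illustrate why the dependency cannot simply be reversed, here is a minimal sketch using Python's standard graphlib module. It is an analogy for a dependency graph, not Terraform's actual implementation. The name primary_db comes from the error message in this issue; secondary_db is an assumed name for the other resource.

```python
from graphlib import CycleError, TopologicalSorter

# Each key depends on the resources in its set.
# Creation: the secondary references the primary, so the primary comes first.
creation_graph = {"secondary_db": {"primary_db"}}
creation_order = list(TopologicalSorter(creation_graph).static_order())
print("creation order:", creation_order)

# Upgrade: we would also want the primary to wait for the secondary.
# Adding that edge on top of the creation edge produces a cycle.
combined_graph = {
    "secondary_db": {"primary_db"},
    "primary_db": {"secondary_db"},
}
try:
    list(TopologicalSorter(combined_graph).static_order())
    cycle_detected = False
except CycleError:
    cycle_detected = True
print("cycle detected:", cycle_detected)
```

A single static graph cannot express "primary first on create, secondary first on upgrade", which is why the ordering probably has to be handled inside the provider or outside Terraform.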
@erniebilling The easiest way to fix this would be to implement it in the Azure Terraform provider. As mentioned before, I've implemented a similar pattern for Postgres! Please file an issue like this one; I am interested in fixing it. |
@erniebilling Even plan updates within the same family don't work as expected though, it only updates the primary DB. |
@claassen A bug in the azurerm provider prevented updating the secondary DB size. It looks like it has been fixed and will be included in the next release. |
Issue hashicorp/terraform-provider-azurerm#11282 has been closed, so I've taken a look to see whether this moves things forward. TLDR: it's fixed. I was able to re-create hashicorp/terraform-provider-azurerm#11282 using the following commands. There seems to have been a change in the names of the plans since this issue was raised. Also, the error message that I re-created was the same as in hashicorp/terraform-provider-azurerm#11282, but not precisely the same as the one at the top of this issue.
In a development environment I updated the Terraform provider (azurerm) to version 2.57.0 (which included the fix for hashicorp/terraform-provider-azurerm#11282). I also had to remove the Terraform line that sets the size of the secondary DB. This is because the size of the secondary DB should be computed, and the Terraform provider 2.57.0 now fails when this is specified. Line removal diff:
After building the brokerpak and deploying, I was unable to reproduce the issue using the steps documented above. Previously the error occurred 3 times in 10 updates, and now it occurred 0 times in 10 updates. This suggests that the latest version of the Terraform provider has fixed the issue. The next step will be to release a new version of the brokerpak. |
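As a rough sanity check on those numbers, here is a minimal sketch. It assumes each update is an independent trial and that the true failure rate before the fix equalled the observed 3 in 10; both are assumptions, not measurements from this issue. It estimates how likely 10 clean updates in a row would be if nothing had changed.

```python
# Assumption: the pre-fix failure rate equals the observed 3 failures in 10 updates.
failure_rate_before = 3 / 10
updates_after = 10

# Probability of zero failures in 10 independent updates if the bug were still present.
p_zero_failures = (1 - failure_rate_before) ** updates_after

print(f"P(0 failures in {updates_after} updates | unfixed) = {p_zero_failures:.4f}")
```

Under these assumptions the chance is a little under 3%, so the clean run is suggestive but not conclusive; more updates would tighten the estimate.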
Thanks @blgm! Nice to see that the work paid off 😄 |
@blgm Have the Azure provider 2.57 update and the other changes been merged? |
Hi @alex-tw-lam, they are not yet. |
@blgm can we keep the issue open? We want to be notified when a fix is merged. |
Upgrading within the same plan is now fixed in terms of actually updating both the primary and secondary DB instance, but still seeing issues upgrading from Basic to Standard tier with the latest release (1.0.0-rc39):
|
Yeah, tested that as well and didn't implement the solution yet, as it's not clear whether Azure supports it: https://docs.microsoft.com/en-us/azure/azure-sql/database/active-geo-replication-overview#upgrading-or-downgrading-primary-database. If you have a working example how to do it with |
@aristosvo I'm not familiar enough with the internal workings of Terraform to know how feasible this is, but it seems like the solution should be as simple as ensuring the secondary DB is updated before the primary. Our workaround for plan updates across tiers has been to manually update the secondary DB in Azure, followed by the primary. |
Okay, looks like I have a solution here, but I need to test this extensively as this may contain a few edge cases. I'm especially not experienced with elastic pools, so if you have experience scaling from single database to pools in FOG, let me know. |
I've created a new issue #47 to continue the conversation since this issue was closed. |
Describe the bug
When upgrading Azure SQL failover group service instance plans we noticed 2 cases where things do not complete as expected:
Error: waiting for creation of MsSql Database "csb-fog-db-5d4a9d17-2252-4561-b800-7c7817503682" (MsSql Server Name "azugessqlprod" / Resource Group "MFC-CAC-PCF-PSB"): Code="SourceDatabaseEditionCouldNotBeUpgraded" Message="The source database 'azugessqlprod.csb-fog-db-5d4a9d17-2252-4561-b800-7c7817503682' cannot have higher edition than the target database 'azugessqlprodcae.csb-fog-db-5d4a9d17-2252-4561-b800-7c7817503682'. Upgrade the edition on the target before upgrading source."
on main.tf line 15, in resource "azurerm_mssql_database" "primary_db":
15: resource "azurerm_mssql_database" "primary_db" {
exit status 1
To Reproduce
Steps to reproduce the behavior:
cf create-service csb-azure-mssql-db-failover-group Basic basic-db
cf update-service basic-db -p StandardS0
cf create-service csb-azure-mssql-db-failover-group StandardS0 standard-db
cf update-service standard-db -p StandardS1
Verify in Azure that both databases have been updated
Expected behavior
Plan updates should complete successfully and both databases in the failover group should be updated to the specified plan in Azure.
Additional context
BROKER_VERSION=0.2.3
BROKERPAK_VERSION=1.0.0-rc.38
According to Microsoft, when upgrading the service plan for databases in failover groups, the secondary database should be upgraded first, followed by the primary database. It is possible that the broker is not following this order, which would lead to these issues.
For the case where the update fails completely when upgrading from a Basic to a Standard tier plan, we were able to successfully update the plans directly in Azure afterwards by updating the secondary DB first.
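The ordering described above can be sketched as follows. This is a minimal, hypothetical helper and not code from the broker or the azurerm provider. The plan names come from the reproduction steps; the PLAN_RANK table and update_order function are assumptions made for illustration. The downgrade branch reflects my reading of Microsoft's geo-replication guidance (reverse the order when downgrading) and should be treated as an assumption too.

```python
# Hypothetical ranking of the plans used in this issue, lowest to highest.
PLAN_RANK = {"Basic": 0, "StandardS0": 1, "StandardS1": 2}


def update_order(current_plan: str, target_plan: str) -> list[str]:
    """Return which failover group database to update first."""
    current = PLAN_RANK[current_plan]
    target = PLAN_RANK[target_plan]
    if target > current:
        # Upgrade: the secondary must never end up smaller than the primary.
        return ["secondary", "primary"]
    if target < current:
        # Downgrade (assumption): reverse the order so the same rule holds.
        return ["primary", "secondary"]
    # Same plan: nothing to update.
    return []


print(update_order("Basic", "StandardS0"))
```

The rule in both branches is the same: at no point should the primary have a higher plan than the secondary, which is the condition the SourceDatabaseEditionCouldNotBeUpgraded error reports.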