[Docdb] Master catalog corruption: Backup failed with RuntimeException: ERROR: function yb_catalog_version() does not exist #18507
We saw a repro of this bug on a recent customer cluster.
We can see that the V1, V2, V3 and V4 migration scripts were never applied. The function yb_catalog_version is created by the V1 migration script. I have found the bug; the relevant code is here:
This function tries to determine which version we should start from, for example, it checks the existence of the table pg_yb_catalog_version to decide whether we need to skip V1__ migration script or not. Let’s look at V1__ script:
This script has two blocks: the first creates the table pg_yb_catalog_version if it does not exist, and the second creates the function yb_catalog_version. In an old cluster where pg_yb_catalog_version already exists and is not empty, the above C++ code skips the V1 migration script entirely, so the second block that creates the function yb_catalog_version is never executed.
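The skip logic described above can be modeled in a few lines. This is only a sketch under stated assumptions: `CatalogState`, `ApplyV1`, `ShouldSkipV1` and `UpgradeAndCheckFunction` are hypothetical names for illustration, not functions from the YugabyteDB code base, and the catalog is reduced to two booleans.

```cpp
#include <cassert>

// Hypothetical in-memory stand-in for the two catalog objects that V1 creates.
struct CatalogState {
  bool catalog_version_table_has_rows = false;  // pg_yb_catalog_version exists and is non-empty
  bool has_catalog_version_function = false;    // yb_catalog_version() exists
};

// Models the two blocks of the V1 migration script.
void ApplyV1(CatalogState* state) {
  state->catalog_version_table_has_rows = true;  // block 1: create the table if missing
  state->has_catalog_version_function = true;    // block 2: create the function
}

// Models the detection rule: V1 counts as applied as soon as the table has rows.
// The function is never probed, which is the gap the issue describes.
bool ShouldSkipV1(const CatalogState& state) {
  return state.catalog_version_table_has_rows;
}

// Runs the simplified upgrade and reports whether the function exists afterwards.
bool UpgradeAndCheckFunction(CatalogState state) {
  if (!ShouldSkipV1(state)) {
    ApplyV1(&state);
  }
  return state.has_catalog_version_function;
}
```

With a state that has the table but not the function, `UpgradeAndCheckFunction` returns false; with an empty state it returns true, because V1 runs in full.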
Let’s look at what can happen when an old cluster 2.4.x is upgraded:
If, in a 2.4.x cluster, the pg_yb_catalog_version table isn’t empty, the pg_tablegroup table exists, the pg_stat_statements table exists, and the function jsonb_path_query exists, then we skip V1, V2, V3 and V4, and that is the symptom we see on the customer’s cluster. I started yugabyted on a 2.4 cluster, and all of the above are true:
That’s why we skipped the first 4 migration scripts, which led to this bug. Had the customer cluster been created on a 2.2 release, we would not have seen this bug, because none of the above is true:
Further analysis indicates that a cluster created on a release < 2.4 (e.g., 2.2 or earlier) or > 2.6 (e.g., 2.8 or later) does not have this bug. If < 2.4, none of those conditions is true, so the upgrade goes through V1, V2, ... without missing any. If > 2.6, all of the conditions hard-coded in C++ (V1 to V8) are true, so skipping all of V1 to V8 is correct.
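The analysis above amounts to counting how many leading catalog probes succeed. A minimal sketch, assuming the four probes named in this issue map to V1 through V4 in the order listed (that mapping, the `CatalogProbe` struct and `CountSkippedMigrations` are illustrative assumptions; V5 to V8 are left out because their probes are not described here):

```cpp
#include <array>
#include <cassert>

// Hypothetical snapshot of the four catalog features the issue names.
struct CatalogProbe {
  bool catalog_version_table_has_rows;  // assumed V1 probe
  bool has_pg_tablegroup;               // assumed V2 probe
  bool has_pg_stat_statements;          // assumed V3 probe
  bool has_jsonb_path_query;            // assumed V4 probe
};

// Counts how many leading probes succeed and stops at the first failure.
// The count is the number of migration scripts that would be skipped.
int CountSkippedMigrations(const CatalogProbe& c) {
  const std::array<bool, 4> probes = {
      c.catalog_version_table_has_rows, c.has_pg_tablegroup,
      c.has_pg_stat_statements, c.has_jsonb_path_query};
  int skipped = 0;
  for (bool present : probes) {
    if (!present) break;
    ++skipped;
  }
  return skipped;
}
```

A catalog where none of the probes is true yields 0, so every script runs. A catalog where all four are true yields 4, so V1 is skipped even if `yb_catalog_version()` was never created.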
According to the comment, those 8 (V1 to V8) migration scripts represent catalog features released before the ysql_upgrade feature landed. So if we see V1, V2, V3, V4 are not applied, it logically means that they represent features already present in the existing cluster and therefore they should be skipped. The only bug is that the second block in V1 which creates the function
Summary:
A customer reported that a backup failed because the function `yb_catalog_version()` does not exist. This function is introduced by the migration script `V1__3979__pg_yb_catalog_version.sql`. After debugging, I found a bug in the YSQL upgrade code. Specifically:

```
Result<int> GetMajorVersionFromSystemCatalogState(PGConn* pgconn) {
  int major_version = 0;

  // Helper macro removing boilerplate.
#define INCREMENT_VERSION_OR_RETURN_IT(oneliner_with_result) \
  if (VERIFY_RESULT(oneliner_with_result)) { \
    ++major_version; \
  } else { \
    return major_version; \
  }

  // V1: #3979 introducing pg_yb_catalog_version table.
  INCREMENT_VERSION_OR_RETURN_IT(SystemTableHasRows(pgconn, "pg_yb_catalog_version"))
```

This function determines which version the upgrade should start from. For example, it checks whether a non-empty table `pg_yb_catalog_version` exists to decide whether the V1 migration script can be skipped. The V1 migration script has two blocks:
* The first block introduces the table `pg_catalog.pg_yb_catalog_version`
* The second block introduces the function `yb_catalog_version()`

In a 2.4.x or 2.6.x cluster, `pg_catalog.pg_yb_catalog_version` already exists but `yb_catalog_version()` does not. As a result, the above code skips the V1 migration script, and that is why the function `yb_catalog_version` is never introduced.

I have verified that older releases < 2.4 (e.g., 2.2.7.0) and newer releases > 2.6 (e.g., 2.8.0.0) do not have this bug. In an older release < 2.4, `pg_catalog.pg_yb_catalog_version` does not exist, so the above code does not skip the V1 migration script. In a newer release > 2.6, `yb_catalog_version()` also exists, so it is correct to skip the V1 migration script.

The customer cluster was fixed manually by reapplying the V1 migration script.

The fix: after the normal migration has completed, check whether `yb_catalog_version()` exists. If it is missing, apply the V1 migration script.

Test Plan:
(1)
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#creatingSystemRelsByNonSuperuser'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#creatingSharedRelsCreatesThemEverywhere'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#creatingSharedRelsIsLikeInitdb'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#creatingSystemRelsIsLikeInitdb'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#creatingSystemRelsDontFireTriggers'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#creatingSystemRelsAfterFailure'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#sharedRelsIndexesWork'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#creatingSystemViewsIsLikeInitdb'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#viewReloptionsAreFilteredOnReplace'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#replacingViewKeepsCacheConsistent'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#insertOnConflictWithOidsWorks'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#dmlsUpdatePgCache'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#pinnedObjectsCacheIsUpdated'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#upgradeIsIdempotent'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#upgradeIsIdempotentSingleConn'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#migratingIsEquivalentToReinitdb'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#migrationInGeoPartitionedSetup'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#migrationFilenameComment'
./yb_build.sh release --sj --java-test 'org.yb.pgsql.TestYsqlUpgrade#invalidUpgradeActions'

(2) Download old release packages such as yugabyte-2.2.7.0, yugabyte-2.4.0.0, yugabyte-2.4.8.0, yugabyte-2.6.0.0, yugabyte-2.6.20.0, yugabyte-2.8.0.0 and yugabyte-2.8.12.0.

Run the following test manually against each of the old releases. For example:

```
./bin/yb-ctl create --timeout-yb-admin-sec 180 --rf 1 --master_flags initial_sys_catalog_snapshot_path=$HOME/tmp/yugabyte-2.4.0.0/share/initial_sys_catalog_snapshot
./build/latest/bin/yb-admin --timeout_ms=720000 upgrade_ysql
```

Look at yb-tserver.INFO:

```
W0118 17:16:55.246408 11555 ysql_upgrade.cc:438] Function yb_catalog_version is missing in template1
I0118 17:16:55.246526 11555 ysql_upgrade.cc:473] template1: applying migration 'V1__3979__pg_yb_catalog_version.sql'
I0118 17:16:55.246542 11555 ysql_upgrade.cc:481] Found pg_global in migration file V1__3979__pg_yb_catalog_version.sql when applying to template1
I0118 17:16:55.341861 11555 ysql_upgrade.cc:517] template1: migration successfully applied without version bump
W0118 17:16:55.478569 11555 ysql_upgrade.cc:438] Function yb_catalog_version is missing in template0
I0118 17:16:55.478693 11555 ysql_upgrade.cc:473] template0: applying migration 'V1__3979__pg_yb_catalog_version.sql'
I0118 17:16:55.563812 11555 ysql_upgrade.cc:517] template0: migration successfully applied without version bump
W0118 17:16:55.700487 11555 ysql_upgrade.cc:438] Function yb_catalog_version is missing in postgres
I0118 17:16:55.700606 11555 ysql_upgrade.cc:473] postgres: applying migration 'V1__3979__pg_yb_catalog_version.sql'
I0118 17:16:55.783318 11555 ysql_upgrade.cc:517] postgres: migration successfully applied without version bump
W0118 17:16:55.802189 11555 ysql_upgrade.cc:438] Function yb_catalog_version is missing in yugabyte
I0118 17:16:55.802299 11555 ysql_upgrade.cc:473] yugabyte: applying migration 'V1__3979__pg_yb_catalog_version.sql'
I0118 17:16:55.826416 11555 ysql_upgrade.cc:517] yugabyte: migration successfully applied without version bump
W0118 17:16:55.845325 11555 ysql_upgrade.cc:438] Function yb_catalog_version is missing in system_platform
I0118 17:16:55.845443 11555 ysql_upgrade.cc:473] system_platform: applying migration 'V1__3979__pg_yb_catalog_version.sql'
I0118 17:16:55.865423 11555 ysql_upgrade.cc:517] system_platform: migration successfully applied without version bump
```

Rerun the upgrade command:

```
./build/latest/bin/yb-admin --timeout_ms=720000 upgrade_ysql
```

Look at yb-tserver.INFO again:

```
I0118 17:34:51.982327 11558 ysql_upgrade.cc:443] Found function yb_catalog_version in template1
I0118 17:34:51.985905 11558 ysql_upgrade.cc:443] Found function yb_catalog_version in template0
I0118 17:34:51.989349 11558 ysql_upgrade.cc:443] Found function yb_catalog_version in postgres
I0118 17:34:51.992571 11558 ysql_upgrade.cc:443] Found function yb_catalog_version in yugabyte
I0118 17:34:51.997953 11558 ysql_upgrade.cc:443] Found function yb_catalog_version in system_platform
```

Reviewers: jason, tverona
Reviewed By: jason
Subscribers: yql
Differential Revision: https://phorge.dev.yugabyte.com/D31793
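The post-migration check described in the summary above, and the behavior seen on the second run of `upgrade_ysql`, can be sketched as follows. This is an illustration only: the `Database` struct and `RepairMissingCatalogVersionFunction` are hypothetical, and setting a boolean stands in for re-running the V1 script against a real database.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-database state; the real code queries each database instead.
struct Database {
  std::string name;
  bool has_yb_catalog_version;
};

// After the regular migrations, re-apply V1 wherever the function is missing.
// Returns the names of the databases that needed the repair, in input order.
std::vector<std::string> RepairMissingCatalogVersionFunction(
    std::vector<Database>* databases) {
  std::vector<std::string> repaired;
  for (Database& db : *databases) {
    if (!db.has_yb_catalog_version) {
      db.has_yb_catalog_version = true;  // stands in for re-running the V1 script
      repaired.push_back(db.name);
    }
  }
  return repaired;
}
```

Calling the function twice on the same vector mirrors the two log excerpts: the first call repairs every database that lacks the function, and the second call finds nothing left to repair.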
…exist
Same summary and test plan as the commit message above.
Original commit: 55aac07 / D31793
Jira: DB-7466
Reviewers: jason, tverona
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D31898
…exist
Same summary and test plan as the commit message above.
Original commit: 55aac07 / D31793
Jira: DB-7466
Reviewers: jason, tverona
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D31929
…exist
Same summary and test plan as the commit message above.
Original commit: 55aac07 / D31793
Jira: DB-7466
Reviewers: jason, tverona
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D31939
Summary:
To reproduce the bug, follow these steps:
1. Download the old release package for yugabyte-2.6.9.0
2. ./bin/yb-ctl create --rf 1 --master_flags initial_sys_catalog_snapshot_path=$HOME/tmp/yugabyte-2.6.9.0/share/initial_sys_catalog_snapshot
3. ./build/latest/bin/yb-admin --timeout_ms=720000 upgrade_ysql use_single_connection

Step 3 fails with an error like:

```
Error running upgrade_ysql: Internal error (yb/yql/pgwrapper/ysql_upgrade.cc:259): Unable to upgrade YSQL cluster: New connection is requested before the old one is released!
```

In YSQL upgrade single connection mode, we assert that no connection to a given database is already open when a new connection to it is requested. A recent commit 55aac07 introduced a bug that violates this assertion. The relevant code:

```
450   // Fix for #18507:
451   // This bug only shows up when upgrading from 2.4.x or 2.6.x, in these
452   // two releases the table pg_yb_catalog_version exists but the function
453   // yb_catalog_version does not exist. However we have skipped V1 because
454   // SystemTableHasRows(pgconn, "pg_yb_catalog_version") returns true.
455   for (auto& entry : databases) {
456     auto conn = VERIFY_RESULT(entry->GetConnection());
457     if (!VERIFY_RESULT(FunctionExists(conn.get(), "yb_catalog_version"))) {
458       LOG(WARNING) << "Function yb_catalog_version is missing in " << entry->database_name_;
459       // Run V1 migration script to introduce function "yb_catalog_version".
460       const Version version = {0, 0};
461       RETURN_NOT_OK(MigrateOnce(entry.get(), &version));
462     } else {
463       LOG(INFO) << "Found function yb_catalog_version in " << entry->database_name_;
464     }
465   }
```

In particular, line 456 acquires a connection. While `conn` is still in scope (and therefore still alive), `MigrateOnce` makes another call to `GetConnection` at line 474 below:

```
470 Status YsqlUpgradeHelper::MigrateOnce(DatabaseEntry* db_entry, const Version* historical_version) {
471   const auto& db_name = db_entry->database_name_;
472   const auto& version = historical_version ? *historical_version : db_entry->version_;
473
474   auto pgconn = VERIFY_RESULT(db_entry->GetConnection());
```

It is this second call to `GetConnection` that fails in single connection mode.

I fixed the bug by adding a new function `HasYbCatalogVersion`, which avoids making a second call to `GetConnection` before the first connection goes out of scope.

Jira: DB-11198

Test Plan:
(1) ./yb_build.sh release --java-test 'org.yb.pgsql.TestYsqlUpgrade'
(2) Ran the test described in the summary; the upgrade succeeded.

Reviewers: jason, kfranz
Reviewed By: jason
Subscribers: yql
Differential Revision: https://phorge.dev.yugabyte.com/D34805
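The scoping problem described above can be reproduced outside the upgrade code. The sketch below is a simplified model, not the actual implementation: `SingleConnectionSource`, `Migrate`, `BuggyRepair`, `ProbeFunctionMissing` and `FixedRepair` are hypothetical names, and a C++ exception stands in for the error status that the real code returns.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical single-connection source: hands out at most one live connection.
class SingleConnectionSource {
 public:
  struct Connection {};

  std::shared_ptr<Connection> Get() {
    if (!live_.expired()) {
      throw std::runtime_error(
          "New connection is requested before the old one is released!");
    }
    auto conn = std::make_shared<Connection>();
    live_ = conn;  // track the connection without keeping it alive
    return conn;
  }

 private:
  std::weak_ptr<Connection> live_;
};

// Stand-in for MigrateOnce: it requests its own connection.
void Migrate(SingleConnectionSource* source) { auto conn = source->Get(); }

// Buggy shape: the probe connection is still alive while Migrate asks for another.
bool BuggyRepair(SingleConnectionSource* source) {
  auto conn = source->Get();
  try {
    Migrate(source);
  } catch (const std::runtime_error&) {
    return false;
  }
  return true;
}

// Fixed shape: the probe connection lives only inside the helper, so it is
// released before Migrate runs. The helper pretends the function is missing.
bool ProbeFunctionMissing(SingleConnectionSource* source) {
  auto conn = source->Get();
  return true;
}

bool FixedRepair(SingleConnectionSource* source) {
  if (ProbeFunctionMissing(source)) {
    Migrate(source);
  }
  return true;
}
```

Moving the probe into its own function is one way to guarantee that the first connection is released before the second is requested; the commit's `HasYbCatalogVersion` is described as serving that purpose, though its exact shape is not shown here.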
…connection mode
Summary and Test Plan: same as the original commit (edfb99f / D34805) above.
Jira: DB-11198
Original commit: edfb99f / D34805
Reviewers: jason, kfranz
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D34837
…nnection mode
Summary and Test Plan: same as the original commit (edfb99f / D34805) above.
Jira: DB-11198
Original commit: edfb99f / D34805
Reviewers: jason, kfranz
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D34846
…nnection mode
Summary and Test Plan: same as the original commit (edfb99f / D34805) above.
Jira: DB-11198
Original commit: edfb99f / D34805
Reviewers: jason, kfranz
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D34869
…nnection mode
Summary and Test Plan: same as the original commit (edfb99f / D34805) above.
Jira: DB-11198
Original commit: edfb99f / D34805
Reviewers: jason, kfranz
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D34885
Jira Link: DB-7466