Skip to content

Commit

Permalink
[#13600] DocDB: Fix Flakiness in the XClusterAlterUniverse* tests
Browse files Browse the repository at this point in the history
Summary:
The tests XClusterAlterUniverseAdminCliTest.TestAlterUniverseReplication* were failing because they were performing an AlterUniverseReplication using yb-admin which internally calls "IsSetupReplicationDone" on the XCluster universe with the .ALTER suffix. At the end of "AlterUniverseReplication", the universe with .ALTER is deleted and there is a code path which does not set the replication_error field in the PB correctly before returning. This results in the field retaining its default value and the yb-admin command failing with the following error:

Bad status: Runtime error (yb/util/subprocess.cc:615): Subprocess '/Users/sandeep/code/yugabyte-db/build/debug-clang-dynamic-arm64-ninja/tests-tools/../bin/yb-admin' terminated with non-zero exit status 256

The internal error is: Runtime error (yb/common/wire_protocol.cc:253): (999 unknown) which comes the default value of AppStatusPB error code of 999.

Fixed this by setting the replicationerror correctly

Test Plan: ybd --cxx-test yb-admin-test_ent --gtest_filter XClusterAlterUniverseAdminCliTest.TestAlterUniverseReplication -n 10

Reviewers: nicolas, rahuldesirazu, hsunder

Reviewed By: hsunder

Subscribers: ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D18999
  • Loading branch information
lingamsandeep committed Aug 18, 2022
1 parent 6baa881 commit 011d3e6
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions ent/src/yb/master/catalog_manager_ent.cc
Original file line number Diff line number Diff line change
Expand Up @@ -6145,7 +6145,7 @@ Status CatalogManager::GetUniverseReplication(const GetUniverseReplicationReques
* Checks if the universe replication setup has completed.
* Returns Status::OK() if this call succeeds, and uses resp->done() to determine if the setup has
* completed (either failed or succeeded). If the setup has failed, then resp->replication_error()
* is also set.
* is also set. If it succeeds, replication_error() gets set to OK.
*/
Status CatalogManager::IsSetupUniverseReplicationDone(
const IsSetupUniverseReplicationDoneRequestPB* req,
Expand All @@ -6168,7 +6168,11 @@ Status CatalogManager::IsSetupUniverseReplicationDone(
// If the universe was deleted, we're done. This is normal with ALTER tmp files.
if (s.IsNotFound()) {
resp->set_done(true);
return isAlterRequest ? Status::OK() : s;
if (isAlterRequest) {
s = Status::OK();
StatusToPB(s, resp->mutable_replication_error());
}
return s;
}
RETURN_NOT_OK(s);
if (universe_resp.has_error()) {
Expand Down

0 comments on commit 011d3e6

Please sign in to comment.