CDCSDKConsistentStreamTest#TestCDCSDKConsistentStreamWithTabletSplit fails intermittently #20778

Closed
yugabyte-ci opened this issue Jan 25, 2024 · 0 comments
Assignees
Labels
area/cdcsdk (CDC SDK), jira-originated, kind/bug (This issue is a bug), priority/high (High Priority)

Comments

yugabyte-ci commented Jan 25, 2024

Jira Link: DB-9776

yugabyte-ci added the area/cdcsdk, jira-originated, kind/bug, priority/high, and status/awaiting-triage labels on Jan 25, 2024
yugabyte-ci removed the status/awaiting-triage label on Jan 30, 2024
devansh-ism pushed a commit that referenced this issue Mar 7, 2024
Summary:
Several test runs failed because the FlushTables client call timed out, which was
caused by https://yugabyte.atlassian.net/browse/DB-9925. The fix for the root issue
landed as part of https://phorge.dev.yugabyte.com/D32215.

This diff adds a retry function as a wrapper over the FlushTables call so that it retries on a specific error.
Jira: DB-9776
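
The retry wrapper described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the diff: `Result`, `RetryOnTimeout`, the choice of a timeout as the retryable error, and the attempt and delay parameters are all assumptions made for the example.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Hypothetical stand-in for a status type; NOT YugabyteDB's Status class.
struct Result {
  enum class Code { kOk, kTimedOut, kOther };
  Code code;
  std::string message;
  bool ok() const { return code == Code::kOk; }
};

// Calls `op` until it succeeds, fails with a non-retryable error, or
// `max_attempts` calls have been made. Only Code::kTimedOut is retried
// (assumption: the "specific error" is a timeout).
Result RetryOnTimeout(const std::function<Result()>& op,
                      int max_attempts,
                      std::chrono::milliseconds delay) {
  Result last{Result::Code::kOther, "no attempts made"};
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    last = op();
    if (last.ok() || last.code != Result::Code::kTimedOut) {
      return last;  // Success or a non-retryable error: stop immediately.
    }
    if (attempt < max_attempts) {
      std::this_thread::sleep_for(delay);  // Back off before the next try.
    }
  }
  return last;  // Still timing out after the final attempt.
}
```

In a test, the flush call would be passed in as the `op` callable, so a transient timeout does not fail the whole run while any other error still surfaces on the first attempt.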

Test Plan: Jenkins: test regex: .*CDCSDK.*

Reviewers: skumar, stiwary, siddharth.shah

Reviewed By: skumar

Subscribers: ycdcxcluster

Differential Revision: https://phorge.dev.yugabyte.com/D32781
jasonyb pushed a commit that referenced this issue Apr 23, 2024
…ter-merge

Merge YB master commit 1f64b6e titled

    [#20778] CDCSDK: Add retry logic over FlushTables calls in test

and committed 2024-03-07T13:26:11+05:30 into YB pg15.

- TestPgAlterTableChangePrimaryKey.java: YB master
  d5d7363 deletes this file, claiming
  relevant tests were moved to regress test yb_alter_table_rewrite.  YB
  pg15 branch made minor adjustments such as message changes.  Delete
  it.
- TestPgReplicationSlot.java: YB master
  ddbe411 takes test
  replicationConnectionConsumption and turns it into helper
  testReplicationConnectionConsumption which is called by
  replicationConnectionConsumption and
  replicationConnectionConsumptionMultipleBatches later in the file.  YB
  pg15 417e9b3 ignores test
  replicationConnectionConsumption, so do the same ignore to both tests.
- parallel.c: YB master cfda8c7 changes
  spacing for enumblacklistlen, but that declaration was removed in YB
  pg15, so ignore it.
- tablecmds.c:
  - AlteredTableInfo: trivial conflict on adding extra fields at the
    bottom with YB pg15 and YB master
    d5d7363.
  - function declarations: YB master
    d5d7363 adds extra parameter
    yb_wqueue to two functions.  YB pg15 has several changes in the
    area.  Trivial resolution.
  - ATRewriteTables: trivial conflict between YB pg15 merge
    5627af5 and YB master
    d5d7363 regarding both sides adding
    params to function make_new_heap.
  - YbATSetPKRewriteChildPartitions: YB master
    d5d7363 adds heap_open/heap_close,
    but upstream PG switched to table_open/table_close.
- slot.c: conflict between YB pg15 merge
  417e9b3 and YB master
  e5dbd2e.  Resolution appears to be to
  change "slot" to "s" and take YB master's change.
- slotfuncs.c:
  - pg_create_logical_replication_slot: adjacent lines conflict between
    YB master e5dbd2e and YB pg15 merge
    ed96733.
  - pg_get_replication_slots: merge YB master
    e5dbd2e and YB pg15 merge
    ed96733.
- ipci.c: upstream PG 0bd305ee1d427ef29f5fa4fa20567e3b3f5ff792 moves
  code into CalculateShmemSize.  YB
  bda4da7 touches code there.  Move
  that code to the new location.
- pg_dump.c: YB master 5659b73 adds
  use_roles_sql logic while upstream PG changes the area.  Merge.
- event_trigger.h: YB master d5d7363
  adds YB_AT_REWRITE_ALTER_PRIMARY_KEY as 0x16 while upstream PG
  578b229718e8f15fa779e20f086c4b6bb3776106 and
  b0483263dda0824cc49e3f8a022dab07e1cdf9a7 modify definitions for 0x08.
  Trivial merge.
- pgstat.h/wait_event.h: YB master
  bda4da7 touches
  yb_pgstat_report_wait_start, but YB pg15 merge
  5627af5 moves the function to
  wait_event.h.  Apply the modification there, and also move the
  associated yb_ash.h include.
- pgstat.c/wait_event.c: YB master
  bda4da7 touches functions formerly in
  pgstat.c now moved to wait_event.c.  Apply the same changes to the new
  location using

      sed -i 's/YBEnableAsh()/yb_ash_enable_infra/' src/postgres/src/backend/utils/activity/wait_event.c

- guc.c: YB master bda4da7 adds ASH
  GUCs with config groups RESOURCES and STATS_COLLECTOR that upstream PG
  removes.  Choose config group STATS_MONITORING for both GUCs.
- pg15_tests/passing_tests.tsv:
  - remove org.yb.pgsql.TestPgAlterTableChangePrimaryKey from the list
    as that test was deleted by YB master
    d5d7363.
  - remove org.yb.pgsql.TestPgRegressReplicationSlot and
    CDCSDKYsqlTest.TestReplicationSlotDropWithActiveInvalid from the
    list because these tests start failing due to actual bugs.  The
    source of the bugs is unclear, but in order to push forward merge
    progress, leave deeper investigation for later.
- pg15_tests/test_regress_table.sh: YB master
  d5d7363 adds new regress test which
  suffers the same serial type bug as yb_alter_table.  Add it to the
  list of tests expected to fail.
jasonyb pushed a commit that referenced this issue Apr 23, 2024
…dd3f76d' into pg15

Test Plan:
Almalinux 8:

    #!/usr/bin/env bash
    set -eu
    ./yb_build.sh fastdebug --gcc11
    pg15_tests/run_tests.sh

Almalinux 8, fastdebug, gcc11, get the following results:

    0	2024-04-21T10:58:52-07:00	pgwrapper_pg_get_lock_status-test	PgGetLockStatusTest.TestGetLockStatusLimitNumTxnLocks
    0	2024-04-21T10:59:04-07:00	pgwrapper_pg_wait_on_conflict-test	PgWaitQueuesTest.TestDDLsNotBlockedOnWaiters
    0	2024-04-21T10:59:21-07:00	integration-tests_cdcsdk_ysql-test	CDCSDKYsqlTest.TestCDCStateEntryForReplicationSlot
    0	2024-04-21T11:00:33-07:00	pgwrapper_pg_wrapper-test	PgWrapperFlagsTest.*
    1	2024-04-21T11:30:42-07:00	JAVA	org.yb.pgsql.TestPgEncryption#testSslWithAuth[0]
    1	2024-04-21T12:02:00-07:00	JAVA	org.yb.pgsql.TestPgEncryption
    0	2024-04-21T12:02:11-07:00	master_clone_state_manager-test	CloneStateManagerTest.*
    0	2024-04-21T12:02:25-07:00	pgwrapper_pg_fkey-test	PgFKeyTest.SameTableReference
    1	2024-04-21T12:03:27-07:00	JAVA	org.yb.pgsql.TestPgRegressParallel#testPgRegressParallel
    1	2024-04-21T12:05:19-07:00	JAVA	org.yb.pgsql.TestPgParallelReadIsolation
    0	2024-04-21T12:05:50-07:00	integration-tests_wait_states-itest	WaitStateITest.*
    0	2024-04-21T12:10:29-07:00	integration-tests_wait_states-itest	*/AshTestVerifyOccurrence.VerifyWaitStateEntered/*
    0	2024-04-21T12:11:20-07:00	JAVA	org.yb.pgsql.TestYbBackup#testBackupRestoreRoles
    0	2024-04-21T12:12:11-07:00	JAVA	org.yb.pgsql.TestYbBackup#testBackupRolesWithoutUseRoles
    0	2024-04-21T12:13:01-07:00	JAVA	org.yb.pgsql.TestYbBackup#testBackupRolesWithoutRestoreRoles
    1	2024-04-21T12:14:44-07:00	JAVA	org.yb.pgsql.TestYsqlDump#ysqlDumpWithYbMetadata
    1	2024-04-21T12:15:24-07:00	JAVA	org.yb.pgsql.TestYsqlDump#ysqlDumpWithoutYbMetadata
    1	2024-04-21T12:16:21-07:00	JAVA	org.yb.pgsql.TestYsqlDump#ysqlDumpAllWithoutYbMetadata
    1	2024-04-21T12:17:18-07:00	JAVA	org.yb.pgsql.TestYsqlDump#ysqlDumpAllWithYbMetadata
    1	2024-04-21T12:18:19-07:00	JAVA	org.yb.pgsql.TestYsqlDump#ysqlDumpLegacyColocatedDB
    1	2024-04-21T12:19:09-07:00	JAVA	org.yb.pgsql.TestYsqlDump#ysqlDumpColocatedDB
    1	2024-04-21T12:19:29-07:00	JAVA	org.yb.pgsql.TestYbBackup#testExtensionBackupUsingTestExtension
    0	2024-04-21T12:20:42-07:00	JAVA	org.yb.pgsql.TestYbBackup#testIgnoreExistingTablespaces
    0	2024-04-21T12:20:59-07:00	pgwrapper_pg_wait_on_conflict-test	PgWaitQueuesTest.TestMultipleRequestsPerTxn
    0	2024-04-21T12:21:36-07:00	integration-tests_cdcsdk_ysql-test	CDCSDKYsqlTest.TestAtomicDDLDropColumn
    0	2024-04-21T12:22:04-07:00	integration-tests_xcluster_ysql-test	XClusterYsqlTest.DmlOperationsBlockedOnStandbyCluster
    0	2024-04-21T12:27:51-07:00	integration-tests_wait_states-itest	*
    0	2024-04-21T12:28:19-07:00	pgwrapper_pg_ash-test	PgAshTest.*
    0	2024-04-21T12:39:33-07:00	integration-tests_cdcsdk_consumption_consistent_changes-test	CDCSDKConsumptionConsistentChangesTest.*
    1	2024-04-21T12:45:31-07:00	JAVA	org.yb.pgsql.TestPgRegressPlanner
    0	2024-04-21T12:45:46-07:00	client_snapshot-schedule-test	CloneFromScheduleTest.Clone
    0	2024-04-21T12:46:21-07:00	integration-tests_minicluster-snapshot-test	PgCloneTest.Clone
    8	2024-04-21T13:07:35-07:00	pgwrapper_pg_catalog_version-test	*

Almalinux 8, release, clang17, get the following results:

    0	2024-04-21T13:26:49-07:00	pgwrapper_pg_index_backfill-test	PgIndexBackfillTest.GinStress
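
For readers who want to tally listings like the ones above, here is a small sketch that parses the tab-separated lines and counts entries with a non-zero first column. It assumes the four columns are a numeric status, a timestamp, a test program, and a test filter, and that a non-zero status marks a failing entry. The names `TestResult`, `ParseResults`, and `CountNonZero` are hypothetical and not part of `pg15_tests/run_tests.sh`.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// One row of the listing (column meanings are assumed, see the lead-in).
struct TestResult {
  int status;
  std::string timestamp;
  std::string program;
  std::string filter;
};

// Parses tab-separated rows; rows that do not have the expected shape
// or whose first column is not a number are skipped.
std::vector<TestResult> ParseResults(const std::string& text) {
  std::vector<TestResult> results;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string status, timestamp, program, filter;
    if (!std::getline(fields, status, '\t') ||
        !std::getline(fields, timestamp, '\t') ||
        !std::getline(fields, program, '\t')) {
      continue;
    }
    std::getline(fields, filter);  // Remainder of the line.
    try {
      // std::stoi skips leading whitespace, so indented rows also parse.
      results.push_back({std::stoi(status), timestamp, program, filter});
    } catch (const std::exception&) {
      continue;  // Non-numeric status column.
    }
  }
  return results;
}

// Counts rows whose status column is not zero.
int CountNonZero(const std::vector<TestResult>& results) {
  int count = 0;
  for (const TestResult& r : results) {
    if (r.status != 0) ++count;
  }
  return count;
}
```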

Jenkins: rebase: pg15

Reviewers: aagrawal, tfoucher, xCluster, hsunder

Reviewed By: tfoucher

Subscribers: ybase, ycdcxcluster, yql

Differential Revision: https://phorge.dev.yugabyte.com/D34352