[YSQL] AbortTransaction PG core dump occurs in various scenarios #17172

Closed
1 task done
qvad opened this issue May 4, 2023 · 6 comments
Labels: area/ysql, kind/bug, priority/high, qa_automation, QA_Blocker, qa_stress

Comments

@qvad
Contributor

qvad commented May 4, 2023

Jira Link: DB-6447

Description

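We are already seeing this core dump in several stress-suite scenarios. The postgres backend aborts with SIGABRT raised from AbortTransaction (xact.c:2743), as the backtrace shows: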

(lldb) target create "/home/yugabyte/yb-software/yugabyte-2.19.0.0-b114-centos-x86_64/postgres/bin/postgres" --core "/home/yugabyte/cores/core_14000_1682836085_!home!yugabyte!yb-software!yugabyte-2.19.0.0-b114-centos-x86_64!postgres!bin!postgres"
Core file '/home/yugabyte/cores/core_14000_1682836085_!home!yugabyte!yb-software!yugabyte-2.19.0.0-b114-centos-x86_64!postgres!bin!postgres' (x86_64) was loaded.
(lldb) bt all
warning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
* thread #1, name = 'postgres', stop reason = signal SIGABRT
  * frame #0: 0x00007f66a1c310a7 libc.so.6`__GI_raise(sig=6) at raise.c:54
    frame #1: 0x00007f66a1c324aa libc.so.6`__GI_abort at abort.c:89
    frame #2: 0x000055e9f60772ba postgres`errfinish(dummy=<unavailable>) at elog.c:815:3
    frame #3: 0x000055e9f607f729 postgres`elog_start(filename="", lineno=2744, funcname=<unavailable>) at elog.c:1698:3
    frame #4: 0x000055e9f5a866c3 postgres`AbortTransaction at xact.c:2743:3
    frame #5: 0x000055e9f5a89b2a postgres`AbortCurrentTransaction at xact.c:0:4
    frame #6: 0x000055e9f5ecca8e postgres`PostgresMain(argc=<unavailable>, argv=<unavailable>, dbname=<unavailable>, username=<unavailable>) at postgres.c:5024:3
    frame #7: 0x000055e9f5e0f5de postgres`BackendRun(port=0x000055e9f91121e0) at postmaster.c:4676:2
    frame #8: 0x000055e9f5e0e6a0 postgres`ServerLoop [inlined] BackendStartup(port=0x000055e9f91121e0) at postmaster.c:4314:3
    frame #9: 0x000055e9f5e0e61a postgres`ServerLoop at postmaster.c:1774:7
    frame #10: 0x000055e9f5e09b15 postgres`PostmasterMain(argc=23, argv=0x000055e9f91283c0) at postmaster.c:1430:11
    frame #11: 0x000055e9f5d0ef0f postgres`PostgresServerProcessMain(argc=23, argv=0x000055e9f91283c0) at main.c:234:3
    frame #12: 0x000055e9f59d60b2 postgres`main + 34
    frame #13: 0x00007f66a1c1e825 libc.so.6`__libc_start_main(main=(postgres`main), argc=23, argv=0x00007ffe00acad88, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffe00acad78) at libc-start.c:289
    frame #14: 0x000055e9f59d5fc9 postgres`_start at start.S:108
  thread #2, stop reason = signal 0
    frame #0: 0x00007f66a25ac3b8 libpthread.so.0`pthread_cond_timedwait@@GLIBC_2.3.2 at pthread_cond_timedwait.S:225
    frame #1: 0x00007f66a288e29b libc++.so.1`std::__1::condition_variable::__do_timed_wait(std::__1::unique_lock<std::__1::mutex>&, std::__1::chrono::time_point<std::__1::chrono::system_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>) + 91
    frame #2: 0x00007f669df1f840 libyb_util.so`yb::(anonymous namespace)::LongOperationTrackerHelper::Execute() [inlined] std::__1::cv_status std::__1::condition_variable::wait_for<long long, std::__1::ratio<1l, 1000000000l>>(this=0x00007f669e0386f8, __lk=0x00007f6690015298, __d=<unavailable>) at __mutex_base:0:72
    frame #3: 0x00007f669df1f7b9 libyb_util.so`yb::(anonymous namespace)::LongOperationTrackerHelper::Execute(this=0x00007f669e0386b0) at long_operation_tracker.cc:111:19
    frame #4: 0x00007f669e00440c libyb_util.so`yb::Thread::SuperviseThread(void*) [inlined] std::__1::__function::__value_func<void ()>::operator(this=0x000055e9f911f5c0)[abi:v15007]() const at function.h:512:16
    frame #5: 0x00007f669e0043f6 libyb_util.so`yb::Thread::SuperviseThread(void*) [inlined] std::__1::function<void ()>::operator(this=0x000055e9f911f5c0)() const at function.h:1197:12
    frame #6: 0x00007f669e0043f6 libyb_util.so`yb::Thread::SuperviseThread(arg=0x000055e9f911f560) at thread.cc:842:3
    frame #7: 0x00007f66a25a7694 libpthread.so.0`start_thread(arg=0x00007f669001d700) at pthread_create.c:333
    frame #8: 0x00007f66a1ce441d libc.so.6`__clone at clone.S:109
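
For matching this core against dumps from other runs, a short fingerprint of the crashing thread can help. The following is only a minimal sketch, not part of any Yugabyte tooling: the helper names `crashing_frames` and `signature`, the choice to skip libc frames, and the depth of 4 are all assumptions made for illustration. It parses `bt all` text in the format shown above and keeps the top non-libc functions of the thread that lldb marks with `*`.

```python
import re

# Matches lldb frame lines such as:
#   frame #4: 0x000055e9f5a866c3 postgres`AbortTransaction at xact.c:2743:3
FRAME_RE = re.compile(r"frame #(\d+): 0x[0-9a-f]+ ([^`\s]+)`(.*)$")

# Excerpt of the backtrace from this issue, used as sample input.
SAMPLE = '''
* thread #1, name = 'postgres', stop reason = signal SIGABRT
  * frame #0: 0x00007f66a1c310a7 libc.so.6`__GI_raise(sig=6) at raise.c:54
    frame #1: 0x00007f66a1c324aa libc.so.6`__GI_abort at abort.c:89
    frame #2: 0x000055e9f60772ba postgres`errfinish(dummy=<unavailable>) at elog.c:815:3
    frame #3: 0x000055e9f607f729 postgres`elog_start(filename="", lineno=2744, funcname=<unavailable>) at elog.c:1698:3
    frame #4: 0x000055e9f5a866c3 postgres`AbortTransaction at xact.c:2743:3
    frame #5: 0x000055e9f5a89b2a postgres`AbortCurrentTransaction at xact.c:0:4
  thread #2, stop reason = signal 0
    frame #0: 0x00007f66a25ac3b8 libpthread.so.0`pthread_cond_timedwait@@GLIBC_2.3.2 at pthread_cond_timedwait.S:225
'''


def crashing_frames(bt_text):
    """Return (module, function) pairs for the thread lldb marks with '*'."""
    frames = []
    in_selected = False
    for line in bt_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("* thread #"):
            in_selected = True   # the selected (crashing) thread starts here
            continue
        if stripped.startswith("thread #"):
            in_selected = False  # any other thread ends the selected one
            continue
        if not in_selected:
            continue
        match = FRAME_RE.search(stripped)
        if match:
            # The function name ends at the first '(', ' at ' or ' + '.
            func = re.split(r"\(| at | \+ ", match.group(3), maxsplit=1)[0]
            frames.append((match.group(2), func.strip()))
    return frames


def signature(frames, skip_modules=("libc.so.6",), depth=4):
    """Join the top `depth` function names, ignoring frames from skipped modules."""
    funcs = [func for module, func in frames if module not in skip_modules]
    return " <- ".join(funcs[:depth])


if __name__ == "__main__":
    print(signature(crashing_frames(SAMPLE)))
```

Two cores that produce the same signature string are likely the same abort path. A different string would suggest a separate issue even if the test case is the same.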

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
qvad added the area/ysql and status/awaiting-triage labels on May 4, 2023
yugabyte-ci added the kind/bug and priority/medium labels on May 4, 2023
yugabyte-ci removed the status/awaiting-triage label on May 24, 2023
yugabyte-ci assigned tverona1 and unassigned m-iancu on May 24, 2023
@Karvy-yb

We are observing this issue more frequently on master (2.19.1) runs, across different test cases:

  • test_intensive_multi_tenancy_workload for version 2.19.1.0-b363
  • test_ysql_bank_operations_pessimistic_lock for version 2.19.1.0-b363
  • test_create_alter_delete_tables_vm_restarts on versions 2.19.1.0-b379 and 2.19.1.0-b389

cc: @robertsami

@shamanthchandra-yb

Observed this in my packed columns toggle on/off test case as well:

test_sql_packed_columns_toggle_on_and_off

@Karvy-yb

Karvy-yb commented Sep 4, 2023

We are observing this issue with other workloads as well.
The most recent failure was on 2.19.3.0-b53, in the test_intensive_multi_tenancy_workload test case, which runs the SqlIntensiveConsistencyDDL workload.

@Karvy-yb

Karvy-yb commented Oct 5, 2023

We are now observing this core dump in the test_create_alter_delete_tables_vm_restarts test case on 2.14.14, 2.16.8, 2.18.4 and 2.20.0. The test fails consistently with this core dump on every run.

qvad changed the title from "[YSQL] PG core dump occurs in bank workload test with PITR" to "[YSQL] AbortTransaction PG core dump occurs in various scenarios" on Oct 10, 2023
@sushantrmishra

Possibly a duplicate of #18192

@Arjun-yb
Contributor

Observed this issue on 2.14.16.0-b3 as well, in the test test_ysql_tablet_split_ps_restarts.

pilshchikov added the qa_stress label on Feb 11, 2024
pilshchikov added the qa_automation label on Aug 6, 2024