
[PITR] [Segmentation fault] tserver went down after snapshot created with core dumps #24084

Closed

agsh-yb opened this issue Sep 22, 2024 · 1 comment

Labels: 2024.2_blocker, area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), pitr, priority/medium (Medium priority issue), qa_stress (Bugs identified via Stress automation), QA (QA filed bugs)

Comments

agsh-yb (Contributor) commented Sep 22, 2024

Jira Link: DB-12978

Description

Version: 2.23.1.0-b206
The tserver went down and the following core dumps were found.
With nemesis enabled, the same cores are observed in subsequent runs.
Observed in the snapshot test case.

Logs are attached in the Jira ticket.

(lldb) target create "/home/yugabyte/tserver/bin/yb-tserver" --core "/home/yugabyte/cores/core_1373_1726629700_!home!yugabyte!yb-software!yugabyte-2.23.1.0-b206-almalinux8-aarch64!bin!yb-server"
Core file '/home/yugabyte/cores/core_1373_1726629700_!home!yugabyte!yb-software!yugabyte-2.23.1.0-b206-almalinux8-aarch64!bin!yb-server' (aarch64) was loaded.
(lldb) bt all
* thread #1, name = 'yb-tserver', stop reason = signal SIGSEGV
  * frame #0: 0x0000aaaac7369ffc yb-tserver`yb::ScopedRWOperation::ScopedRWOperation(yb::RWOperationCounter*, yb::StatusHolder const*, std::__1::chrono::time_point<yb::CoarseMonoClock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>> const&) [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__is_long[abi:ue170006](this=<unavailable>) const at string:1734:33
    frame #1: 0x0000aaaac7369ffc yb-tserver`yb::ScopedRWOperation::ScopedRWOperation(yb::RWOperationCounter*, yb::StatusHolder const*, std::__1::chrono::time_point<yb::CoarseMonoClock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>> const&) [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::basic_string(this="", __str=<unavailable>) at string:898:16
    frame #2: 0x0000aaaac7369ffc yb-tserver`yb::ScopedRWOperation::ScopedRWOperation(yb::RWOperationCounter*, yb::StatusHolder const*, std::__1::chrono::time_point<yb::CoarseMonoClock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>> const&) [inlined] yb::RWOperationCounter::resource_name(this=0x0000000000000388) const at operation_counter.h:95:12
    frame #3: 0x0000aaaac7369ffc yb-tserver`yb::ScopedRWOperation::ScopedRWOperation(this=0x0000ffff71714848, counter=0x0000000000000388, abort_status_holder=0x0000000000000000, deadline=0x0000ffff71714888) at operation_counter.cc:190:62
    frame #4: 0x0000aaaac6c5060c yb-tserver`yb::tablet::Tablet::MaxPersistentOpId(bool) const [inlined] yb::ScopedRWOperation::ScopedRWOperation(this=0x0000ffff71714848, counter=<unavailable>, deadline=0x0000ffff71714888) at operation_counter.h:140:9
    frame #5: 0x0000aaaac6c505fc yb-tserver`yb::tablet::Tablet::MaxPersistentOpId(bool) const [inlined] yb::tablet::Tablet::CreateScopedRWOperationBlockingRocksDbShutdownStart(this=0x0000000000000000, deadline=yb::CoarseTimePoint @ 0x0000ffff71714888) const at tablet.cc:3375:10
    frame #6: 0x0000aaaac6c505f8 yb-tserver`yb::tablet::Tablet::MaxPersistentOpId(this=0x0000000000000000, invalid_if_no_new_data=false) const at tablet.cc:3540:32
    frame #7: 0x0000aaaac6c7c458 yb-tserver`yb::tablet::TabletPeer::MaxPersistentOpId(this=<unavailable>) const at tablet_peer.cc:946:23
    frame #8: 0x0000aaaac6cdd548 yb-tserver`yb::tablet::TransactionParticipant::Impl::DoProcessRecentlyAppliedTransactions(this=0x000037753e138a00, retryable_requests_flushed_op_id=0x000037753e138d00, persist=<unavailable>) at transaction_participant.cc:2186:22
    frame #9: 0x0000aaaac6cdf3b0 yb-tserver`yb::tablet::TransactionParticipant::ProcessRecentlyAppliedTransactions() [inlined] yb::tablet::TransactionParticipant::Impl::ProcessRecentlyAppliedTransactions(this=<unavailable>) at transaction_participant.cc:1440:27
    frame #10: 0x0000aaaac6cdf384 yb-tserver`yb::tablet::TransactionParticipant::ProcessRecentlyAppliedTransactions(this=<unavailable>) at transaction_participant.cc:2629:17
    frame #11: 0x0000aaaac6c315c4 yb-tserver`yb::tablet::Tablet::RocksDbListener::OnFlushCompleted(this=0x000037753e4c1a98, (null)=<unavailable>, (null)=<unavailable>) at tablet.cc:503:34
    frame #12: 0x0000aaaac69603b0 yb-tserver`rocksdb::DBImpl::BackgroundCallFlush(rocksdb::ColumnFamilyData*) at db_impl.cc:2121:19
    frame #13: 0x0000aaaac69601b8 yb-tserver`rocksdb::DBImpl::BackgroundCallFlush(rocksdb::ColumnFamilyData*) [inlined] rocksdb::DBImpl::FlushMemTableToOutputFile(this=0x000037753e139480, cfd=<unavailable>, mutable_cf_options=0x0000ffff717155b0, made_progress=<unavailable>, job_context=0x0000ffff717153e8, log_buffer=0x0000ffff71714af8) at db_impl.cc:2008:3
    frame #14: 0x0000aaaac695fde0 yb-tserver`rocksdb::DBImpl::BackgroundCallFlush(rocksdb::ColumnFamilyData*) [inlined] rocksdb::DBImpl::BackgroundFlush(this=0x000037753e139480, made_progress=<unavailable>, job_context=0x0000ffff717153e8, log_buffer=0x0000ffff71714af8, cfd=<unavailable>) at db_impl.cc:3399:10
    frame #15: 0x0000aaaac695fde0 yb-tserver`rocksdb::DBImpl::BackgroundCallFlush(this=<unavailable>, cfd=<unavailable>) at db_impl.cc:3470:31
    frame #16: 0x0000aaaac694d684 yb-tserver`rocksdb::DBImpl::BGWorkFlush(db=<unavailable>) at db_impl.cc:3319:34 [artificial]
    frame #17: 0x0000aaaac6a65040 yb-tserver`std::__1::__function::__func<rocksdb::ThreadPool::StartBGThreads()::$_0, std::__1::allocator<rocksdb::ThreadPool::StartBGThreads()::$_0>, void ()>::operator()() at thread_posix.cc:133:5
    frame #18: 0x0000aaaac6a64f24 yb-tserver`std::__1::__function::__func<rocksdb::ThreadPool::StartBGThreads()::$_0, std::__1::allocator<rocksdb::ThreadPool::StartBGThreads()::$_0>, void ()>::operator()() [inlined] rocksdb::ThreadPool::StartBGThreads()::$_0::operator()(this=<unavailable>) const at thread_posix.cc:172:5
    frame #19: 0x0000aaaac6a64f24 yb-tserver`std::__1::__function::__func<rocksdb::ThreadPool::StartBGThreads()::$_0, std::__1::allocator<rocksdb::ThreadPool::StartBGThreads()::$_0>, void ()>::operator()() [inlined] decltype(std::declval<rocksdb::ThreadPool::StartBGThreads()::$_0&>()()) std::__1::__invoke[abi:ue170006]<rocksdb::ThreadPool::StartBGThreads()::$_0&>(__f=<unavailable>) at invoke.h:340:25
    frame #20: 0x0000aaaac6a64f24 yb-tserver`std::__1::__function::__func<rocksdb::ThreadPool::StartBGThreads()::$_0, std::__1::allocator<rocksdb::ThreadPool::StartBGThreads()::$_0>, void ()>::operator()() [inlined] void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:ue170006]<rocksdb::ThreadPool::StartBGThreads()::$_0&>(__args=<unavailable>) at invoke.h:415:5
    frame #21: 0x0000aaaac6a64f24 yb-tserver`std::__1::__function::__func<rocksdb::ThreadPool::StartBGThreads()::$_0, std::__1::allocator<rocksdb::ThreadPool::StartBGThreads()::$_0>, void ()>::operator()() [inlined] std::__1::__function::__alloc_func<rocksdb::ThreadPool::StartBGThreads()::$_0, std::__1::allocator<rocksdb::ThreadPool::StartBGThreads()::$_0>, void ()>::operator()[abi:ue170006](this=<unavailable>) at function.h:192:16
    frame #22: 0x0000aaaac6a64f24 yb-tserver`std::__1::__function::__func<rocksdb::ThreadPool::StartBGThreads()::$_0, std::__1::allocator<rocksdb::ThreadPool::StartBGThreads()::$_0>, void ()>::operator()(this=<unavailable>) at function.h:363:12
    frame #23: 0x0000aaaac73b5838 yb-tserver`yb::Thread::SuperviseThread(void*) [inlined] std::__1::__function::__value_func<void ()>::operator()[abi:ue170006](this=0x000037753eb635c0) const at function.h:517:16
    frame #24: 0x0000aaaac73b5824 yb-tserver`yb::Thread::SuperviseThread(void*) [inlined] std::__1::function<void ()>::operator()(this=0x000037753eb635c0) const at function.h:1168:12
    frame #25: 0x0000aaaac73b5824 yb-tserver`yb::Thread::SuperviseThread(arg=0x000037753eb63560) at thread.cc:866:3
    frame #26: 0x0000ffffa37c78b8 libpthread.so.0`start_thread + 392
    frame #27: 0x0000ffffa3823afc libc.so.6`thread_start + 12
  thread #2, stop reason = signal 0
    frame #0: 0x0000ffffa37cdc60 libpthread.so.0`pthread_cond_wait@@GLIBC_2.17 + 528
    frame #1: 0x0000aaaac73b8ce4 yb-tserver`yb::ThreadPool::DispatchThread(bool) [inlined] yb::ConditionVariable::Wait(this=0x000037753f848ed8) const at condition_variable.cc:80:12
    frame #2: 0x0000aaaac73b8cd8 yb-tserver`yb::ThreadPool::DispatchThread(this=0x000037753f848e00, permanent=true) at threadpool.cc:561:20
    frame #3: 0x0000aaaac73b5838 yb-tserver`yb::Thread::SuperviseThread(void*) [inlined] std::__1::__function::__value_func<void ()>::operator()[abi:ue170006](this=0x000037753fdcb5c0) const at function.h:517:16
    frame #4: 0x0000aaaac73b5824 yb-tserver`yb::Thread::SuperviseThread(void*) [inlined] std::__1::function<void ()>::operator()(this=0x000037753fdcb5c0) const at function.h:1168:12
    frame #5: 0x0000aaaac73b5824 yb-tserver`yb::Thread::SuperviseThread(arg=0x000037753fdcb560) at thread.cc:866:3
    frame #6: 0x0000ffffa37c78b8 libpthread.so.0`start_thread + 392
    frame #7: 0x0000ffffa3823afc libc.so.6`thread_start + 12
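
The backtrace shows yb::tablet::Tablet::MaxPersistentOpId being entered with this=0x0 (frames #5-#6), which then passes an invalid RWOperationCounter pointer into the ScopedRWOperation constructor where the SIGSEGV fires (frames #0-#3). The call originates in the RocksDB flush-completion listener via TransactionParticipant (frames #8-#11), which suggests the flush callback can run against a tablet pointer that has already been cleared. Below is a minimal standalone sketch (hypothetical names, not YugabyteDB code) of that failure mode and a null-guarded variant:

```cpp
// Illustrative sketch only -- hypothetical classes, not YugabyteDB's actual API.
// It mimics a flush-completion callback that holds a weak reference to a tablet:
// if the tablet is gone by the time the flush finishes, calling a member function
// without checking for null reproduces the kind of crash seen in frames #5-#11.
#include <iostream>
#include <memory>

struct OpId { long index = -1; };

class FakeTablet {
 public:
  OpId MaxPersistentOpId() const { return OpId{42}; }
};

class FakeFlushListener {
 public:
  explicit FakeFlushListener(std::weak_ptr<FakeTablet> tablet)
      : tablet_(std::move(tablet)) {}

  // Unsafe variant: locks the weak_ptr but never checks the result, so a flush
  // that completes after tablet shutdown dereferences a null pointer (segfault).
  void OnFlushCompletedUnsafe() {
    std::shared_ptr<FakeTablet> t = tablet_.lock();
    std::cout << t->MaxPersistentOpId().index << "\n";
  }

  // Guarded variant: bail out cleanly when the tablet has already been destroyed.
  void OnFlushCompletedGuarded() {
    if (std::shared_ptr<FakeTablet> t = tablet_.lock()) {
      std::cout << t->MaxPersistentOpId().index << "\n";
    } else {
      std::cout << "tablet already shut down; skipping\n";
    }
  }

 private:
  std::weak_ptr<FakeTablet> tablet_;
};

int main() {
  auto tablet = std::make_shared<FakeTablet>();
  FakeFlushListener listener(tablet);
  listener.OnFlushCompletedGuarded();  // prints 42
  tablet.reset();                      // simulate tablet shutdown before flush completes
  listener.OnFlushCompletedGuarded();  // prints the skip message instead of crashing
  return 0;
}
```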

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@agsh-yb agsh-yb added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Sep 22, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Sep 22, 2024
@agsh-yb agsh-yb added QA QA filed bugs pitr qa_stress Bugs identified via Stress automation labels Sep 22, 2024
rthallamko3 (Contributor) commented:

DUP of #24026

@rthallamko3 rthallamko3 removed the status/awaiting-triage Issue awaiting triage label Sep 23, 2024
@rthallamko3 rthallamko3 assigned es1024 and unassigned rthallamko3 Sep 23, 2024