[DocDB] yb-master crashes due to narrow cast during ListTabletServers #21096

Closed
1 task done
lingamsandeep opened this issue Feb 16, 2024 · 0 comments
Assignees
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@lingamsandeep
Contributor

lingamsandeep commented Feb 16, 2024

Jira Link: DB-10056

Description

07T16_26_17.pid1840.txt
F20240207 16:26:17 ../../src/yb/gutil/casts.cc:21] Bad narrow cast: 2205994749 > 2147483647
@ 0x55e4d34d1257 google::LogMessage::SendToLog()
@ 0x55e4d34d219d google::LogMessage::Flush()
@ 0x55e4d34d2819 google::LogMessageFatal::~LogMessageFatal()
@ 0x55e4d3cd4c69 yb::BadNarrowCast()
@ 0x55e4d3741898 yb::narrow_cast<>()
@ 0x55e4d3f163aa yb::master::(anonymous namespace)::MasterClusterServiceImpl::ListTabletServers()
@ 0x55e4d414a455 std::__1::__function::__func<>::operator()()
@ 0x55e4d414b33f yb::master::MasterClusterIf::Handle()
@ 0x55e4d44aaeda yb::rpc::ServicePoolImpl::Handle()
@ 0x55e4d43ea97f yb::rpc::InboundCall::InboundCallTask::Run()
@ 0x55e4d44b9a73 yb::rpc::(anonymous namespace)::Worker::Execute()
@ 0x55e4d4b8ab02 yb::Thread::SuperviseThread()
@ 0x7f8ebad27694 start_thread
@ 0x7f8ebb22941d __clone
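
To make the numbers in the log line concrete, here is a minimal sketch of the range check that the value fails. `FitsInInt32` is a hypothetical helper written for illustration; it is not YugabyteDB's `narrow_cast`, whose implementation is not shown in this issue. The day arithmetic assumes the value is a duration in milliseconds, as the fix description in the referenced commit suggests.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical range check (an assumption for illustration, not the
// actual yb::narrow_cast): true if value is representable as int32_t.
constexpr bool FitsInInt32(int64_t value) {
  return value >= std::numeric_limits<int32_t>::min() &&
         value <= std::numeric_limits<int32_t>::max();
}

constexpr int64_t kMillisPerDay = 24LL * 60 * 60 * 1000;  // 86,400,000
constexpr int64_t kValueFromLog = 2205994749LL;           // from the FATAL line

// The logged value is above INT32_MAX (2147483647), so the check fails.
static_assert(!FitsInInt32(kValueFromLog), "logged value exceeds int32");

// Assuming milliseconds: INT32_MAX covers 24 whole days, and the logged
// value corresponds to 25 whole days.
static_assert(std::numeric_limits<int32_t>::max() / kMillisPerDay == 24,
              "int32 max milliseconds is just under 25 days");
static_assert(kValueFromLog / kMillisPerDay == 25,
              "logged value is a little over 25 days");
```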

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@lingamsandeep lingamsandeep added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Feb 16, 2024
@lingamsandeep lingamsandeep self-assigned this Feb 16, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Feb 16, 2024
lingamsandeep added a commit that referenced this issue Feb 22, 2024
Summary:
ListTabletServers hits a FATAL with the following stack when a tablet server's last heartbeat was more than about 24 days ago, because the elapsed time in milliseconds no longer fits in an int32. This is unlikely, but it can happen.

So in ListTabletServers, if the time since the last heartbeat exceeds int32 max milliseconds, we clamp it to int32 max.

```
F20240207 16:26:17 ../../src/yb/gutil/casts.cc:21] Bad narrow cast: 2205994749 > 2147483647
@ 0x55e4d34d1257 google::LogMessage::SendToLog()
@ 0x55e4d34d219d google::LogMessage::Flush()
@ 0x55e4d34d2819 google::LogMessageFatal::~LogMessageFatal()
@ 0x55e4d3cd4c69 yb::BadNarrowCast()
@ 0x55e4d3741898 yb::narrow_cast<>()
@ 0x55e4d3f163aa yb::master::(anonymous namespace)::MasterClusterServiceImpl::ListTabletServers()
@ 0x55e4d414a455 std::__1::__function::__func<>::operator()()
@ 0x55e4d414b33f yb::master::MasterClusterIf::Handle()
@ 0x55e4d44aaeda yb::rpc::ServicePoolImpl::Handle()
@ 0x55e4d43ea97f yb::rpc::InboundCall::InboundCallTask::Run()
@ 0x55e4d44b9a73 yb::rpc::(anonymous namespace)::Worker::Execute()
@ 0x55e4d4b8ab02 yb::Thread::SuperviseThread()
@ 0x7f8ebad27694 start_thread
@ 0x7f8ebb22941d __clone
```
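
The clamp described in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `ClampMillisToInt32` is a hypothetical name, and the lower-bound clamp is an addition made here so the cast is always well defined (the summary only mentions the upper bound).

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (assumed name, not taken from the commit):
// converts a 64-bit millisecond duration to int32_t, saturating at the
// int32 limits instead of aborting on out-of-range input.
constexpr int32_t ClampMillisToInt32(int64_t millis) {
  return millis > std::numeric_limits<int32_t>::max()
             ? std::numeric_limits<int32_t>::max()
             : (millis < std::numeric_limits<int32_t>::min()
                    ? std::numeric_limits<int32_t>::min()
                    : static_cast<int32_t>(millis));
}

// The value from the FATAL line saturates to int32 max.
static_assert(ClampMillisToInt32(2205994749LL) == 2147483647,
              "out-of-range value is clamped");
// In-range values pass through unchanged.
static_assert(ClampMillisToInt32(1000) == 1000, "in-range value is kept");
```

With a saturating conversion like this, a caller sees int32 max for any heartbeat older than roughly 24 days, which loses precision for very stale servers but avoids the crash.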
Jira: DB-10056

Test Plan: MasterTest.TestRegisterAndHeartbeat

Reviewers: bkolagani, arybochkin

Reviewed By: bkolagani

Subscribers: ybase, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D32496
lingamsandeep added a commit that referenced this issue Feb 23, 2024
…ListTabletServers

Summary:
Original commit: aa2efd7 / D32496
ListTabletServers hits a FATAL with the following stack when a tablet server's last heartbeat was more than about 24 days ago, because the elapsed time in milliseconds no longer fits in an int32. This is unlikely, but it can happen.

So in ListTabletServers, if the time since the last heartbeat exceeds int32 max milliseconds, we clamp it to int32 max.

```
F20240207 16:26:17 ../../src/yb/gutil/casts.cc:21] Bad narrow cast: 2205994749 > 2147483647
@ 0x55e4d34d1257 google::LogMessage::SendToLog()
@ 0x55e4d34d219d google::LogMessage::Flush()
@ 0x55e4d34d2819 google::LogMessageFatal::~LogMessageFatal()
@ 0x55e4d3cd4c69 yb::BadNarrowCast()
@ 0x55e4d3741898 yb::narrow_cast<>()
@ 0x55e4d3f163aa yb::master::(anonymous namespace)::MasterClusterServiceImpl::ListTabletServers()
@ 0x55e4d414a455 std::__1::__function::__func<>::operator()()
@ 0x55e4d414b33f yb::master::MasterClusterIf::Handle()
@ 0x55e4d44aaeda yb::rpc::ServicePoolImpl::Handle()
@ 0x55e4d43ea97f yb::rpc::InboundCall::InboundCallTask::Run()
@ 0x55e4d44b9a73 yb::rpc::(anonymous namespace)::Worker::Execute()
@ 0x55e4d4b8ab02 yb::Thread::SuperviseThread()
@ 0x7f8ebad27694 start_thread
@ 0x7f8ebb22941d __clone
```
Jira: DB-10056

Test Plan: MasterTest.TestRegisterAndHeartbeat

Reviewers: bkolagani, arybochkin

Reviewed By: bkolagani

Subscribers: bogdan, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D32600
3 participants