[DocDB] When dumping thread stacks, retry if the thread is in memory allocation/deallocation #17889

Open
1 task done
mbautin opened this issue Jun 21, 2023 · 0 comments
Labels: area/docdb (YugabyteDB core features), kind/enhancement (This is an enhancement of an existing feature), priority/medium (Medium priority issue)

mbautin commented Jun 21, 2023

Jira Link: DB-6969

Description

When dumping thread stacks, retry if the target thread is in the middle of a memory allocation or deallocation. We cannot capture a thread's stack in a signal handler while that thread is allocating or deallocating memory with Google TCMalloc.
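
To make the requested behavior concrete, here is a minimal sketch of a retry loop, not the actual YugabyteDB implementation. It assumes a hypothetical callback `try_capture` that returns an empty result when the target thread was interrupted inside the allocator; the names `StackTrace` and `CaptureWithRetry` and the attempt/delay parameters are illustrative assumptions.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <optional>
#include <string>
#include <thread>

// Hypothetical stand-in for a captured stack trace.
using StackTrace = std::string;

// Calls try_capture up to max_attempts times, sleeping for `delay` between
// attempts. try_capture is assumed to return std::nullopt when the target
// thread was in allocation/deallocation and no trace could be taken.
std::optional<StackTrace> CaptureWithRetry(
    const std::function<std::optional<StackTrace>()>& try_capture,
    int max_attempts,
    std::chrono::microseconds delay) {
  assert(max_attempts > 0);
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    if (auto trace = try_capture()) {
      return trace;  // Got a usable trace; stop retrying.
    }
    if (attempt + 1 < max_attempts) {
      // Give the thread time to leave the allocator before signaling again.
      std::this_thread::sleep_for(delay);
    }
  }
  return std::nullopt;  // Still in the allocator on every attempt.
}
```

A short delay between attempts is likely enough in practice, since individual allocations are brief, but the right number of attempts and the delay would need tuning.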

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
mbautin added the area/docdb and status/awaiting-triage labels Jun 21, 2023
mbautin self-assigned this Jun 21, 2023
yugabyte-ci added the kind/bug and priority/medium labels Jun 21, 2023
mbautin added a commit that referenced this issue Jun 23, 2023
Summary:
When trying to capture a stack trace with a signal handler, if a memory allocation/deallocation is happening in the thread receiving the signal, the process could crash. Google TCMalloc issue: google/tcmalloc#189.

In this diff, we are using the IsCurThreadInAllocDealloc malloc extension API we added in yugabyte/tcmalloc@677ba2d to skip capturing the stack trace in case the signal interrupted a thread that is currently allocating or deallocating memory. In such cases, we produce an empty stack trace which is later omitted from the overall threads dump. #17889 is a follow-up issue for retrying obtaining stack traces in such cases.
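
The skip logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the diff: the thread-local flag `tls_in_alloc` and the function `IsCurThreadInAllocDeallocSketch` are hypothetical stand-ins for the allocator's real per-thread state and the `IsCurThreadInAllocDealloc` extension, and the handler records an outcome instead of walking the stack. It assumes a POSIX system (`SIGUSR1`).

```cpp
#include <cassert>
#include <csignal>

// Hypothetical stand-in for the allocator's per-thread "inside malloc/free"
// state. The real check lives inside the TCMalloc fork.
thread_local volatile std::sig_atomic_t tls_in_alloc = 0;

constexpr int kNone = 0;
constexpr int kCaptured = 1;
constexpr int kSkipped = 2;

// Outcome of the most recent handler invocation.
volatile std::sig_atomic_t g_last_result = kNone;

bool IsCurThreadInAllocDeallocSketch() { return tls_in_alloc != 0; }

extern "C" void StackTraceSignalHandler(int /*signum*/) {
  if (IsCurThreadInAllocDeallocSketch()) {
    // Unwinding here could crash, so report an empty trace instead.
    g_last_result = kSkipped;
    return;
  }
  // A real handler would walk the stack here.
  g_last_result = kCaptured;
}

// Signals the current thread and reports whether a trace would be captured.
// simulate_in_alloc pretends the signal arrived during an allocation.
bool RequestStackTraceOfCurrentThread(bool simulate_in_alloc) {
  std::signal(SIGUSR1, StackTraceSignalHandler);
  tls_in_alloc = simulate_in_alloc ? 1 : 0;
  g_last_result = kNone;
  std::raise(SIGUSR1);  // The handler runs before raise() returns.
  tls_in_alloc = 0;
  return g_last_result == kCaptured;
}
```

The caller treats a skipped result as an empty stack trace and leaves that thread out of the dump.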

Another change contained in the TCMalloc version that we are upgrading to is yugabyte/tcmalloc@d1b0e69 (adding an option to not seed lifetime profiler with live allocations). We are now setting seed_with_live_allocs to false when capturing an allocation profile.

Test Plan: Jenkins

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: ybase, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D26349
yugabyte-ci removed the status/awaiting-triage label Jun 23, 2023
yugabyte-ci added the kind/enhancement label and removed the kind/bug label Sep 22, 2023

2 participants