Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] It is not safe to call YBOnPostgresBackendShutdown from quickdie #12884

Closed
d-uspenskiy opened this issue Jun 14, 2022 · 1 comment
Closed
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@d-uspenskiy
Copy link
Contributor

d-uspenskiy commented Jun 14, 2022

Jira Link: DB-2635

Description

The quickdie function is called as a signal handler. As a result this function should call async-signal-safe functions only.
The YBOnPostgresBackendShutdown function is too complex to be called from signal handler as this function destroys the pggate::PgApiImpl object and the destructor of this object locks mutexes.

Also it is not safe to destroy the pggate::PgApiImpl object because YSQL is a multi-threaded process. As a result quickdie can be called from any of the thread. As a result pggate::PgApiImpl object can be destroyed while it is still in use by the YSQL main thread.

Example:

YSQL's main thread calls YBCInsertSequenceTuple function

...
if (IsYugaByteEnabled())
{
    HandleYBStatus(YBCInsertSequenceTuple(...)); // This function uses the `pggate::PgApiImpl` object
}
...

at same time another thread handles SIGQUIT signal by calling quickdie.

As a result it is possible that quickdie will delete the pggate::PgApiImpl object while it is still in use.

@d-uspenskiy d-uspenskiy added kind/bug This issue is a bug area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jun 14, 2022
@yugabyte-ci yugabyte-ci added priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Jun 14, 2022
@mbautin
Copy link
Contributor

mbautin commented Jun 14, 2022

Can we set a flag or enqueue a task in the signal handler that will then cause shutdown to happen in a regular thread?

deeps1991 added a commit that referenced this issue Feb 21, 2023
Summary:
quickdie() is the signal handler invoked to handle SIGQUIT. This is the
immediate shutdown mode, expected to be invoked when something is really
wrong, such as corrupted shared memory.

This being the case, it is not safe to invoke
YBOnPostgresBackendShutdown, which internally invokes PgApi destructor
from quickdie. This also causes TSAN failures because there could be
other concurrent threads in process of accessing the PgApi object.

Secondly as noted in the comment in quickdie(), it is not safe to call
ereport() from quickdie as it is not async-signal-safe. It has been
added by PG authors anyway as a best effort to provide some information before shutting
down. Although acceptable, this again causes TSAN failures. Therefore
this patch silences the TSAN failures by not invoking the ereport() in
TSAN build.

Test Plan: Manually tested by running tests enabled with TSAN.

Reviewers: dmitry

Reviewed By: dmitry

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D23045
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

6 participants