Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: fix race condition in debug signal on exit #3528

Merged
merged 1 commit into from
Oct 27, 2015

Conversation

bnoordhuis
Copy link
Member

Before this commit, sending a SIGUSR1 at program exit could trigger a
hard to reproduce race condition where v8::Debug::DebugBreak(isolate)
got called when the isolate was in the process of being torn down.

A similar race condition is in theory possible when sending signals
to two threads simultaneously but I haven't been able to reproduce
that myself (and I tried, oh how I tried.)

This commit fixes the race condition by turning node_isolate into
a std::atomic and using it as an ad hoc synchronization primitive
in places where that is necessary.

R=@indutny?

CI: https://ci.nodejs.org/job/node-test-pull-request/615/

@bnoordhuis bnoordhuis added c++ Issues and PRs that require attention from people who are familiar with C++. debugger labels Oct 26, 2015
static void TryStartDebugger() {
// Call only async signal-safe functions here! Don't retry the exchange,
// it will deadlock when the thread is interrupted inside a critical section.
if (auto isolate = node_isolate.exchange(nullptr)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, our linter doesn't complain about this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would it? It's a common C++ idiom.

@indutny
Copy link
Member

indutny commented Oct 26, 2015

LGTM

@jasnell
Copy link
Member

jasnell commented Oct 26, 2015

LGTM

@bnoordhuis
Copy link
Member Author

OS X... ../src/node.cc:89:10: fatal error: 'atomic' file not found. bangs head against table

@jasnell
Copy link
Member

jasnell commented Oct 26, 2015

@bnoordhuis .. ugh.

@bnoordhuis
Copy link
Member Author

Hacked together a crummy workaround. I'll refine it further but first let's see if the basic approach works: https://ci.nodejs.org/job/node-test-pull-request/619/

@bnoordhuis
Copy link
Member Author

@bnoordhuis
Copy link
Member Author

Looking healthy. @indutny @jasnell Can I get a LGTM for 1b0ffcc?

@jasnell
Copy link
Member

jasnell commented Oct 26, 2015

Having to polyfill atomic makes me sad. Otherwise LGTM.

template <typename T>
struct atomic {
atomic() = default;
T exchange(T value) { return __sync_lock_test_and_set(&value_, value); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice polyfill you have here.

@indutny
Copy link
Member

indutny commented Oct 26, 2015

LGTM, if it works

Before this commit, sending a SIGUSR1 at program exit could trigger a
hard to reproduce race condition where `v8::Debug::DebugBreak(isolate)`
got called when the isolate was in the process of being torn down.

A similar race condition is in theory possible when sending signals
to two threads simultaneously but I haven't been able to reproduce
that myself (and I tried, oh how I tried.)

This commit fixes the race condition by turning `node_isolate` into
a `std::atomic` and using it as an ad hoc synchronization primitive
in places where that is necessary.

A bare minimum std::atomic polyfill is added for OS X because Apple
wouldn't be Apple if things just worked out of the box.

PR-URL: nodejs#3528
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@bnoordhuis bnoordhuis closed this Oct 27, 2015
@bnoordhuis bnoordhuis deleted the fix-debugger-race branch October 27, 2015 12:02
@bnoordhuis bnoordhuis merged commit 53e64bb into nodejs:master Oct 27, 2015
@thefourtheye
Copy link
Contributor

A bare minimum std::atomic polyfill is added for OS X because Apple
wouldn't be Apple if things just worked out of the box.

lol

@jasnell
Copy link
Member

jasnell commented Oct 28, 2015

@bnoordhuis ... is this something that should go into LTS or not?

bnoordhuis added a commit that referenced this pull request Oct 28, 2015
Before this commit, sending a SIGUSR1 at program exit could trigger a
hard to reproduce race condition where `v8::Debug::DebugBreak(isolate)`
got called when the isolate was in the process of being torn down.

A similar race condition is in theory possible when sending signals
to two threads simultaneously but I haven't been able to reproduce
that myself (and I tried, oh how I tried.)

This commit fixes the race condition by turning `node_isolate` into
a `std::atomic` and using it as an ad hoc synchronization primitive
in places where that is necessary.

A bare minimum std::atomic polyfill is added for OS X because Apple
wouldn't be Apple if things just worked out of the box.

PR-URL: #3528
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@jasnell
Copy link
Member

jasnell commented Oct 28, 2015

Landed in v4.x-staging in c85736e

rvagg pushed a commit to rvagg/io.js that referenced this pull request Oct 29, 2015
Before this commit, sending a SIGUSR1 at program exit could trigger a
hard to reproduce race condition where `v8::Debug::DebugBreak(isolate)`
got called when the isolate was in the process of being torn down.

A similar race condition is in theory possible when sending signals
to two threads simultaneously but I haven't been able to reproduce
that myself (and I tried, oh how I tried.)

This commit fixes the race condition by turning `node_isolate` into
a `std::atomic` and using it as an ad hoc synchronization primitive
in places where that is necessary.

A bare minimum std::atomic polyfill is added for OS X because Apple
wouldn't be Apple if things just worked out of the box.

PR-URL: nodejs#3528
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@rvagg rvagg mentioned this pull request Oct 29, 2015
bnoordhuis added a commit that referenced this pull request Oct 29, 2015
Before this commit, sending a SIGUSR1 at program exit could trigger a
hard to reproduce race condition where `v8::Debug::DebugBreak(isolate)`
got called when the isolate was in the process of being torn down.

A similar race condition is in theory possible when sending signals
to two threads simultaneously but I haven't been able to reproduce
that myself (and I tried, oh how I tried.)

This commit fixes the race condition by turning `node_isolate` into
a `std::atomic` and using it as an ad hoc synchronization primitive
in places where that is necessary.

A bare minimum std::atomic polyfill is added for OS X because Apple
wouldn't be Apple if things just worked out of the box.

PR-URL: #3528
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants