Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a fast path for std::thread::panicking. #72617

Merged
merged 2 commits into from
Jun 26, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 60 additions & 13 deletions src/libstd/panicking.rs
Original file line number Diff line number Diff line change
Expand Up @@ -170,7 +170,7 @@ pub fn take_hook() -> Box<dyn Fn(&PanicInfo<'_>) + 'static + Sync + Send> {
fn default_hook(info: &PanicInfo<'_>) {
// If this is a double panic, make sure that we print a backtrace
// for this panic. Otherwise only print it if logging is enabled.
let backtrace_env = if update_panic_count(0) >= 2 {
let backtrace_env = if panic_count::get() >= 2 {
RustBacktrace::Print(backtrace_rs::PrintFmt::Full)
} else {
backtrace::rust_backtrace_env()
Expand Down Expand Up @@ -222,19 +222,65 @@ fn default_hook(info: &PanicInfo<'_>) {
#[cfg(not(test))]
#[doc(hidden)]
#[unstable(feature = "update_panic_count", issue = "none")]
pub fn update_panic_count(amt: isize) -> usize {
pub mod panic_count {
use crate::cell::Cell;
thread_local! { static PANIC_COUNT: Cell<usize> = Cell::new(0) }
use crate::sync::atomic::{AtomicUsize, Ordering};

// Panic count for the current thread.
thread_local! { static LOCAL_PANIC_COUNT: Cell<usize> = Cell::new(0) }

// Sum of panic counts from all threads. The purpose of this is to have
// a fast path in `is_zero` (which is used by `panicking`). Access to
// this variable can be always be done with relaxed ordering because
// it is always guaranteed that, if `GLOBAL_PANIC_COUNT` is zero,
// `LOCAL_PANIC_COUNT` will be zero.
Copy link
Member

@RalfJung RalfJung Jul 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if GLOBAL_PANIC_COUNT is zero, LOCAL_PANIC_COUNT will be zero.

This is a very strong statement to make, in particular about relaxed atomics! It took me some time to actually follow the argument here. You cannot, in general, relate the content of two different locations when you use relaxed accesses. In particular, "GLOBAL_PANIC_COUNT is zero" is an odd statement as atomics can appear to have different values in different threads. The only reason it works here is that one of the two is thread-local. I was also confused at first because decrease decrements GLOBAL first, so there is a time when GLOBAL is already zero for the current thread but LOCAL is still 1.

So what about

In any particular thread, if that thread currently views GLOBAL_PANIC_COUNT as being zero, then LOCAL_PANIC_COUNT in that thread is zero. This invariant holds before and after increase and decrease, but not necessarily during their execution.

Also, it would be good to explain in a comment why accessing an atomic global (potentially subject to contention) is faster than accessing a non-atomic thread local. That is rather non-intuitive IMO -- I would have expected the exact opposite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting a TLS pointer requires calling __tls_get_addr for ELF binaries when using the GD TLS model. This is not inlinable and may walk a linked list.

https://github.com/bminor/glibc/blob/5f72f9800b250410cad3abfeeb09469ef12b2438/elf/dl-tls.c#L821

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any particular thread, if that thread currently views GLOBAL_PANIC_COUNT as being zero, then LOCAL_PANIC_COUNT in that thread is zero. This invariant holds before and after increase and decrease, but not necessarily during their execution.

Yes, that would have been a better description. I wrote the comment thinking about the is_zero function, which is the only place where the value of GLOBAL_PANIC_COUNT matters.

Also, it would be good to explain in a comment why accessing an atomic global (potentially subject to contention) is faster than accessing a non-atomic thread local. That is rather non-intuitive IMO -- I would have expected the exact opposite.

You can compare here the assembly of a relaxed atomic load and a TLS read. The atomic load just needs a normal mov.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can compare here the assembly of a relaxed atomic load and a TLS read. The atomic load just needs a normal mov.

Assembly is basically gibberish to me.^^ But that's okay, a high-level explanation like @bjorn3 gave is totally sufficient. :)

So could you open a PR to include these comments, or should I do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So could you open a PR to include these comments, or should I do that?

I'll do it now.

static GLOBAL_PANIC_COUNT: AtomicUsize = AtomicUsize::new(0);

pub fn increase() -> usize {
GLOBAL_PANIC_COUNT.fetch_add(1, Ordering::Relaxed);
LOCAL_PANIC_COUNT.with(|c| {
let next = c.get() + 1;
c.set(next);
next
})
}

pub fn decrease() -> usize {
GLOBAL_PANIC_COUNT.fetch_sub(1, Ordering::Relaxed);
LOCAL_PANIC_COUNT.with(|c| {
let next = c.get() - 1;
c.set(next);
next
})
Comment on lines +249 to +254
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GLOBAL_PANIC_COUNT.fetch_sub(1, Ordering::Relaxed);
LOCAL_PANIC_COUNT.with(|c| {
let next = c.get() - 1;
c.set(next);
next
})
let next = LOCAL_PANIC_COUNT.with(|c| {
let next = c.get() - 1;
c.set(next);
next
});
GLOBAL_PANIC_COUNT.fetch_sub(1, Ordering::Relaxed);
next

LocalKey::with can itself panic:

pub fn with<F, R>(&'static self, f: F) -> R
where
F: FnOnce(&T) -> R,
{
self.try_with(f).expect(
"cannot access a Thread Local Storage value \
during or after destruction",
)
}

This will likely result in a double panic and thus process abort, but still.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK LocalKey::with can only panic if the type has a destructor (when accessing a TLS value that has already been dropped). This isn't the case here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like it is worth a comment in the code.

}

PANIC_COUNT.with(|c| {
let next = (c.get() as isize + amt) as usize;
c.set(next);
next
})
pub fn get() -> usize {
LOCAL_PANIC_COUNT.with(|c| c.get())
}

#[inline]
pub fn is_zero() -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should be #[inline], and a separate #[cold] function should be used for the slow path that is not inlined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the std::panicking::panicking function be also made #[inline]? Otherwise I'm not sure there will be much benefit.

The current call stack is:

  • std::thread::panicking, which is #[inline]
  • std::panicking::panicking, which is not #[inline]
  • std::panicking::panic_count::is_zero

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that should probably be inline too.

if GLOBAL_PANIC_COUNT.load(Ordering::Relaxed) == 0 {
// Fast path: if `GLOBAL_PANIC_COUNT` is zero, all threads
// (including the current one) will have `LOCAL_PANIC_COUNT`
// equal to zero, so TLS access can be avoided.
true
} else {
is_zero_slow_path()
}
}

// Slow path is in a separate function to reduce the amount of code
// inlined from `is_zero`.
#[inline(never)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[cold] should be enough, as a general rule we avoid using both cold and inline(never).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that way because that is how it is done here:

rust/src/libcore/option.rs

Lines 1262 to 1267 in 0b66a89

#[inline(never)]
#[cold]
#[track_caller]
fn expect_failed(msg: &str) -> ! {
panic!("{}", msg)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I guess it doesn't really matter in practice.

#[cold]
fn is_zero_slow_path() -> bool {
LOCAL_PANIC_COUNT.with(|c| c.get() == 0)
}
}

#[cfg(test)]
pub use realstd::rt::update_panic_count;
pub use realstd::rt::panic_count;

/// Invoke a closure, capturing the cause of an unwinding panic if one occurs.
pub unsafe fn r#try<R, F: FnOnce() -> R>(f: F) -> Result<R, Box<dyn Any + Send>> {
Expand Down Expand Up @@ -284,7 +330,7 @@ pub unsafe fn r#try<R, F: FnOnce() -> R>(f: F) -> Result<R, Box<dyn Any + Send>>
#[cold]
unsafe fn cleanup(payload: *mut u8) -> Box<dyn Any + Send + 'static> {
let obj = Box::from_raw(__rust_panic_cleanup(payload));
update_panic_count(-1);
panic_count::decrease();
obj
}

Expand Down Expand Up @@ -313,8 +359,9 @@ pub unsafe fn r#try<R, F: FnOnce() -> R>(f: F) -> Result<R, Box<dyn Any + Send>>
}

/// Determines whether the current thread is unwinding because of panic.
#[inline]
pub fn panicking() -> bool {
update_panic_count(0) != 0
!panic_count::is_zero()
}

/// The entry point for panicking with a formatted message.
Expand Down Expand Up @@ -452,7 +499,7 @@ fn rust_panic_with_hook(
message: Option<&fmt::Arguments<'_>>,
location: &Location<'_>,
) -> ! {
let panics = update_panic_count(1);
let panics = panic_count::increase();

// If this is the third nested call (e.g., panics == 2, this is 0-indexed),
// the panic hook probably triggered the last panic, otherwise the
Expand Down Expand Up @@ -514,7 +561,7 @@ fn rust_panic_with_hook(
/// This is the entry point for `resume_unwind`.
/// It just forwards the payload to the panic runtime.
pub fn rust_panic_without_hook(payload: Box<dyn Any + Send>) -> ! {
update_panic_count(1);
panic_count::increase();

struct RewrapBox(Box<dyn Any + Send>);

Expand Down
2 changes: 1 addition & 1 deletion src/libstd/rt.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@
#![doc(hidden)]

// Re-export some of our utilities which are expected by other crates.
pub use crate::panicking::{begin_panic, begin_panic_fmt, update_panic_count};
pub use crate::panicking::{begin_panic, begin_panic_fmt, panic_count};

// To reduce the generated code of the new `lang_start`, this function is doing
// the real work.
Expand Down