
Inadequately large Timeout's size #5348

Closed
loyd opened this issue Jan 5, 2023 · 3 comments · Fixed by #5468
Labels
A-tokio (Area: The main tokio crate) · C-bug (Category: This is a bug) · M-time (Module: tokio/time)

Comments

@loyd
Contributor

loyd commented Jan 5, 2023

Version
1.23.1

Description
I tried the following code:

async fn make_request() -> u32 { 42 }

async fn plain() -> u32 {
    make_request().await
}

async fn timeout() -> u32 {
    tokio::time::timeout(std::time::Duration::from_secs(1), make_request()).await.unwrap()
}

fn main() {
    println!("plain:   {}", std::mem::size_of_val(&plain()));   // => 2
    println!("timeout: {}", std::mem::size_of_val(&timeout())); // => 896
}

Debug and release builds print the same numbers.

This can affect performance even if timeout() is used only rarely (e.g. conditionally). For instance, it degraded the performance of my DB client by ~18% until I boxed the Sleep.
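
A minimal sketch of that kind of workaround (the names and the select!-based structure are illustrative, not my actual client code): boxing the Sleep means the enclosing future holds only a pointer instead of embedding the 128-byte-aligned timer state inline.

use std::time::Duration;

async fn make_request() -> u32 { 42 }

// Hypothetical variant of timeout() above: the Sleep is heap-allocated,
// so only a Pin<Box<Sleep>> (a single pointer) is held across the await.
async fn timeout_boxed() -> Option<u32> {
    let sleep = Box::pin(tokio::time::sleep(Duration::from_secs(1)));
    tokio::select! {
        res = make_request() => Some(res),
        _ = sleep => None, // deadline elapsed before the request finished
    }
}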

@loyd loyd added A-tokio Area: The main tokio crate C-bug Category: This is a bug. labels Jan 5, 2023
@w41ter

w41ter commented Jan 5, 2023

This is caused by the cache-line alignment of the timer entry.
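
A stand-in illustration of that effect (this is not tokio's actual type, just a repr(align(128)) wrapper in the spirit of CachePadded):

// Illustrative stand-in for a CachePadded-style wrapper.
#[repr(align(128))]
struct CachePadded<T>(T);

fn main() {
    // A type's size is rounded up to a multiple of its alignment, so each
    // padded 8-byte field occupies a full 128-byte slot.
    assert_eq!(std::mem::size_of::<u64>(), 8);
    assert_eq!(std::mem::size_of::<CachePadded<u64>>(), 128);
}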

@Darksonn Darksonn added the M-time Module: tokio/time label Jan 5, 2023
@dead-claudia

dead-claudia commented Jan 26, 2023

I've done some digging after noticing similar issues with tokio::time::Sleep requiring a whopping 640 bytes on x86-64, itself aligned to 128 bytes. Here's the memory layout I'm seeing:

  • 0-512: sleep.entry: TimerEntry: min alignment 128, size 8
    • 0-383: sleep.entry.inner: StdUnsafeCell<TimerShared>, repr(C) and so can't be re-ordered, min alignment 128, size 8
      • 0-127: sleep.entry.inner.driver_state: CachePadded<TimerSharedPadded>, min alignment 128, size 32
        • 0-15: sleep.entry.inner.driver_state.pointers: linked_list::Pointers<TimerShared>, min alignment 8, size 16
          • 0: sleep.entry.inner.driver_state.pointers.prev: Option<NonNull<TimerShared>>, min alignment 8, size 8
          • 8: sleep.entry.inner.driver_state.pointers.next: Option<NonNull<TimerShared>>, min alignment 8, size 8
        • 16-23: sleep.entry.inner.driver_state.cached_when: AtomicU64, min alignment 8, size 8
        • 24-31: sleep.entry.inner.driver_state.true_when: AtomicU64, min alignment 8, size 8
      • 128-383: sleep.entry.inner.state: StateCell, min alignment: 128, size 256
        • 128-135: sleep.entry.inner.state.state: AtomicU64, min alignment: 8, size 8
        • 136: sleep.entry.inner.state.result: UnsafeCell<TimerResult>, min alignment: 1, size 1
        • 256-383: sleep.entry.inner.state.waker: CachePadded<AtomicWaker>, min alignment: 128, size 128
          • 256-263: sleep.entry.inner.state.waker.state: AtomicUsize, min alignment: 8, size 8
          • 264-279: sleep.entry.inner.state.waker.waker: UnsafeCell<Option<Waker>>, min alignment: 8, size 16
    • 384-399: sleep.entry.driver: scheduler::Handle, min alignment: 8, size 16
    • 400-415: sleep.entry.initial_deadline: Option<Instant>, min alignment: 4, size 16
  • 512-639: sleep.inner: Inner: min alignment 128, size 12
    • 512-523: sleep.inner.deadline: Instant: min alignment 4, size 12

If you trim away the padding, here's all that's required:

  • 0-103: sleep.entry: TimerEntry: min alignment 8, size 8
    • 0-71: sleep.entry.inner: StdUnsafeCell<TimerShared>, repr(C) and so can't be re-ordered, min alignment 128, size 8
      • 0-31: sleep.entry.inner.driver_state: TimerSharedPadded, min alignment 8, size 32
        • 0-15: sleep.entry.inner.driver_state.pointers: linked_list::Pointers<TimerShared>, min alignment 8, size 16
          • 0: sleep.entry.inner.driver_state.pointers.prev: Option<NonNull<TimerShared>>, min alignment 8, size 8
          • 8: sleep.entry.inner.driver_state.pointers.next: Option<NonNull<TimerShared>>, min alignment 8, size 8
        • 16-23: sleep.entry.inner.driver_state.cached_when: AtomicU64, min alignment 8, size 8
        • 24-31: sleep.entry.inner.driver_state.true_when: AtomicU64, min alignment 8, size 8
      • 32-71: sleep.entry.inner.state: StateCell, min alignment: 128, size 256
        • 32-39: sleep.entry.inner.state.state: AtomicU64, min alignment: 8, size 8
        • 40-47: sleep.entry.inner.state.result: UnsafeCell<TimerResult>, min alignment: 1, size 1
        • 48-71: sleep.entry.inner.state.waker: AtomicWaker, min alignment: 8, size 24
          • 48-55: sleep.entry.inner.state.waker.state: AtomicUsize, min alignment: 8, size 8
          • 56-71: sleep.entry.inner.state.waker.waker: UnsafeCell<Option<Waker>>, min alignment: 8, size 16
    • 72-87: sleep.entry.driver: scheduler::Handle, min alignment: 8, size 16
    • 88-103: sleep.entry.initial_deadline: Option<Instant>, min alignment: 4, size 16
  • 104-115: sleep.inner: Inner: min alignment 4, size 12
    • 104-115: sleep.inner.deadline: Instant: min alignment 4, size 12

To be honest, I'm skeptical the cache padding yields any benefit here. If anything, I'd expect it to perform slower in this case than removing the cache alignment altogether, simply because it requires not one but five cache lines on x86-64 (and AArch64 and 64-bit PowerPC).
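
For anyone who wants to check the numbers on their own target, the size and alignment can be read off the public Sleep type without constructing a timer (exact values depend on the tokio version and architecture):

fn main() {
    // The figures above correspond to 640 / 128 on x86-64 with tokio 1.23.
    println!("Sleep size:  {}", std::mem::size_of::<tokio::time::Sleep>());
    println!("Sleep align: {}", std::mem::align_of::<tokio::time::Sleep>());
}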

@Darksonn
Contributor

I would be happy to remove the cache padding.
