Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OOM/stack overflow crash observed when porting tokio 0.1 app to tokio 1.14 (uses tokio::process and grpcio) #4309

Closed
iitalics opened this issue Dec 9, 2021 · 19 comments
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug.

Comments

@iitalics
Copy link

iitalics commented Dec 9, 2021

Version
Tokio 1.14.0

Platform
Linux [...] SMP Thu Dec 9 04:33:29 UTC 2021 armv7l GNU/Linux
Rust 1.55

Description
I'm going to do my best to report an issue I've been trying to track down for about the last month. I've been porting a medium size application from tokio 0.1 (rustc 1.46) to tokio 1.14 (rustc 1.55), and am now seeing rare but reoccuring OOM crashes; it looks like a stack overflow, but I have little to go by from looking at the core dumps (comes from a running armv7 device). The port itself is not very substantial since most of the API is the same, but this crash is new. It happens infrequently but common enough to suggest something is wrong, so I'm inclined to believe a race condition is involved.

I have been able to identify the moment it crashes is during tokio::process::Child::wait_with_output(), which I invoke like this:

let cmd = tokio::process::Command::new(...)...;
futures::future::lazy(|_| cmd.spawn())
  .and_then(|mut child| {
    let mut stdin = child.stdin.take().unwrap();
    stdin
      .write_all(...)
      .and_then(move |_| child.wait_with_output())
  })
  .then(|result| ...);

I've observed that if I reimplement wait_with_output(), but drop the stdout/stderr handles after wait() has finished, the crash stops. But Rust community Discord users have suggested to me that this is most likely just a symptom of the underlying problem.

One factor that may be contributing to this crash is that I am using gprcio alongside tokio in the application. I've attempted to take care to not mix executors, yet I am still seeing this issue. I'm going to try to explain how grpcio is used for some extra context in diagnosing the problem.

I am instantiating the tokio runtime using tokio::runtime::Runtime::new(), and then passing the Handle to the gRPC service handler, which handles requests like this (simplified):

struct FooService {
  app: Arc<Mutex<Application>>,
  handle: tokio::runtime::Handle,
}

impl FooRpc for FooService {
  fn foo(&mut self, ctx: RpcContext, req: FooRequest, sink: UnarySink<FooResponse>) {
    let (resp_tx, resp_rx) = futures::channel::oneshot::channel();
    let resp_future = self.app.lock().unwrap().do_foo(req);
    self.handle.spawn(
      resp_future.map(|resp| resp_tx.send(resp))
    );
    ctx.spawn(
      resp_rx.then(|resp| sink.success(resp.unwrap()).map(|_| ()))
    );
  }
}

impl Application {
  fn do_foo(&mut self, req: FooRequest) -> impl Future<Output = FooResponse> { ... }
}

The gRPC server is started by calling grpcio::Server::start() in main() (not using #[tokio::main]), which then blocks to run the event loop by passing a futures::channel::oneshot to Rutime::block_on() that is only ever completed after ctrlc signal is caught.

Let me know if you need any other details and I will try my best to provide them.

Misc. notes:

  • There is no unsafe code.
  • I have unfortunately not been able to reproduce this crash on my local machine or without running a sophisticated stress-testing routine to provoke it.
  • While it looks like the crash is a stack overflow, the program uses very little memory under normal operating conditions, then suddenly crashes. I have tried to monitor the stack size with stacker but the logging statements inserted were making the crash significantly less frequent, leading me to further believe it is due to a race condition.
  • Users on the Rust community Discord suggested that this behavior could be related to tokio's "coop budgeting" interacting badly with how grpcio polls futures. I'm not an expert on executor internals so the implications here are beyond me.
@iitalics iitalics added A-tokio Area: The main tokio crate C-bug Category: This is a bug. labels Dec 9, 2021
@Darksonn
Copy link
Contributor

Darksonn commented Dec 9, 2021

The gRPC server is started by calling grpcio::Server::start() in main() (not using #[tokio::main]), which then blocks to run the event loop by passing a futures::channel::oneshot to Runtime::block_on() that is only ever completed after ctrlc signal is caught.

Can you elaborate on where exactly this block_on call occurs?

@Darksonn
Copy link
Contributor

Darksonn commented Dec 9, 2021

@ipetkov This appears to be related to tokio::process. Thoughts on the above?

@iitalics
Copy link
Author

iitalics commented Dec 9, 2021

Can you elaborate on where exactly this block_on call occurs?

It looks effectively like this:

fn entrypoint() -> Result<(), AppError> {
   let tokio_runtime = tokio::runtime::Runtime::new()?;
   ...
   grpcio_server.start();
   let shutdown_fut = ctrlc_rx.map(|_| Ok(()));
   tokio_runtime.block_on(shutdown_fut)
}

fn main() {
  std::process::exit(match entrypoint() {
    Ok(_) => 0,
    Err(_) => ...
  })
}

@Darksonn
Copy link
Contributor

Darksonn commented Dec 9, 2021

Right, so grpcio_server is started in the background and the start call does not actually block? If so, and if you have no other uses of block_on, then coop is not related.

@Darksonn
Copy link
Contributor

Darksonn commented Dec 9, 2021

How are you reimplementing wait_with_output?

@iitalics
Copy link
Author

iitalics commented Dec 9, 2021

How are you reimplementing wait_with_output?

Here is what was working (exact code): https://gist.github.com/iitalics/710401de72725fa4b6db27742124ae24

@Darksonn
Copy link
Contributor

Darksonn commented Dec 9, 2021

The broken version using wait_with_output appears incomplete, e.g. I am surprised that the borrow-checker is letting you return the write_all future from the and_then call when write_all only borrows stdin and stdin is in a local variable in the closure.

Anyway, it seems like one difference between the two snippets is that your original snippet using wait_with_output does not appear to be closing stdin before calling wait_with_output.

@iitalics
Copy link
Author

iitalics commented Dec 9, 2021

Right, that code is paraphrased, I forgot to make it borrow check compliant. I fix the borrow checking issue by using an async move block so its actually like this:

let fut = Box::pin(async move { stdin.write_all(input.as_bytes()).await });
fut.and_then(move |_| child.wait_with_output())

@iitalics
Copy link
Author

iitalics commented Dec 9, 2021

A version of communicate() which drops the stdout/stderr handles inside read_*_fut block also causes the crash to happen (at at that point it is nearly identical to wait_with_output())

@ipetkov
Copy link
Member

ipetkov commented Dec 10, 2021

First thing that comes to mind: have you tried splitting out the stdin.write and stdout.read into two separate futures? I suppose this is what you're essentially doing with the communicate() reimplementation, but I think the culprit could be that the call to child.wait_with_output is not started until write_all completes.

My guidance for piping input/output of a child process is always do them independently, otherwise if the child hangs because its stdout buffer is full (parent isn't reading from it yet!) it may stop reading its input, in which case the parent could get stuck writing to it. And if the parent is buffering data to send to the child with no back-pressure it could be holding that data for too long and running out of memory.

Could you try changing the code as follows and see if the OOM reproduces still?

let cmd = tokio::process::Command::new(...)...;
futures::future::lazy(|_| cmd.spawn())
  .and_then(|mut child| {
    let mut stdin = child.stdin.take().unwrap();
    tokio::spawn(async move { stdin.write_all(...).unwrap() });

    child.wait_with_output()
  })
  .then(|result| ...);

@iitalics
Copy link
Author

I'm testing this right now. But the input buffer is completely in memory beforehand (see how my communicate() takes String argument) --- so I would expect this to deadlock, not crash.

@iitalics
Copy link
Author

I've just observed the crash again when using the following:

async move {
    tokio::spawn(async move {
        stdin.write_all(input.as_bytes()).await
    });
    child.wait_with_output().await
}

@ipetkov
Copy link
Member

ipetkov commented Dec 10, 2021

Okay, thanks for testing that. Next thing that comes to mind: do you have an idea of how much data the child is writing back on stdout/stderr? Do you need all of that data present within the parent process or could that be routed elsewhere?

wait_with_output will basically buffer everything for as long as the child process continues to write data.

My first recommendation is to avoid having the parent read any streams it doesn't care about (e.g. redirect stderr to /dev/null or to a file directly). My second recommendation would be to change the application so that it can process the data in a more streaming fashion instead of capturing all the output up front, so that it can discard data as quickly as possible

@iitalics
Copy link
Author

I mentioned before that this crash only appeared after having upgraded the tokio infrastructure of the application. Before, there is no problem with collecting all subprocess output -- behavior I'm not interested in changing.

@Darksonn
Copy link
Contributor

We aren't saying that your application is doing something incorrect here — it certainly sounds like a bug in Tokio, but it would be helpful to narrow down the issue if you can test that the issue still happens even if you stop reading from the pipe once the vector becomes super large. Something like this:

let read_stderr_fut = async {
    let mut buf = Vec::new();
    while buf.len() < 50_000_000 {
        stderr.read_buf(&mut buf).await?;
    }
    std::io::Result::<_>::Ok(buf)
};

@Darksonn
Copy link
Contributor

I have opened a PR that changes wait_with_output to the version that doesn't crash for you, but I would still really want to understand what is going on here.

@iitalics
Copy link
Author

Running some tests today but its taking forever because the problem tends to go away when I add more logs. FWIW the largest stderr we ever see is <256 bytes.

@iitalics
Copy link
Author

iitalics commented Dec 14, 2021

Update: I was running a new test which was doing read_buf in a loop. It would crash until I turned logging on, then it would stop crashing. Then I realized something: I had a logging statement in between receiving EOF and closing the pipe:

loop {
  let n = stderr.read_buf(&mut buf).await?;
  warn!(...);
  if n == 0 {
    drop(stderr); // (implicit)
    return Ok(buf);
  }
}

The logging probably added a delay that made it so close() would occur after the process had exited, mimicking the working version where we explicitly drop the pipes at the end. Sure enough, when I rearrange the log to happen after the if (thus nothing happens between EOF and close()) then I can reproduce the crash.

@ipetkov
Copy link
Member

ipetkov commented Dec 14, 2021

Thanks for the analysis @iitalics! Now I'm really curious about the epoll and kernel interactions around closing descriptors and other processes, but at least we can merge #4315 with the confidence that it should mitigate this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

3 participants