On server stability and isolation #2631

darksylinc · 2023-10-26T15:04:22Z

Hi!

I'm playing with Rocket while learning Rust on the go.

I'm evaluating the possibility of migrating our current backend to a Rust-based server in the long term.

The first thing I noticed is that when a single site crashes, the whole server goes down. If I trigger a Rust panic, the server correctly returns a 500 error.

More specifically after I tried this:

// Try visiting:
//   http://127.0.0.1:8000/hello/world
#[get("/world")]
fn world() -> &'static str {
    unsafe { std::ptr::null_mut::<i32>().write(42) };
    "Hello, world!"
}

Visiting http://127.0.0.1:8000/hello/world resulted in the whole server going down.

Of course you may be thinking "you're using Rust wrong, you shouldn't use unsafe code".

Well, yes. But I'm trying to understand the code's architecture. It's likely that I will have to call external libraries while I'm migrating, which could crash (ideally it should not of course, but I am trying to cover all my bases).

Normally, this sort of problem would be handled by isolating the server from the page-processing code, so that if the page processor crashes, it only affects that connection (or a bunch if pooled) but not the entire server.

Thus my questions are:

1. Does Rocket provide any way to isolate the server from page-processing?
   - If so, are there any examples?
   - If that requires launching another Rust app, that's fine (as long as resource usage and the latency to spawn that process are low). I can use that strategy only for unsafe code.
2. Is there something on the roadmap for this? Or is this beyond Rocket's scope?
3. Is it possible to crash the Rocket server using safe code only? (ignoring HW errors / faulty HW and out of memory)

Thanks.

Answered by SergioBenitez

SergioBenitez · 2023-10-26T17:06:15Z

Maintainer

> Of course you may be thinking "you're using Rust wrong, you shouldn't use unsafe code".

There's nothing wrong with using unsafe code, but incorrect unsafe code, like the code you've written here, is likely to trigger undefined behavior. This isn't just a Rust issue: the exact same code translated to C would also yield undefined behavior. By definition, this makes it impossible to write a server that can fault-isolate this behavior, at least in a single memory security domain (i.e., a process). Even with software fault isolation techniques with multiple security domains, you cannot precisely isolate the issues you're suggesting, at least not without incurring very significant performance overhead.

> Normally, this sort of problem would be handled by isolating the server from the page-processing code, so that if the page processor crashes, it only affects that connection (or a bunch if pooled) but not the entire server.

No, this sort of problem is not handled in the way you suggest in practice. Instead, it's considered a serious mistake and fixed at the root. Which brings me to...

> I'm evaluating the possibility of replacing / migrating our current backend to a rust-based server code on the long term.
>
> The first thing I noticed is that when a single site crashes, the whole server goes down.

What is your current backend (language and framework), and how does it gracefully handle the undefined behavior you're exhibiting in your issue? As I understand it, no language or framework exists, outside of research and academia with large performance overhead, that can truly fault isolate the example you've provided. Starting a new process to handle each request, and ensuring that no state from the new process makes its way back to the main application, would mean that your entire server likely doesn't crash under such a condition, but it would make that handler impossible to execute in general. And "no state making its way back to the main application" is likely unachievable for any real application.

> Does Rocket provide any way to isolate server from page-processing?

I assume you mean isolating route handling since the concept of "page-processing" isn't present in Rocket. Every route execution is already isolated from every other route's execution to the extent that the language allows.

> Is there something on the roadmap for this? or is this beyond Rocket's scope?

You appear to be asking a project written in Rust to gracefully handle code that exhibits undefined behavior and violates the core principles of Rust. This is unlikely to be within the scope of any project written in Rust, or in any memory-safe (or memory-unsafe) language for that matter.

> Is it possible to crash the Rocket server using safe code only? (ignoring HW errors / faulty HW and out of memory)

Using the usual definitions, the answer to your question is no: we consider any (well-defined) mechanism that can make your server crash a bug. Based on your issue, however, you seem to include undefined mechanisms in your domain of "things that could make your program crash." In which case, yes: there are bugs in the Rust compiler. If you purposefully exploit them, you can violate the language's semantics and raise a memory exception, all without using unsafe.


darksylinc · 2023-10-26T17:59:15Z

Author

Hi!

Thanks for answering so quickly!

> There's nothing wrong with using unsafe code, but incorrect unsafe code, like the code you've written here, is likely to trigger undefined behavior. This isn't just a Rust issue: the exact same code translated to C would also yield undefined behavior. By definition, this makes it impossible to write a server that can fault-isolate this behavior, at least in a single memory security domain (i.e., a process). Even with software fault isolation techniques with multiple security domains, you cannot precisely isolate the issues you're suggesting, at least not without incurring very significant performance overhead.

Well, that's strictly true, especially if an attacker is purposely trying to exploit a bug.

However, what I am more concerned about is the most common case, in which something going wrong ends in a crash without lasting side effects, and how to recover from it while minimizing damage* to other clients currently connected.

* By damage I mean the other users don't start getting connection errors because some random person accidentally triggered a server crash.

> What is your current backend (language and framework), and how does it gracefully handle the undefined behavior you're exhibiting in your issue?

I didn't want to mention it, because it's laughably bad. The current backend is mostly PHP + Apache + MySQL.

The biggest sources of trouble, I suspect, would be:

- Use of libcurl (or whatever replaces it, hopefully a Rust equivalent). Our server-side use of libcurl isn't much, but it is a library written in C and, although very stable, could lead to unknowns.
- Dealing with MySQL connections (I haven't researched this yet. Hopefully a pure-Rust implementation would avoid having to worry about unsafe code)
- Dealing with certain cryptographic code (PHP mostly delegates this to either libsodium or OpenSSL, a big source of headaches)

> Starting a new process to handle each request, and ensuring that no state from the new process makes its way back to the main application, would mean that your entire server likely doesn't crash under such a condition, but it would make that handler impossible to execute in general. And "no state making its way back to the main application" is likely unachievable for any real application.

I'm thinking that if worst comes to worst, I can make a pool of processes to handle whatever is dangerous (if anything) and use IPC to talk back and forth. Basically, what I was asking is whether Rocket handles this automatically, and the answer seems to be no (which is what I wanted to know).

That's pretty much what Firefox and Chrome do to isolate web pages from the main system (i.e. a page crashing doesn't bring the entire browser down).
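A minimal sketch of that idea, using only the standard library: the risky work runs in a child process, and the parent treats an abnormal exit as an ordinary error. The `run_isolated` helper is hypothetical, and `sh` merely stands in for a real worker binary (so this assumes a Unix-like system). It also spawns one process per job rather than keeping a pool, and it does not stop a misbehaving worker from returning bad data.

```rust
use std::process::Command;

/// Runs a risky job in a separate OS process and reports how it ended.
/// A crash in the child (for example a segfault) cannot take the parent
/// down; the parent only observes an unsuccessful exit status.
fn run_isolated(cmd: &mut Command) -> Result<String, String> {
    let output = cmd
        .output()
        .map_err(|e| format!("failed to start worker: {e}"))?;

    if output.status.success() {
        Ok(String::from_utf8_lossy(&output.stdout).into_owned())
    } else {
        match output.status.code() {
            Some(code) => Err(format!("worker exited with code {code}")),
            // On Unix, `code()` is `None` when a signal killed the process.
            None => Err("worker was terminated by a signal".to_string()),
        }
    }
}

fn main() {
    // Hypothetical worker that succeeds and prints its result to stdout.
    let ok = run_isolated(Command::new("sh").args(["-c", "echo done"]));
    println!("{ok:?}");

    // Simulate a crashing worker: the shell sends itself SIGSEGV.
    let crashed = run_isolated(Command::new("sh").args(["-c", "kill -s SEGV $$"]));
    println!("{crashed:?}");
}
```

A route handler could call something like this and turn the `Err` into an error response, so only the request that triggered the crash is affected.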

Another simple solution would be to run two (or more) instances of Rocket on different ports (we are not using an HTTPS server for regular browsers; it mostly talks to a custom client).

> Using the usual definitions, the answer to your question is no: we consider any (well-defined) mechanism that can make your server crash a bug. Based on your issue, however, you seem to include undefined mechanisms in your domain of "things that could make your program crash." In which case, yes: there are bugs in the Rust compiler. If you purposefully exploit them, you can violate the language's semantics and raise a memory exception, all without using unsafe.

Thanks, that basically answers the question. Sounds reasonable.

1 reply

the10thWiz Nov 8, 2023
Maintainer

To your three sources:

- Pure Rust HTTP library: reqwest
- Pure Rust MySQL: take a look at rocket_sync_db_pools and rocket_db_pools (at least on the pre-releases) for a variety of pure Rust options. I prefer sqlx, but there are several other options that integrate well with Rocket.
- For most cryptographic things, you can choose from a variety of pure Rust options, but they tend to be much smaller than OpenSSL. Depending on your use case you might need to include multiple libraries, but there is a wealth of verified and tested cryptographic libraries available in the Rust ecosystem.

Rust (and Rocket) do a remarkably good job of preventing server crashes due to coding mistakes. Given the three points above, it should be relatively easy to avoid using any unsafe in your code, which gives you the Rust compiler's strongest guarantees. A similar (but safe) example to the one you originally provided, where you crash via a panic, is actually handled (and quite gracefully) by Rocket. I've never run into a compiler bug myself, especially not on a stable branch, and much work is put into the compiler to ensure it enforces and upholds the guarantee that your code will not crash except via a panic.
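To illustrate why a panic is recoverable while the null-pointer write is not, here is a rough sketch using `std::panic::catch_unwind`. This is not Rocket's actual implementation; `run_handler` and the `(status, body)` tuple are made up for the example, and it assumes the default `panic = "unwind"` setting.

```rust
use std::panic::{self, UnwindSafe};

/// Runs a handler and maps a panic to a 500 response instead of
/// letting it unwind further. Returns (status code, body).
fn run_handler<F>(handler: F) -> (u16, String)
where
    F: FnOnce() -> String + UnwindSafe,
{
    match panic::catch_unwind(handler) {
        Ok(body) => (200, body),
        // The panic payload is ignored here; a real server would log it.
        Err(_) => (500, "Internal Server Error".to_string()),
    }
}

fn main() {
    // A well-behaved handler.
    let ok = run_handler(|| "Hello, world!".to_string());
    println!("{ok:?}");

    // A handler with a bug that panics; the caller keeps running.
    let failed = run_handler(|| -> String { panic!("handler bug") });
    println!("{failed:?}");
}
```

A panic is a well-defined unwinding mechanism, so it can be caught at a boundary like this. Undefined behavior such as writing through a null pointer gives the runtime nothing to catch, which is why it takes the whole process down.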
