Nightly rust hangs forever #46449
Is that a git bisect result for the commit id, or something else? It's a little unclear from the comments here. If not, I would recommend running git bisect on the nightly build that's failing and reporting the result back. After that, I and others can figure out whether a revert is possible or applies cleanly. If not, a fix should be applied based on what git bisect returns. |
I found this problem recently. It would be the last version before Nov. 1, I guess. |
Actually, you can run git bisect on the git repo here: https://github.com/rust-lang/rust/commits/master, as it's up to date. Just clone it and build it. You don't actually need commit ids; skipping them makes it run a few bisections faster. If you run git bisect bad or good without a commit for both, it bisects against the complete history rather than a range of commits. |
Last few lines of
@xerofoify Generally running
|
Sorry about that, wasn't sure what the protocol for git bisect was here :). |
@Mark-Simulacrum I don't understand the log, is LLVM just slower, or does it end up in some loop? |
It's possible it's just slower; I didn't try looking at what was happening with gdb or anything. The memory usage did seem to keep growing, so it was presumably doing something... |
It will hang forever, but still consumes CPU. ps shows a rustc process:
zonyitoo 14287 98.3 0.0 25306648 1012 s001 R+ 10:36PM 17:20.50 /Users/zonyitoo/.rustup/toolchains/nightly-x86_64-apple-darwin/bin/rustc --crate-name shadowsocks
That is what the log states. We should check whether this also occurs when rust is built with gcc; if it does, then the issue involves both us and the compiler. I am curious, though, what is meant by this commit: #45225? |
The error only happens when the optimization level is ≥ 2. The number of CGUs (ThinLTO) is irrelevant. Most time is spent in SROALegacyPass.
I've enabled some debug logging for the SROA pass, and it looks like there's difficulty working with
|
OK, at least from my knowledge, this is possibly an issue with LLVM's optimizations if it's failing like that. What version of LLVM is being used here? Maybe a newer version from upstream already fixes it? |
I don't know if OSX is different, but I get this instead:
|
LLVM assertions in nightlies are disabled, so normal users won't see this... |
We should do bisections with the nightlies that have the assertions enabled (cc @rust-lang/infra). |
Backtrace for that OOM (maybe 16GB just aren't enough?):
|
|
I'm marking it O-macos for now, as just disabling LTO and adding
Edit: Not an -alt build, just regular nightly. |
So there are multiple bugs here? Have you been trying the |
@eddyb I found one part which is highly relevant to #45225. In the crate there is

```rust
const BUFFER_SIZE: usize = 8 * 1024; // 8K buffer
```

which is used in several structures, e.g.

```rust
pub struct CopyTimeout<R, W>
where
    R: AsyncRead,
    W: AsyncWrite,
{
    r: Option<R>,
    w: Option<W>,
    timeout: Duration,
    amt: u64,
    timer: Option<Timeout>,
    buf: [u8; BUFFER_SIZE], // <----
    pos: usize,
    cap: usize,
}
```

Reducing the
Note that the time spent in stable is constant, but that for beta and nightly is superlinear. I'm still reducing the program but this structure (together with some interactions with Futures) should be one contribution of the cause that hangs LLVM and eats up all the memory. |
Is it used with enums or directly in argument/return types? |
It's possible |
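For illustration, here is a hypothetical state machine of the kind future combinators generate (the names and shape are invented, not taken from the crate): a large array in one variant makes every value of the enum buffer-sized, while boxing it keeps the aggregate that passes like SROA must process down to a few words.

```rust
use std::mem;

const BUFFER_SIZE: usize = 8 * 1024;

// Hypothetical state machine resembling what future combinators generate:
// one variant embeds the 8 KiB buffer inline, so every value of the enum
// is at least buffer-sized (an enum is sized for its largest variant).
enum CopyState {
    Pending,
    Copying { buf: [u8; BUFFER_SIZE], pos: usize, cap: usize },
    Done(u64),
}

// The same machine with the buffer heap-allocated: the enum shrinks to a
// few words, so the aggregate LLVM has to scalarize stays small.
enum CopyStateBoxed {
    Pending,
    Copying { buf: Box<[u8; BUFFER_SIZE]>, pos: usize, cap: usize },
    Done(u64),
}

fn main() {
    println!("inline: {} bytes", mem::size_of::<CopyState>());
    println!("boxed:  {} bytes", mem::size_of::<CopyStateBoxed>());
}
```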
Are you able to give us the output of the assembly itself for what you stated may be causing the problem? It seems that SROA may be happening for arrays or objects. |
EDIT: Disregard; I haven't compared this with the stable branch. It may just be inherent slowness irrelevant to the regression.

#![forbid(warnings)]
extern crate futures;
extern crate tokio_core;
extern crate tokio_io;
mod config {
use std::net::SocketAddr;
pub struct ServerAddr;
impl ServerAddr {
pub fn listen_addr(&self) -> &SocketAddr {
loop {}
}
}
pub struct ServerConfig;
impl ServerConfig {
pub fn addr(&self) -> &ServerAddr {
loop {}
}
}
pub struct Config {
pub server: Option<ServerConfig>,
}
}
pub mod relay {
use std::io;
use config::Config;
use futures::Future;
use tokio_core::reactor::Handle;
pub mod tcprelay {
use std::io;
use relay::{boxed_future, BoxIoFuture};
use futures::{Future};
pub mod server {
use std::io;
use relay::BoxIoFuture;
use relay::Context;
use futures::Future;
use futures::stream::Stream;
use tokio_core::net::{TcpListener};
use super::{tunnel, EncryptedHalfFut};
use super::utils::CopyTimeoutOpt;
struct TcpRelayClientHandshake;
impl TcpRelayClientHandshake {
fn handshake(self) -> BoxIoFuture<TcpRelayClientPending> {
loop {}
}
}
struct TcpRelayClientPending;
impl TcpRelayClientPending {
fn connect(self) -> BoxIoFuture<TcpRelayClientConnected> {
loop {}
}
}
struct TcpRelayClientConnected;
fn copy_timeout_opt() -> CopyTimeoutOpt {
loop {}
}
fn client() -> EncryptedHalfFut {
loop {}
}
impl TcpRelayClientConnected {
fn tunnel(self) -> BoxIoFuture<()> {
tunnel(copy_timeout_opt(), client())
}
}
pub fn run() -> Box<Future<Item = (), Error = io::Error>> {
let mut fut: Option<Box<Future<Item = (), Error = io::Error>>> = None;
Context::with(|ctx| {
let config = ctx.config();
for svr_cfg in &config.server {
let listener = {
let addr = svr_cfg.addr();
let addr = addr.listen_addr();
let listener =
TcpListener::bind(&addr, ctx.handle()).unwrap_or_else(|_| loop {});
listener
};
let listening = listener.incoming().for_each(move |_| {
let client = TcpRelayClientHandshake {};
let fut = client
.handshake()
.and_then(|c| c.connect())
.and_then(|c| c.tunnel())
.map_err(|_| ());
Context::with(|ctx| ctx.handle().spawn(fut));
Ok(())
});
fut = Some(Box::new(listening)
as Box<Future<Item = (), Error = io::Error>>)
}
loop {}
})
}
}
mod utils {
use std::io;
use futures::{Future, Poll};
pub struct CopyTimeoutOpt {
_buf: [u8; 8192],
}
impl Future for CopyTimeoutOpt {
type Item = (u64, (), ());
type Error = io::Error;
fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
loop {}
}
}
}
struct EncryptedHalf;
type EncryptedHalfFut = BoxIoFuture<EncryptedHalf>;
fn tunnel<CF, CFI, SF, SFI>(c2s: CF, s2c: SF) -> BoxIoFuture<()>
where
CF: Future<Item = CFI, Error = io::Error> + 'static,
SF: Future<Item = SFI, Error = io::Error> + 'static,
{
let c2s = c2s.then(move |res| match res {
Ok(..) => Ok(()),
Err(err) => Err(err),
});
let s2c = s2c.then(move |res| match res {
Ok(..) => Ok(()),
Err(err) => Err(err),
});
let fut = c2s.select(s2c)
.and_then(move |_| Ok(()))
.map_err(|(err, _)| err);
boxed_future(fut)
}
}
type BoxIoFuture<T> = Box<Future<Item = T, Error = io::Error>>;
fn boxed_future<T, E, F>(f: F) -> Box<Future<Item = T, Error = E>>
where
F: Future<Item = T, Error = E> + 'static,
{
Box::new(f)
}
thread_local!(static CONTEXT: Context = loop {});
struct Context;
impl Context {
fn with<F, R>(f: F) -> R
where
F: FnOnce(&Context) -> R,
{
CONTEXT.with(f)
}
fn handle(&self) -> &Handle {
loop {}
}
fn config(&self) -> &Config {
loop {}
}
}
}

Expected timing of the code above (replace the "8192" by a number below):
Looks like quadratic or cubic. |
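One quick way to check the "quadratic or cubic" guess from two timing samples (a hypothetical helper with made-up numbers, not from this thread): assuming time ≈ c·n^k, the exponent k falls out of the log-ratio of the timings.

```rust
// Estimate the polynomial exponent k from two (size, time) samples,
// assuming time ≈ c * size^k. Then k = ln(t2/t1) / ln(n2/n1).
fn estimated_exponent((n1, t1): (f64, f64), (n2, t2): (f64, f64)) -> f64 {
    (t2 / t1).ln() / (n2 / n1).ln()
}

fn main() {
    // Made-up example: doubling the buffer size multiplies compile time
    // by ~8, which points at cubic growth (k ≈ 3).
    let k = estimated_exponent((4096.0, 10.0), (8192.0, 80.0));
    println!("estimated exponent ≈ {:.2}", k);
}
```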
cc @rust-lang/compiler Should we untag this as a regression from stable to beta, given that this is a longstanding issue in LLVM? Either way, we should separate the LLVM bug from my code triggering it. |
Add test case for rust-lang/rust#46449
@serge-sans-paille in LLVM IRC came up with this patch https://reviews.llvm.org/D41296, which solves at least the quadratic slowdown caused by SROA trying to generate fresh value names.
EDIT: it makes one testcase go from
EDIT2: the original crate finishes a full rebuild in |
rustc_trans: approximate ABI alignment for padding/union fillers. Before #45225 and after this PR, unions and enums are filled with integers of size and alignment matching their alignment (e.g. `Option<u32>` becomes `[u32; 2]`) instead of mere bytes. Also, the alignment padding between struct fields gets this treatment after this PR. Partially helps with some reduced testcases in #46449, although it doesn't solve the bug itself.
https://reviews.llvm.org/D41296 is still unreviewed. Does someone know who to poke?
EDIT: The layout PR has been taken out of the current beta, for now, and the LLVM patch is now getting reviewed (and generalized beyond SROA). |
If you don't need to debug the value names at the IR level, you can set a flag on the LLVMContext to discard them automatically; it would improve performance considerably. |
Bounce out the layout refactor from beta @eddyb's #45225 was supposed to get into 1.24, but due to an ordering mistake, it had snuck into 1.23. That wide-effect, translation-changing PR had poked LLVM's weak corners and caused many regressions (3 of them have fixes I include here, but also #46897, #46845, #46449, #46371). I don't think it is a good idea to land it in the beta (1.23) because there are bound to be some regressions we didn't patch. Therefore, I am reverting it in time for stable, along with its related regression fixes. r? @michaelwoerister (I think)
triage: P-high I'm bumping this up to P-high -- it affects the upcoming beta. @eddyb, can you give us a status update on the LLVM bug? |
We will want to land the LLVM patch into our fork either way (we haven't yet upgraded our own LLVM to 5.0, IIRC, and this would land against 6.0 upstream). That being said, disabling name generation in LLVM when
Using LLVM context configured to not retain any names brings the compilation time back to a fairly reasonable
compared to
on stable. Not sure if the compilation process is doing an equivalent amount of work between stable and nightly here (I'm sure the majority of the improvement is from codegen-units), but at least the compilation does not take forever on my branch now! Will clean up and submit a PR tomorrow. |
For the record, https://reviews.llvm.org/D41296 got merged recently and should limit the memory consumption of debug builds too. |
Use name-discarding LLVM context This is only applicable when neither --emit=llvm-ir nor --emit=llvm-bc is requested. In case either of these outputs is wanted, but the benefits of such a context are desired as well, the -Zfewer_names option provides the same functionality regardless of the outputs requested. Should be a viable fix for rust-lang#46449
Yeah, I can compile current shadowsocks master just fine. |
Thank you all !!! |
Consider changing assert! to debug_assert! when it calls visit_with The perf run from rust-lang#52956 revealed that there were 3 benchmarks that benefited most from changing `assert!`s to `debug_assert!`s:
- issue rust-lang#46449: avg -4.7% for -check
- deeply-nested (AKA rust-lang#38528): avg -3.4% for -check
- regression rust-lang#31157: avg -3.2% for -check

I analyzed their fixing PRs and decided to look for potentially heavy assertions in the files they modified. I noticed that all of the non-trivial ones contained indirect calls to `visit_with()`. It might be a good idea to consider changing `assert!` to `debug_assert!` in those places in order to get the performance wins shown by the benchmarks.
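The `assert!` vs `debug_assert!` trade-off above can be sketched with a toy example (the function names here are invented for illustration, not rustc's): `debug_assert!` is compiled away when debug assertions are disabled, as they are by default in release builds, so an expensive check guarded by it costs nothing in optimized compilers, whereas `assert!` always executes.

```rust
// Stand-in for an expensive indirect traversal like visit_with(): O(n).
fn expensive_invariant_holds(v: &[u64]) -> bool {
    v.iter().all(|&x| x < u64::MAX)
}

fn sum(v: &[u64]) -> u64 {
    // Runs on every call in debug builds; compiled away when debug
    // assertions are off (unlike assert!, which always runs).
    debug_assert!(expensive_invariant_holds(v));
    v.iter().sum()
}

fn main() {
    let data: Vec<u64> = (1..=4).collect();
    println!("{}", sum(&data));
}
```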
When I tried to use nightly rust to compile my project shadowsocks-rust, it hung forever.
Stable version works fine.
Reproduce steps:
It will hang forever, but still consumes CPU. `ps` shows a `rustc` process.

Meta
cargo 0.24.0-nightly (6529d418d 2017-11-29)