Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap libraries in linker groups, allowing backwards/circular references #85805

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
91 changes: 22 additions & 69 deletions compiler/rustc_codegen_ssa/src/back/link.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1810,11 +1810,18 @@ fn linker_with_args<'a, B: ArchiveBuilder<'a>>(
add_sanitizer_libraries(sess, crate_type, cmd);

// Object code from the current crate.
// Take careful note of the ordering of the arguments we pass to the linker
// here. Linkers will assume that things on the left depend on things to the
// right. Things on the right cannot depend on things on the left. This is
// all formally implemented in terms of resolving symbols (libs on the right
// resolve unknown symbols of libs on the left, but not vice versa).
//
// Some native libraries can have circular dependencies, as do some internal
// crates in the standard library (notably libcore and libstd via "weak lang
// items" like `rust_begin_unwind`). And even when they don't, we don't want
// to force crates to provide native libraries in dependency order. So, we
// provide all libraries within a linker group. This ensures that symbols in
// all libraries resolve correctly regardless of library order.
//
// However, we still take care to pass arguments to the linker in a
// topologically sorted order, because that preserves expected symbol
// resolution ordering between libraries even if multiple libraries may
// provide the same symbol.
//
// For this reason, we have organized the arguments we pass to the linker as
// such:
Expand Down Expand Up @@ -1856,18 +1863,24 @@ fn linker_with_args<'a, B: ArchiveBuilder<'a>>(
// This change is somewhat breaking in practice due to local static libraries being linked
// as whole-archive (#85144), so removing whole-archive may be a pre-requisite.
if sess.opts.debugging_opts.link_native_libraries {
cmd.group_start();
add_local_native_libraries(cmd, sess, codegen_results);
cmd.group_end();
}

// Rust libraries.
cmd.group_start();
add_upstream_rust_crates::<B>(cmd, sess, codegen_results, crate_type, tmpdir);
cmd.group_end();

// Native libraries linked with `#[link]` attributes at and `-l` command line options.
// If -Zlink-native-libraries=false is set, then the assumption is that an
// external build system already has the native dependencies defined, and it
// will provide them to the linker itself.
if sess.opts.debugging_opts.link_native_libraries {
cmd.group_start();
add_upstream_native_libraries(cmd, sess, codegen_results, crate_type);
cmd.group_end();
}

// Library linking above uses some global state for things like `-Bstatic`/`-Bdynamic` to make
Expand Down Expand Up @@ -2114,64 +2127,9 @@ fn add_upstream_rust_crates<'a, B: ArchiveBuilder<'a>>(
// crates.
let deps = &codegen_results.crate_info.used_crates_dynamic;

// There's a few internal crates in the standard library (aka libcore and
// libstd) which actually have a circular dependence upon one another. This
// currently arises through "weak lang items" where libcore requires things
// like `rust_begin_unwind` but libstd ends up defining it. To get this
// circular dependence to work correctly in all situations we'll need to be
// sure to correctly apply the `--start-group` and `--end-group` options to
// GNU linkers, otherwise if we don't use any other symbol from the standard
// library it'll get discarded and the whole application won't link.
//
// In this loop we're calculating the `group_end`, after which crate to
// pass `--end-group` and `group_start`, before which crate to pass
// `--start-group`. We currently do this by passing `--end-group` after
// the first crate (when iterating backwards) that requires a lang item
// defined somewhere else. Once that's set then when we've defined all the
// necessary lang items we'll pass `--start-group`.
//
// Note that this isn't amazing logic for now but it should do the trick
// for the current implementation of the standard library.
let mut group_end = None;
let mut group_start = None;
// Crates available for linking thus far.
let mut available = FxHashSet::default();
// Crates required to satisfy dependencies discovered so far.
let mut required = FxHashSet::default();

let info = &codegen_results.crate_info;
for &(cnum, _) in deps.iter().rev() {
if let Some(missing) = info.missing_lang_items.get(&cnum) {
let missing_crates = missing.iter().map(|i| info.lang_item_to_crate.get(i).copied());
required.extend(missing_crates);
}

required.insert(Some(cnum));
available.insert(Some(cnum));

if required.len() > available.len() && group_end.is_none() {
group_end = Some(cnum);
}
if required.len() == available.len() && group_end.is_some() {
group_start = Some(cnum);
break;
}
}

// If we didn't end up filling in all lang items from upstream crates then
// we'll be filling it in with our crate. This probably means we're the
// standard library itself, so skip this for now.
if group_end.is_some() && group_start.is_none() {
group_end = None;
}

let mut compiler_builtins = None;

for &(cnum, _) in deps.iter() {
if group_start == Some(cnum) {
cmd.group_start();
}

// We may not pass all crates through to the linker. Some crates may
// appear statically in an existing dylib, meaning we'll pick up all the
// symbols from the dylib.
Expand All @@ -2192,10 +2150,6 @@ fn add_upstream_rust_crates<'a, B: ArchiveBuilder<'a>>(
}
Linkage::Dynamic => add_dynamic_crate(cmd, sess, &src.dylib.as_ref().unwrap().0),
}

if group_end == Some(cnum) {
cmd.group_end();
}
}

// compiler-builtins are always placed last to ensure that they're
Expand Down Expand Up @@ -2363,11 +2317,10 @@ fn add_upstream_native_libraries(
codegen_results: &CodegenResults,
crate_type: CrateType,
) {
// Be sure to use a topological sorting of crates because there may be
// interdependencies between native libraries. When passing -nodefaultlibs,
// for example, almost all native libraries depend on libc, so we have to
// make sure that's all the way at the right (liblibc is near the base of
// the dependency chain).
// We put all static libraries in a linker group so that later libraries may
// depend on earlier ones, but we still use a topological sorting of crates
// because that preserves expected symbol resolution ordering between
// libraries even if multiple libraries may provide the same symbol.
//
// This passes RequireStatic, but the actual requirement doesn't matter,
// we're just getting an ordering of crate numbers, we're not worried about
Expand Down
16 changes: 0 additions & 16 deletions compiler/rustc_codegen_ssa/src/base.rs
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,6 @@ use rustc_index::vec::Idx;
use rustc_middle::middle::codegen_fn_attrs::CodegenFnAttrs;
use rustc_middle::middle::cstore::EncodedMetadata;
use rustc_middle::middle::cstore::{self, LinkagePreference};
use rustc_middle::middle::lang_items;
use rustc_middle::mir::mono::{CodegenUnit, CodegenUnitNameBuilder, MonoItem};
use rustc_middle::ty::layout::{HasTyCtxt, TyAndLayout};
use rustc_middle::ty::layout::{FAT_PTR_ADDR, FAT_PTR_EXTRA};
Expand Down Expand Up @@ -782,20 +781,16 @@ impl CrateInfo {
used_crates_dynamic: cstore::used_crates(tcx, LinkagePreference::RequireDynamic),
used_crates_static: cstore::used_crates(tcx, LinkagePreference::RequireStatic),
used_crate_source: Default::default(),
lang_item_to_crate: Default::default(),
missing_lang_items: Default::default(),
dependency_formats: tcx.dependency_formats(()),
windows_subsystem,
};
let lang_items = tcx.lang_items();

let crates = tcx.crates();

let n_crates = crates.len();
info.native_libraries.reserve(n_crates);
info.crate_name.reserve(n_crates);
info.used_crate_source.reserve(n_crates);
info.missing_lang_items.reserve(n_crates);

for &cnum in crates.iter() {
info.native_libraries
Expand All @@ -814,17 +809,6 @@ impl CrateInfo {
if tcx.is_no_builtins(cnum) {
info.is_no_builtins.insert(cnum);
}
let missing = tcx.missing_lang_items(cnum);
for &item in missing.iter() {
if let Ok(id) = lang_items.require(item) {
info.lang_item_to_crate.insert(item, id.krate);
}
}

// No need to look for lang items that don't actually need to exist.
let missing =
missing.iter().cloned().filter(|&l| lang_items::required(tcx, l)).collect();
info.missing_lang_items.insert(cnum, missing);
}

info
Expand Down
3 changes: 0 additions & 3 deletions compiler/rustc_codegen_ssa/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@ use rustc_ast as ast;
use rustc_data_structures::fx::{FxHashMap, FxHashSet};
use rustc_data_structures::sync::Lrc;
use rustc_hir::def_id::CrateNum;
use rustc_hir::LangItem;
use rustc_middle::dep_graph::WorkProduct;
use rustc_middle::middle::cstore::{self, CrateSource, LibSource};
use rustc_middle::middle::dependency_format::Dependencies;
Expand Down Expand Up @@ -146,8 +145,6 @@ pub struct CrateInfo {
pub used_crate_source: FxHashMap<CrateNum, Lrc<CrateSource>>,
pub used_crates_static: Vec<(CrateNum, LibSource)>,
pub used_crates_dynamic: Vec<(CrateNum, LibSource)>,
pub lang_item_to_crate: FxHashMap<LangItem, CrateNum>,
pub missing_lang_items: FxHashMap<CrateNum, Vec<LangItem>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💖

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be interested in getting rid of this without wrapping all of Rust crates into a group.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only way to do that would be to remove the concept of weak lang items. (lang items that don't have to be defined yet when using them) We can't do this however as #[panic_handler] is a weak lang item.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only way to do that would be to remove the concept of weak lang items.

I meant something like emitting -u my_weak_item for each such item to keep it around.
In other words, specifying the back-dependencies explicitly instead of "giving up" and resorting to groups.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I didn't know that was possible. Seems that ld has --require-defined= which is like -u except that the symbol has to be defined in one of the input objects and can't be left undefined in the output object.

Copy link
Member

@bjorn3 bjorn3 Jul 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only ld.bfd has --require-defined. ld.gold only has -u just like some other Unix linkers it seems. macOS's ld seems to have -u be equivalent to --require-defined and instead uses -U where other linkers use -u. link.exe and lld don't need linker groups.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have symbols.o (#95604) it may be an even better way to do stuff like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion is now implemented in #100404.

pub dependency_formats: Lrc<Dependencies>,
pub windows_subsystem: Option<String>,
}
Expand Down