-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mix feature flags into fingerprint/metadata shorthash #3102
Conversation
Thanks for the PR! Wanted to let you know that I hadn't forgotten about this and I've just been thinking about it recently. It looks like a test is currently failing but I think it's just because it's expecting a rebuild when one isn't actually happening, so not a hard failure. That to me though is proof enough that this works :) Could you add a test though which actually runs the output to ensure everything gets properly linked? (just to make sure we don't accidentally use stale artifacts). I think that this may start inflating the size of build directories over time a bit, but otherwise this seems pretty reasonable to me. I admit that I've run into this problem multiple times before as well and it seems nice to solve it once and for all. |
I also wanted to add a couple of tests for this case as well, so I'll take care of that and holler back when updated. |
Ok. I'm not confident this is quite doing the right thing for binaries. See my comments on the diff. Looking for some assistance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm misunderstanding something about the tests and the suffix system (and perhaps the compilation process).
@@ -358,9 +366,18 @@ impl<'a, 'cfg> Context<'a, 'cfg> { | |||
}) | |||
} else { | |||
None | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left this code commented out to highlight a point.
I believe that when building a path dependency, this function returns None in order to not "pollute the suffix" of the executable or the library.
However, this means that when building across features, it appears to build to the same executable/library name.
It's correctly separating the .fingerprint, but not the actual built executable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm yes I can see how this would indeed be a problem! For executables we could generate ones with long filenames and them link them into a location without the metadata, but that does weird things with dynamic libraries unfortunately, so we can't do that everywhere.
Short of building everything in an entirely new directory, we may have to just stomach recompiling these top level dependencies whenever various flags change.
dir.join(format!("feat-{}", feat_vec.join("-"))) | ||
} else { | ||
dir | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correctly separating out the fingerprints, but not the executables.
The executables/libraries are correctly separated as long as they are a full dep, but when they are a path dep, I think this is not working right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is related to the commented out piece above. One option we could do is to build everything into target/debug/$hash/...
where $hash
represents feature flags etc. When a build finishes we'd then link everything into target/debug/...
. That'd allow multiple top-level executables with the same filename to exist in various locations in this cache at least.
[RUNNING] `[..]target/debug/b` | ||
")); | ||
|
||
/* Build a/ package again. If we cache different feature dep builds correctly, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm so this test is passing?
If so, I think that may be due to how Cargo tracks where if one target is rebuilt it will unconditionally rebuild all transitive dependants. You can perhaps try to expose this through:
cd a
cargo build -p dep_crate
cargo run
That'll force Cargo to try to understand that the dependency was rebuilt, maybe? (not sure if that'd fail)
Didn't forget about this! Just got a bit busy. On Fri, Oct 7, 2016 at 3:29 PM Alex Crichton [email protected]
|
Yeah if we do take this route I think we'll want to hash in those options as well. I actually just had a chat with @brson as well and we're feeling like we may as well go whole hog on this and hash everything to ensure that bins are distinct. For now that basically means that we'd hash basically an entire |
I went through and completed this PR.
Stuff I did not do, but could add in
Stuff I would appreciate some additional scrutiny
|
.chain_error(|| { | ||
human(format!("failed to link or copy `{}` to `{}`", | ||
src.display(), dst.display())) | ||
})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very interesting. This code is non-deterministically failing due to parallelism + the non-atomicity of the remove_file, hard_link, and fs_copy on the test which is compiling two versions of a crate (use_a_spec_to_select).
Here is the ordering that causes a failure.
thread 1) foo-abcdef -> foo
thread 2) foo-123456 -> foo
- Checks if foo exists [no]
- Checks if foo exists [no]
- hard links foo(inode x) -> foo-abcdef(inode x)
- Attempts to hard link foo -> foo-123456 (fails due to foo existing)
- Attempts to copy foo-123456 -> foo (succeeds, overwriting inode x and both versions of foo).
Now foo, foo-123456, and foo-abcdef all have the same contents.
We need to somehow prevent either
- Multiple versions of the same crate being compiled concurrently
- Multiple versions attempting hard link concurrently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps linking within the deps/ directory is not necessary and we could try to eliminate this behavior of having a deps/libfoo.rlib. I will try that out.
@@ -335,8 +365,7 @@ fn rustc(cx: &mut Context, unit: &Unit) -> CargoResult<Work> { | |||
if !src_dir.ends_with("deps") { | |||
continue | |||
} | |||
let dst = src_dir.parent().unwrap() | |||
.join(src.file_name().unwrap()); | |||
let dst = src_dir.parent().unwrap().join(bin_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might this code below (hard link when doing move_outputs_up) also be susceptible to the same bug? Perhaps not today, since we should only be moving up a top level package, but it seems like a fragility that might cause us to inadvertently add a bug in the future.
One possibility (striving for simplicity over performance) is to only copy and never link.
Ok. I sucked it up and got my windows VM up and running with a developer environment. I narrowed the test issue down to this. On windows, when producing a dylib with rustc, it produces four files in
My code modifies this to have
When setting as a dep for foo, it tries to link against builder.dll.lib (as seen here https://ci.appveyor.com/project/rust-lang-libs/cargo/build/1.0.44/job/8q6mclw54x5snn9j) Thus, it fails to link since the target/debug/ directory only has the .dll and not the .dll.lib Some possibilities, which I don't have enough context to select between =\
|
Perhaps we could just base the whole fingerprint on the target fingerprint? E.g. include the whole hash here.
Yeah I think this'll give us the most bang for our buck, if it's easy to do let's go ahead and do it.
Nah the strategy you outlined above seems fine to me. I'm a little worried about dynamic libraries but I'll read over this and see what's what.
Oh geez that's a bummer. I guess this confirms my dylib worries above as well! So OSX is also weird here where the dylib itself encodes the name it was compiled with. That way if you rename the dynamic library and link to the renamed copy, that binary won't work at runtime because it'll try to find the old name. In that sense it's probably easiest to just not suffix dylibs with anything and completely recompile them more often (due to these issues). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, looking good! I've left some comments here or there, but I think I'll need to really sit down to digest this soon (currently sitting in an airport).
If this is getting too complicated I also don't mind taking this over!
@@ -309,49 +309,46 @@ impl<'a, 'cfg> Context<'a, 'cfg> { | |||
|
|||
/// Get the metadata for a target in a specific profile | |||
pub fn target_metadata(&self, unit: &Unit) -> Option<Metadata> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function should always return Some
now, right? It may be useful to just return Metadata
here and then deal with the consequences elsewhere as they arise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
target_metadata returns Option. The only way to get None is if target is a dylib or perhaps if unit.target.metadata() returns None. I did not try to track down the latter case.
I think a lot of this code will simplify a lot if metadata was universally applicable. For example, a lot of the special casing of "unit.profile.test" would go away since that is part of the metadata hash.
unit.target.crate_name() | ||
}; | ||
if unit.profile.test { | ||
bin + "-test" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm I'm trying to recall why something like this would be required but I'm drawing a blank. Could you remind me why this is needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. This was needed to separate test binaries from true binaries. Previously they were separated by metadata hash. The move-targets-up code was linking them on top of each other in this code. One option was to replicate the -{metadata} behavior, but I thought it simpler to just do "-test" instead now that there was a clear way of separating them. I can maintain the existing behavior and separate out this behavioral change into another request.
@@ -82,8 +83,13 @@ pub fn prepare_target<'a, 'cfg>(cx: &mut Context<'a, 'cfg>, | |||
missing_outputs = !root.join(unit.target.crate_name()) | |||
.join("index.html").exists(); | |||
} else { | |||
for (filename, _) in try!(cx.target_filenames(unit)) { | |||
missing_outputs |= fs::metadata(root.join(filename)).is_err(); | |||
let move_outputs_up = unit.pkg.package_id() == &cx.current_package || unit.target.is_bin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it may want to be extracted to a method on Context
?
@@ -1790,13 +1790,15 @@ fn example_bin_same_name() { | |||
.unwrap(); | |||
|
|||
assert_that(&p.bin("foo"), is_not(existing_file())); | |||
assert_that(&p.bin("foo-test"), existing_file()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm ok so this may be where the -test
is showing up.
This looks like a somewhat orthogonal change, however (having predictably named test artifacts). Perhaps we could leave that to a future PR?
Sure I can take another look at it. Comments seem to make sense. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a quick self-review. Let me know what you think.
// No metadata for dylibs because of a couple issues | ||
// - OSX encodes the dylib name in the executable | ||
// - Windows rustc multiple files of which we can't easily link all of them | ||
if !unit.profile.test && unit.target.is_dylib() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a finicky line. I would have thought dylib alone would be sufficient, but then "cargo build" and "cargo test" build to the same target which is not acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah there's tons of special cases here for unit.profile.test
, that's expected and ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah so one thing I just remembered though is that we don't want to unconditionally return None
here in the case of an upstream dylib. That is, if a transitive dependency is a dylib, we'll want metadata mixed in there to resolve name conflicts.
Additionally, the __CARGO_DEFAULT_LIB_METADATA
below I think still needs to be factored in here as well. That's used when building the standard library itself to ensure that we throw a hash in there even though it's a dylib and even though it's a local path dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the conditions we want? We want to check if the target is the current_package AND is a dylib?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to handle the upstream dylib case
I think the conditions we're going to want here are:
if !unit.profile.test &&
unit.target.is_dylib() &&
unit.pkg.package_id().source_id().is_path() &&
magical_env_var_not_set() {
return None
}
That is, we want metadata on all tests, we want metadata on everything that's not a dylib, we want metdata on all upstream deps (non-path deps), and finally we have this magical env var thing.
Note that "upstream" == "not a path dep" because the current package changes when you move around a workspace. A path dep is basically a heuristic, but it works fine for us.
So for the magical env var, the problem is that if we didn't have that clause in the condition then when we cargo build
the standard library we get a bunch of files that are literally libstd.so
, librustc.so
, etc. Now we don't want this because it means when you install Rust on your system you'd get /usr/lib/libstd.so
. That's a little too "official" for us to take! To solve this problem we've always added some form of hash to the libraries, so you'd instead install /usr/lib/libstd-xxxxxxxx.so
.
So basically when we use Cargo to build the standard library we want to unconditionally insert all these hashes, so the env var just says "please insert hashes everywhere".
Now if you're curious why we don't always install the hash, that's also interesting. The main idea is that if I have a project called "foo" and I built a dylib, we want to generate the predictable filename target/debug/libfoo.so
. (no hash). Now for this to actually work correctly, however, we can't take the same trick we're doing with staticlibs and executables where we just hard link the file to the correct name. Platforms like OSX use the generate file name (target/debug/deps/libfoo-xxxxxx.so
) at runtime for various reasons, so we need to generate the file as literally target/debug/deps/libfoo.so
for now. We can fix this with linker options but rustc doesn't expose that yet (and is a problem for another day!)
Does that all make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good. Based on that explanation, seems like we should also have a test for this env-var situation. I added a test for this.
@@ -309,49 +309,46 @@ impl<'a, 'cfg> Context<'a, 'cfg> { | |||
|
|||
/// Get the metadata for a target in a specific profile | |||
pub fn target_metadata(&self, unit: &Unit) -> Option<Metadata> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
target_metadata returns Option. The only way to get None is if target is a dylib or perhaps if unit.target.metadata() returns None. I did not try to track down the latter case.
I think a lot of this code will simplify a lot if metadata was universally applicable. For example, a lot of the special casing of "unit.profile.test" would go away since that is part of the metadata hash.
let mut metadata = unit.pkg.generate_metadata(); | ||
metadata.mix(&format!("bin-{}", unit.target.name())); | ||
Some(metadata) | ||
pkg_metadata.mix(&format!("bin-{}", unit.target.name())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The choice between pkg_metadata and metadata seems quite arbitrary to me and I'm not sure why there are two different constructs. This diff maintains the existing behavior, but I can't figure out why it is the way it is.
// just here for rustbuild. We need a more principled method of | ||
// doing this eventually. | ||
if unit.target.is_lib() { | ||
env::var("__CARGO_DEFAULT_LIB_METADATA").ok().map(|meta| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this env var is no longer necessary since we're always producing metadata on this path now. Where is it used from?
// it was compiled into something like `example/` or `doc/` then | ||
// we don't want to link it up. | ||
if src_dir.ends_with("deps") { | ||
// Don't lift up library dependencies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that this if/else is untested (if we hit the else every time, all the tests pass). Should we add a test for this condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that if all tests pass this is fine to just have the else
part. I think this came from way down below in cargo_rustc/mod.rs
but I can't quite recall exactly what this is doing. Although the doc block on this function may help me refresh my memory :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this if else prevents us from linking up dependent libs.
Eg if crate foo depends on something else (say petgraph)
This if else prevents us from linking target/debug/deps/petgraph-123456.rlib
to target/debug/petgraph.rlib
. There's no point in linking up deps, though it doesn't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok, so the test would be asserting that target/debug/libpetgraph.rlib
doesn't exist. In that sense this behavior definitely looks correct to me, and if you want to add a test for that, sounds good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a couple lines to an existing test. No prob!
@@ -429,6 +476,7 @@ impl<'a, 'cfg> Context<'a, 'cfg> { | |||
support any of the output crate types", | |||
unit.pkg, self.target_triple()); | |||
} | |||
info!("Target filenames: {:?}", ret); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found these info's to be helpful. Let me know if you'd rather leave it in or dump them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -1790,13 +1790,15 @@ fn example_bin_same_name() { | |||
.unwrap(); | |||
|
|||
assert_that(&p.bin("foo"), is_not(existing_file())); | |||
// We expect a file of the form bin/foo-{metadata_hash} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, it's hard for us to confirm that such a file exists from our tests. We just confirm that the original "foo" doesn't exist. This was the original motivation for the "foo-test" change, but I left this comment here instead.
We could also provide a way to programatically report the metadata or the test binary name to the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah eventually we'll definitely publish the names of these files via --message-format=json
(ala #3212)
Hey @alexcrichton. I noticed that the appveyor is getting backed up for hours. Enabling this feature may help reduce the size of the queue when something is under rapid development. Alternately, could upgrade the appveyor account for additional parallelism, but honestly, rolling builds will probably be good enough. https://www.appveyor.com/docs/build-configuration/#rolling-builds |
})); | ||
} | ||
try!(fs::hard_link(&src, &dst) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to pop this linking step out into a separate task? I'm worried that the task won't re-run under all conditions. For example, if there are two binaries built with different features, it would compile to
deps/bin-abc123
deps/bin-456789
If you then switched back to using the first binary, the fingerprint would indicate that you're all set. Since the linking is happening as part of the rustc stage, there's no way to re-link without recompiling. Any suggestion on how to get around that?
For now, I will put together a test that fails this condition and see if I can get some code to trigger a recompile of the top level package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, a very good point! Definitely something we'll want to handle. This makes me vaguely wonder how this works today...
IIRC, each unit of work always has something run. There's a unit of work to run if it's dirty and a different one if it's "fresh" (doesn't need to be rerun). The thinking is that "fresh" work is way faster than dirty work.
So I think we can shove this into also happening during the fresh stages to cover this use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works today because the fingerprint filename is the same for both deps. Thus if you change the features and try to recompile, it will trigger as dirty. I changed it to have different fingerprint filename for each of these variants so when switching back and forth we keep it working right.
☔ The latest upstream changes (presumably #3257) made this pull request unmergeable. Please resolve the merge conflicts. |
Since building dependencies results in different libraries depending on the feature flags, I added the feature flags into the short_hash. This solves an issue when multiple crates share a target directory or multiple targets share a common library with divergent feature flag choice. - Only link binaries, build scripts, and top level deps - Handle dylibs differently (no metadata filename / linking) - Fingerprint based on link_dst rather than file_stem. This (sadly) limits the effects of dep caching to things which don't produce a hard-link. Currently, this is only dependent library crates. Stil a big win.
FWIW this has ended up becoming quite the large diff. It could feasibly be broken into two parts
If you think it is worthwhile, I'm happy to do it, since it will definitely shorten the diffs. Just want to make sure part 1 is in line with your view before I go for it. |
📌 Commit d7977f8 has been approved by |
⌛ Testing commit d7977f8 with merge 1cf3cba... |
💔 Test failed - cargo-win-gnu-32 |
#[test] | ||
#[cfg_attr(target_env = "msvc", ignore)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I guess this needs to be windows
, not just MSVC
That seems like it'll cause you to lose test coverage on windows for some code that's specific to windows. But here you go! |
Ah that's ok, as soon as we ditch buildbot I'll re-enable the test |
@bors: r+ |
📌 Commit 2af9e1f has been approved by |
⌛ Testing commit 2af9e1f with merge e82ad8a... |
💔 Test failed - cargo-win-msvc-32 |
Whoops. Said target_env = windows instead of target_os = windows. Fixing. |
@bors: r+ |
📌 Commit 8e22eca has been approved by |
⌛ Testing commit 8e22eca with merge 1877f59... |
Mix feature flags into fingerprint/metadata shorthash Since building dependencies results in different libraries depending on the feature flags, I added the feature flags into the short_hash. This solves an issue when multiple crates share a target directory or multiple targets share a common library with divergent feature flag choice. I'm not sure if this architecturally the best way to solve this problem, but I did confirm that this fixes the issue I was seeing. I can also add a test for this case if this code is taking the right approach (or if it would help illustrate the issue).
☀️ Test successful - cargo-cross-linux, cargo-linux-32, cargo-linux-64, cargo-mac-32, cargo-mac-64, cargo-win-gnu-32, cargo-win-gnu-64, cargo-win-msvc-32, cargo-win-msvc-64 |
AFAICT, we do not test on these older platforms anymore. Regardless, the test seems to work fine on 32-bit windows-gnu on Windows 10. See rust-lang#3102 (comment) where it was originally disabled.
AFAICT, we do not test on these older platforms anymore. Regardless, the test seems to work fine on 32-bit windows-gnu on Windows 10. See rust-lang#3102 (comment) where it was originally disabled.
AFAICT, we do not test on these older platforms anymore. Regardless, the test seems to work fine on 32-bit windows-gnu on Windows 10. See rust-lang#3102 (comment) where it was originally disabled.
Since building dependencies results in different libraries
depending on the feature flags, I added the feature flags into
the short_hash.
This solves an issue when multiple crates share a target directory
or multiple targets share a common library with divergent feature
flag choice.
I'm not sure if this architecturally the best way to solve this problem, but I did confirm that this fixes the issue I was seeing. I can also add a test for this case if this code is taking the right approach (or if it would help illustrate the issue).