Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix freshness when linking is interrupted. #8087

Merged
merged 3 commits into from
Apr 10, 2020

Conversation

ehuss
Copy link
Contributor

@ehuss ehuss commented Apr 8, 2020

Fixes a scenario where hitting Ctrl-C while linking would leave a corrupted executable, but Cargo would think it is "fresh" and fail to rebuild it.

This also includes a separate commit which adds more documentation on fingerprinting.

Fixes #7767

@rust-highfive
Copy link

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 8, 2020
@ehuss
Copy link
Contributor Author

ehuss commented Apr 8, 2020

Truncating the file seems a little hacky, but it seemed like a reasonable way to invalidate the fingerprint. I don't want to delete it because that would break the compare debug logging in some cases.

@bors
Copy link
Collaborator

bors commented Apr 9, 2020

☔ The latest upstream changes (presumably #8073) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the docs! I've got a question about the scenario this is fixing, but in general looks great to me

// Interrupt the child.
child.kill().unwrap();
// Note: rustc and the linker are still running, let them exit here.
conn.write(b"X").unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it should be sufficient to drop the conn here rather than writing to it, but either way is fine.

// up-to-date until after a successful build.
if loc.exists() {
paths::write(&loc, b"")?;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to make sure I fully understand this. So today we consider step 4 fresh, but it should actually rerun because the linker didn't actually finish successfully. When linking is interrupted the compiler has already emitted the dep-info file, so when we finish the unit and wrap up the fingerprint stuff it looks like it succeeded because the output file exists and the dep-info exists? Is that right?

This feels like it may be best handled by we truncate the file if the process failed for whatever reason. I'm a little hazy on this though. Is the idea that this truncation happens in step 3, which when interrupted, means step 4 picks up the truncated file?

Copy link
Contributor Author

@ehuss ehuss Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and the dep-info exists? Is that right?

I don't think it matters whether or not rustc emitted the .d file. Cargo doesn't translate it into the fingerprint directory unless the compilation finished successfully.

The issue is that there is a valid fingerprint from the previous run (step 1). During step 3, the new unit integration test executable is created but not completed. In step 4, all components of the unit integration test fingerprint are fresh from step 1 (nothing changed), and the mtime comparison of the incomplete executable is newer than the library's rlib (and all the unit integration test source files), so it thinks it is fresh.

truncate the file if the process failed

I don't think we can only truncate on failure, because Cargo is forcefully terminated. (I guess we could catch signals but that doesn't help with SIGKILL, and I'd prefer not to deal with signals.)

Is the idea that this truncation happens in step 3, which when interrupted, means step 4 picks up the truncated file?

Yes, step 4 compares an empty string with the expected hash string, and determines it is dirty because they are not equal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man see this is when I have mega-doubts about everything. Doing anything with mtimes is so tricky that it seems like a folly almost. In any case agreed we shouldn't catch signals (was misinterpreting this for a second as failed rustc vs interrupted cargo).

I think the way we should think about this is that when we start a compilation we delete the expected outputs. That way if something bad happens, the next time we pick up the outputs won't exist and we'll see we need to recompile. This only works if the outputs are Cargo-created, not well-known files (like the linked artifact). The dep-info here, consequently, is a Cargo-output which only Cargo creates, so it makes sense to delete it before a build and create it after a successful build.

Would it perhaps make more sense to delete the file here rather than truncating it to zero bytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason it truncates instead of deleting is that the fingerprint debug logging doesn't work in the scenario:

  1. Successful build.
  2. Change something.
  3. Build with a compile error. Logs fingerprint reason. Delete fingerprint.
  4. Build again. Fingerprint log just says "file not found" with no other detail.

I wanted to make sure step 4 continued to log the real reason for the fingerprint error (like "current filesystem status shows we're outdated") rather than "file not found".

I don't think it is too important (the first build failure logs the real reason), so I can switch it to delete if you prefer. I'm curious what concerns you have about truncating, though?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah that makes sense to me, so I think truncation here is fine. Could you update the comment with some of this discussion though? (about what files are flowing into what steps and why we're truncating instead of deleting).

I'm slightly confused again though. I would have thought that step 4 can't read the fingerprint file of step 1 (without this patch) because the mtime of the input files (from step 2) shows that the fingerprint file is stale. The mtime of the file from step 2 would be later than that of the fingerprint file from step 1, right?

(or do we write out the fingerprint file again in step 3?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking mtime is done at the same time as checking the fingerprint hash here. So in a sense it checks both the fingerprint hash and the fs_status at the same time. If either of them fail, it falls through to the code below to log the failure.

If it deleted the file, the first line in compare_old_fingerprint would fail, circumventing the logging.

(Fingerprints are only written after a successful build.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er sorry I was roundaboutedly asking a question on your response above where you said

The issue is that there is a valid fingerprint from the previous run (step 1). During step 3, the new unit test executable is created but not completed. In step 4, all components of the unit test fingerprint are fresh from step 1 (nothing changed), and the mtime comparison of the incomplete executable is newer than the library's rlib (and all the unit test source files), so it thinks it is fresh.

You say "all components of the unit test fingerprint are fresh from step 1" but I thought that the final fingerprint file that Cargo writes for the executable woul dnot be fresh, right? The executable is there (and corrupt) but a source file is still newer than the final fingerprint file because we don't write that until rustc finishes I thought?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I made a typo in my reply. Where I said "unit test" I meant "integration test". Going back to the original example, "lib.rs" is changed, and compiles successfully. Then Cargo goes to build the integration test executable, and is interrupted. The next time around (step 4) "lib.rs" is up-to-date (and correct on-disk). And none of the integration test sources have changed, so that check appears up-to-date as well.

That was a bit of a subtle point, I'll emphasize that in the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit with extra details. Hopefully it's not too confusing, I tried to be more precise with the wording.

@alexcrichton
Copy link
Member

@bors: r+

Ah ok that was a key part I was missing. This workaround isn't necessary for a one-unit compilation, but it's required for multi-unit compilations where you don't change the source of previous compilations. This happens because we don't hash the contents of each output, we just rely on mtimes. In any case looks good to me :)

@bors
Copy link
Collaborator

bors commented Apr 10, 2020

📌 Commit 14e86cc has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 10, 2020
@bors
Copy link
Collaborator

bors commented Apr 10, 2020

⌛ Testing commit 14e86cc with merge 53b1c48...

@bors
Copy link
Collaborator

bors commented Apr 10, 2020

☀️ Test successful - checks-azure
Approved by: alexcrichton
Pushing 53b1c48 to master...

@bors bors merged commit 53b1c48 into rust-lang:master Apr 10, 2020
@bors bors mentioned this pull request Apr 10, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 14, 2020
Update cargo

12 commits in 390e8f245ef2cd7ac698b8a76abf029f9abcab0d..74e3a7d5b756d7c0e94399fc29fcd154e792c22a
2020-04-07 17:46:45 +0000 to 2020-04-13 20:41:52 +0000
- Update dependencies to support illumos target (rust-lang/cargo#8093)
- Whitelist another known spurious curl error (rust-lang/cargo#8102)
- Fix nightly test matching rustc "warning" output. (rust-lang/cargo#8098)
- Update default for codegen-units. (rust-lang/cargo#8096)
- Fix freshness when linking is interrupted. (rust-lang/cargo#8087)
- Add `cargo tree` command. (rust-lang/cargo#8062)
- Add "build-finished" JSON message. (rust-lang/cargo#8069)
- Extend -Zpackage-features with more capabilities. (rust-lang/cargo#8074)
- Disallow invalid dependency names through crate renaming (rust-lang/cargo#8090)
- Use the same filename hash for pre-release channels. (rust-lang/cargo#8073)
- Index the commands section (rust-lang/cargo#8081)
- Upgrade to mdBook v0.3.7 (rust-lang/cargo#8083)
@ehuss ehuss added this to the 1.44.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Killing cargo test midway can leave target in unstable stage
4 participants