Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge commits break LLVM CI download #101907

Open
jachris opened this issue Sep 16, 2022 · 23 comments · Fixed by #113588 or #115409
Open

Merge commits break LLVM CI download #101907

jachris opened this issue Sep 16, 2022 · 23 comments · Fixed by #113588 or #115409
Labels
C-bug Category: This is a bug. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)

Comments

@jachris
Copy link
Contributor

jachris commented Sep 16, 2022

While PRs are not allowed to have merge commits, they can still be useful locally (or for not rewriting history for longer-running branches).

However, they break the commit detection logic for downloading CI artifacts.

Steps to Reproduce

  1. git checkout <old-commit>
  2. git merge --no-ff master
  3. ./x.py build library/std

Expected Result

Builds fine.

Actual Result

...
downloading https://ci-artifacts.rust-lang.org/rustc-builds/<old-commit>/rust-dev-nightly-x86_64-unknown-linux-gnu.tar.xz
curl: (22) The requested URL returned error: 404      #                                                                               

error: failed to download llvm from ci

help: old builds get deleted after a certain time
help: if trying to compile an old commit of rustc, disable `download-ci-llvm` in config.toml:

[llvm]
download-ci-llvm = false

Build completed unsuccessfully in 0:01:43
@jachris jachris added the C-bug Category: This is a bug. label Sep 16, 2022
@Mark-Simulacrum
Copy link
Member

Can you say more about their usefulness? What are you merging in? We can maybe support merging non-bors commits, though it brings some complexity (e.g., which parent do we follow for the "nearest" bors commit, and what does nearest mean at that point), so in general I would be inclined to say this isn't supported and you should build locally.

@jachris
Copy link
Contributor Author

jachris commented Sep 16, 2022

I don't think it's a must-have. I'm personally fine with building locally or rebasing. But this would allow longer-running branches to use the CI artifacts without rewriting history due to rebasing. Also, the current behavior can be a little bit confusing if one is not aware of this issue.

@RalfJung
Copy link
Member

RalfJung commented Oct 6, 2022

Also, the current behavior can be a little bit confusing if one is not aware of this issue.

Yeah it is confusing. :D Could you explain what happens?
Is this about having to detect the "latest commit from rustc master that is in this branch", so that we can download the right LLVM, and that fails?

@jachris
Copy link
Contributor Author

jachris commented Oct 6, 2022

From what I can tell, yes. The logic that determines that commit seems to not account for merge commits. But @jyn514, who helped me troublshoot this issue initially, might know the details.

@RalfJung RalfJung changed the title Merge commits break CI download Merge commits break LLVM CI download Oct 22, 2022
@RalfJung
Copy link
Member

I now also ran into this because syncing Miri changes back into rustc creates merge commits. Not sure how to work around this problem; rebasing is not an option for these syncs and a local build takes forever...

Cc @jyn514

@RalfJung
Copy link
Member

Specifically that Miri history contains commits made by bors in Miri, which then fools the detection logic to think that was a rustc bors commit.

@RalfJung
Copy link
Member

Currently, bootstrap uses the following command to determine the commit to fetch LLVM for

git rev-list [email protected] -n1 --first-parent HEAD -- src/llvm-project src/bootstrap/download-ci-llvm-stamp src/version

I can see two ways of making things work even for the Miri sync situation:

  • Instead of -n1, use -n10 or so and try several commits. For the case I have right now, the 2nd commit should already work.
  • Instead of HEAD, use $(git merge-base HEAD origin/master). However we can't really know what the origin remote is called, so it'd be more like
git fetch https://github.com/rust-lang/rust master
git merge-base HEAD FETCH_HEAD

@jachris
Copy link
Contributor Author

jachris commented Oct 22, 2022

I think the second option would be a solid solution, and it also works for the usecase I had in mind.

(Side note: Though I'm sure why this uses --first-parent. If we are on master, there should be no merge commits, and this has no effect. If we are on another branch, it discards merges into that branch... which is basically the original issue.)

@RalfJung
Copy link
Member

On master there are tons of merge commits -- the ones created by bors. In fact the commit we are looking for is always a merge commit, so maybe we should add --merges.

@jachris
Copy link
Contributor Author

jachris commented Oct 22, 2022

Ah yes, right, of course. The "no merge-commit policy" is only for PRs... Nevermind.

@Mark-Simulacrum
Copy link
Member

Fetching introduces a network call on each x.py invocation, I think, since we need to know if something has changed. I guess we could try to cache based on the head sha, but I'm not generally a fan of that.

I think the right answer is probably to change bors to include a repository in the commits it generates more explicitly - perhaps the email could be [email protected], for example.

Would need some care to update all of our automation, but seems achievable.

@RalfJung
Copy link
Member

Yeah I agree having to invoke the network is not great. In principle we could search the user's remotes for one that points to our github repo and use $remote/master... but that seems pretty fragile.

Having different bors email addresses for different repos would help for the Miri case, yeah. And I guess for @jachris we could replace the --first-parent by --merges; the email filtering should be enough to trim down the candidates.

@Mark-Simulacrum I take it you are also not a fan of trying multiple commit candidates? I guess that would require caching negative lookups so it's also not great.

A temporary hack that would be helpful is some way to overwrite the HEAD commit from which it starts searching. For the Miri PR I just locally patched bootstrap to be able to do x.py check which I needed to update the lockfile.

@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Oct 22, 2022

Updating the lockfile should be cargo metadata >/dev/null or something like that, no need for x.py.

Yeah, multiple candidates I think has similar problems in terms of network hits (or negative cache of some kind).

--merges seems fine to me, but I'm not sure we can do away with --first-parent - for example, in the case of rollups, we need to be sure we're looking at the first parent commit to traverse the commit graph. I guess with -n1 that might not matter but it feels iffy to me.

Can you point me at the SHA/branch for the miri merges that are causing trouble? I wonder if the merge commits might be distinguishable by (for example) not having a tree associated with them that contains src/stage0.json or something like that.

@RalfJung
Copy link
Member

Yeah you can just check out the branch for #103392 to see the problem.

I wonder if the merge commits might be distinguishable by (for example) not having a tree associated with them that contains src/stage0.json or something like that.

No that wouldn't help... this is Miri history reconstructed into the rust repo, it also has all of the rest of the rust repo around.

Updating the lockfile should be cargo metadata >/dev/null or something like that, no need for x.py.

Ah, I never even tried doing anything outside of x.py in the rustc repo, it seemed hopeless. But anyway sometimes I might also have to actually build that branch.

@Mark-Simulacrum
Copy link
Member

Yeah, I see. OK, I think the best option then is going to be the adjustment to bors to highlight which repository the merge commit is from. We'll want to check against triagebot, crater, rustc-perf, cargo-bisect-rustc, and the promote-release infrastructure which I think are the infrastructure places that have some amount of processing of rustc commit history.

@jyn514 jyn514 added the T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) label Apr 27, 2023
@jyn514
Copy link
Member

jyn514 commented Apr 27, 2023

We don't need to fetch to support remotes named something other than origin, we can reuse the logic in

/// Finds the remote for rust-lang/rust.
/// For example for these remotes it will return `upstream`.
/// ```text
/// origin https://github.com/Nilstrieb/rust.git (fetch)
/// origin https://github.com/Nilstrieb/rust.git (push)
/// upstream https://github.com/rust-lang/rust (fetch)
/// upstream https://github.com/rust-lang/rust (push)
/// ```
pub fn get_rust_lang_rust_remote(git_dir: Option<&Path>) -> Result<String, String> {

  • $(git merge-base HEAD origin/master)

👍 this seems like a good solution

@jyn514 jyn514 added the E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. label Apr 27, 2023
@RalfJung
Copy link
Member

RalfJung commented May 11, 2023

This problem still regularly hits me when I synchronize Miri changes.

An LLVM_FETCH_COMMIT env var hack would help a lot to bridge the gap until a proper solution emerges.

EDIT: For the record, here's a hacky patch that helps:

diff --git a/src/bootstrap/download.rs b/src/bootstrap/download.rs
index 3e82a381a1b..7600365d29b 100644
--- a/src/bootstrap/download.rs
+++ b/src/bootstrap/download.rs
@@ -568,6 +568,7 @@ pub(crate) fn maybe_download_ci_llvm(&self) {
         let llvm_root = self.ci_llvm_root();
         let llvm_stamp = llvm_root.join(".llvm-stamp");
         let llvm_sha = detect_llvm_sha(&self, self.rust_info.is_managed_git_subrepository());
+        let llvm_sha = "PUT_SHA_HERE"; // output of `git merge-base HEAD origin/master` goes here
         let key = format!("{}{}", llvm_sha, self.llvm_assertions);
         if program_out_of_date(&llvm_stamp, &key) && !self.dry_run() {
             self.download_ci_llvm(&llvm_sha);

@RalfJung
Copy link
Member

RalfJung commented Aug 2, 2023

Reopened by #114345. I hope the perf team can help figure out what is going on; just calling git merge-base shouldn't cause any side-effects.

@RalfJung RalfJung reopened this Aug 2, 2023
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 2, 2023
Revert rust-lang#113588 to fix bootstrap timings

This reverts rust-lang#113588 which seems to have broken perf's bootstrap timings via some git issue

rust-lang#114318 (comment) show a newly broken benchmark, the error at the time was

```
fatal: Path 'src/ci/channel' exists on disk, but not in 'e62323df22ecf9c163023132d17b7114f68b72e8'.
       thread 'main' panicked at 'command did not execute successfully: cd "/home/collector/rustc-perf/rust" && "git" "show" "e62323df22ecf9c163023132d17b7114f68b72e8:src/ci/channel"
       expected success, got: exit status: 128', config.rs:1786:27
```

If this lands, it will reopen rust-lang#101907 and annoy miri, but it could actually be an issue that would appear during the next bootstrap bump, not just rustc-perf today.

r? `@ghost`
@bors bors closed this as completed in 585bb5e Sep 2, 2023
RalfJung pushed a commit to RalfJung/miri that referenced this issue Sep 3, 2023
bootstrap: use git merge-base for LLVM CI download logic

This re-lands rust-lang/rust#113588, now that the perf issues are hopefully fixed by rust-lang/rustc-perf#1684.
r? `@lqd` `@Mark-Simulacrum`

Fixes rust-lang/rust#101907
@RalfJung RalfJung reopened this Jan 4, 2024
@RalfJung
Copy link
Member

RalfJung commented Jan 4, 2024

This regressed again. :(

downloading https://ci-artifacts.rust-lang.org/rustc-builds/23efae075aaf57714e8403572f151491869273fb/rust-dev-nightly-x86_64-unknown-linux-gnu.tar.xz
curl: (22) The requested URL returned error: 404

ERROR: failed to download llvm from ci

    HELP: old builds get deleted after a certain time
    HELP: if trying to compile an old commit of rustc, disable `download-ci-llvm` in config.toml:

    [llvm]
    download-ci-llvm = false

@RalfJung
Copy link
Member

RalfJung commented Jan 4, 2024

It's picking up some bors commit from the Miri repo, I think.

This will be near impossible to fix in rustc, it needs bors changes: rust-lang/homu#200.

@Swatinem
Copy link
Contributor

I was also hit by this since quite a long time and was resorting to building LLVM from source.

Digging a bit deeper and dbg!-printing a couple of git commands, I stumbled across this comment here:

/// Returns the master branch from which we can take diffs to see changes.
/// This will usually be rust-lang/rust master, but sometimes this might not exist.
/// This could be because the user is updating their forked master branch using the GitHub UI
/// and therefore doesn't need an upstream master branch checked out.

As indeed, the detection logic was using my local upstream/master branch which I haven’t touched since 2021 in fact, as I was instead using the GitHub UI to keep my own origin/master in sync.

I wonder if its possible to detect this somehow that the rust-lang/rust remote is hopelessly outdated?

@RalfJung
Copy link
Member

Ah, that's annoying that Github guides users towards circumventing the normal Git workflows. :/

We could maybe detect when the detected upstream/master is more than a week old an run git fetch?

Though if bors was just putting a bit more information into the merge commit, none of this would be needed... these are all just hacks to work around not knowing which bors commits come from which repository.

@RalfJung
Copy link
Member

RalfJung commented Nov 9, 2024

Now that subrepos are not using bors any more, this should be much less of an issue -- the logic that searches for a "merge commit by bors" should no longer be confused by bors merge commits that were actually merging a Miri/clippy/... PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Projects
None yet
5 participants