Stack broken by latest integer-gmp #3624

Closed
TravisWhitaker opened this issue Dec 4, 2017 · 20 comments

@TravisWhitaker

TravisWhitaker commented Dec 4, 2017

A few hours ago integer-gmp-1.0.1.0 was uploaded to Hackage. The new package's cabal file uses the new ^>= syntax for version specification in its build-depends field. The stable Stack release (v1.5.1) doesn't use Cabal 2, so if you're working with a fresh index, stack (any operation but stack update, it seems) will fail with Unable to parse cabal file for integer-gmp-1.0.1.0: NoParse "build-depends" 58. This seems to occur in all of my stack projects, no matter which snapshot is selected.
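
For readers unfamiliar with the ^>= operator: as documented for Cabal 2.0, `^>= x.y.z` is shorthand for `>= x.y.z && < x.(y+1)`. The following is a minimal Python sketch of that expansion, not Cabal's own implementation. The function names and the example dependency line are hypothetical and are not copied from the actual integer-gmp cabal file.

```python
from typing import Tuple


def caret_bounds(version: str) -> Tuple[str, str]:
    """Return (lower, upper) so that '^>= version' means '>= lower && < upper'.

    Assumes a plain dotted numeric version such as '0.5.1.0'.
    """
    parts = [int(p) for p in version.split(".")]
    if len(parts) == 1:
        # '^>= x' is read as '>= x && < x.1'
        upper_parts = [parts[0], 1]
    else:
        # '^>= x.y...' is read as '>= x.y... && < x.(y+1)'
        upper_parts = [parts[0], parts[1] + 1]
    upper = ".".join(str(p) for p in upper_parts)
    return version, upper


def expand_caret(constraint: str) -> str:
    """Rewrite 'pkg ^>= v' using only pre-Cabal-2.0 operators.

    Assumes the constraint has exactly three whitespace-separated tokens;
    constraints using any other operator are returned unchanged.
    """
    pkg, op, version = constraint.split()
    if op != "^>=":
        return constraint
    lower, upper = caret_bounds(version)
    return f"{pkg} >= {lower} && < {upper}"
```

Under these assumptions, a hypothetical line such as `ghc-prim ^>= 0.5.1.0` would expand to a lower bound of 0.5.1.0 and an upper bound of 0.6. A parser that predates Cabal 2.0 has no rule for the operator at all, which is consistent with the NoParse error above.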

To reproduce (in some stack project's directory):

some/stack/project $ cp -r ~/.stack ~/.stack_known_good
some/stack/project $ rm -r ~/.stack
some/stack/project $ stack build
Downloaded nightly-2017-11-24 build plan.
Selected mirror https://s3.amazonaws.com/hackage.fpcomplete.com/
Downloading root
Selected mirror https://s3.amazonaws.com/hackage.fpcomplete.com/
Downloading timestamp
Downloading snapshot
Downloading mirrors
Cannot update index (no local copy)
Downloading index
Updated package list downloaded
Populated index cache.
Unable to parse cabal file for integer-gmp-1.0.1.0: NoParse "build-depends" 58

I'll note that all of the stack projects I've tried this with have non-empty extra-deps fields; perhaps stack functions correctly without checking the Hackage index if there are no extra-deps specified? Seems this happens even without any extra-deps.

If I've got something wrong here or if someone is aware of a workaround that isn't copying another machine's index, please do share.

@TravisWhitaker
Author

#3464 mentions that Cabal 2 syntax is not allowed in Stackage packages. However, integer-gmp-1.0.1.0 appears in the 12/01 nightly snapshot. Either integer-gmp-1.0.1.0 was re-uploaded, or some manual adjustment or exception was made when this version was added to Stackage.

@snoyberg
Contributor

snoyberg commented Dec 5, 2017

The problem is that integer-gmp is included with GHC itself. This has a long history, quick summary:

In other words, this will be fixed permanently with the upcoming release.

@TravisWhitaker
Author

TravisWhitaker commented Dec 5, 2017 via email

@snoyberg
Contributor

snoyberg commented Dec 5, 2017

Maybe I misunderstood. I thought this would only cause trouble if you try using a newer Stackage Nightly snapshot. Are you saying that Stack is broken for all cases, even if you aren't using a GHC 8.2.2 snapshot?

@TravisWhitaker
Author

Ah, it seems if you're on a snapshot with GHC 8.0.2 or earlier things are fine (makes sense seeing #3396); a colleague reported this breakage with LTS 9 but I wasn't able to reproduce it just now.

I do get this issue with the 8.2.1 nightlies (11/24 and previous). If integer-gmp-1.0.1.0 is meant for GHC 8.2.2, it's strange that it appears in the 11/24 snapshot, but perhaps I'm missing something.

If I understand correctly, the Cabal 2 syntax will make stack 1.5.1 effectively unusable with the GHC 8.2.* snapshots, is that correct? What a mess this whole ^>= business seems to have caused.

@snoyberg
Contributor

snoyberg commented Dec 5, 2017

I'm confused about integer-gmp-1.0.1.0, I don't know how it would have affected the 8.2.1 nightlies without further research. But the basic idea is: if a package in a snapshot takes advantage of a new Cabal feature, then older Stacks will have no way of using it. We can try to work around it by blocking such packages in Stackage, but we can't do anything about packages shipped with GHC itself.

I've requested that some time be taken to allow tooling to upgrade before we start using new features, but that request was denied.

What a mess this whole ^>= business seems to have caused.

You don't know the half of it :( For example, see #3464.

@tfausak
Contributor

tfausak commented Dec 6, 2017

I just ran into this problem. I'm on Windows using Stack 1.5.1 and the nightly 2017-12-01 resolver. I understand that integer-gmp is at fault for using the new ^>= bounds, but I can't find a bug tracker for that package. I worked around this by switching to the Stack 1.6.1 release candidate, which handled this perfectly.

@austinvhuang

austinvhuang commented Dec 6, 2017

We're hitting this problem with integer-gmp-1.0.1.0 in our CI as well. As with the OP we also seem to observe this even when reverting to the 11/24 8.2.1 nightly (which had previously been working).

@tfausak
Contributor

tfausak commented Dec 6, 2017

I created a ticket on GHC's Trac for this: https://ghc.haskell.org/trac/ghc/ticket/14558

@steshaw

steshaw commented Dec 6, 2017

How do I subscribe to GHC#14558 over on GHC Trac?

@snoyberg
Contributor

snoyberg commented Dec 6, 2017

@steshaw Click on "Modify Ticket" and add your name to the "CC" field, then click "Submit changes." And for the record: I find that workflow really confusing :)

@steshaw

steshaw commented Dec 6, 2017

Thought I'd pick up on @TravisWhitaker's note about broken reproducibility. We have a working build of our app on CircleCI using resolver: nightly-2017-12-01. The build was for testing our code against GHC 8.2.2. The earliest successful build was at Tuesday, December 5, 2017 at 3:12:14 PM UTC+10. We haven't experienced any build failures on CircleCI to date. The reason seems to be that we use CircleCI's caching facility to cache the directories between builds:

  - ~/.stack
  - .stack-work/downloaded
  - .stack-work/install

I only noticed this integer-gmp-1.0.1.0 problem when doing a local docker build in a fresh container. My local macOS build also works without error because — I guess — of cached metadata in ~/.stack. Today, I was able to reproduce the error on CircleCI by doing "Rebuild without cache".

So, I imagine that the metadata for integer-gmp-1.0.1.0 has been very recently updated on hackage. If that's the case, could this problem be solved by reverting that recent change? BTW, is there a way to view the history of metadata changes on hackage? It would help to confirm the theory.

If I'm not mistaken, the issue of reproducibility in the face of hackage metadata updates has been raised in #2217. Perhaps this problem "in the wild" with integer-gmp-1.0.1.0 can raise the priority of that issue. After all, "What makes stack special? The primary stack design point is reproducible builds".

@tfausak
Contributor

tfausak commented Dec 6, 2017

integer-gmp-1.0.1.0 has not been modified since it was uploaded at 2017-12-04T18:44:27Z. If there were revisions, you could see them here: https://hackage.haskell.org/package/integer-gmp-1.0.1.0/revisions/

I didn't think about this before, but the recent upload date (two days ago) made me think of something. I pinged @hvr on Twitter two days ago because I didn't see base-4.10.1.0 (i.e., GHC 8.2.2) on Hackage. I didn't get any explanation for why it was missing or confirmation when it was uploaded, but it showed up at 2017-12-04T19:37:03Z. That's a little less than an hour after integer-gmp-1.0.1.0 was uploaded.

I don't have an explanation for why the new version of integer-gmp broke Stack builds, but the new Hackage index must have something to do with it. It's just weird that the state of the Hackage index would affect a package that's wired in.

@steshaw

steshaw commented Dec 6, 2017

Thanks, @tfausak, there doesn't seem to be a link to /revisions on hackage.

So, I take it that the -r0 revision is the initial upload to hackage? My theory went out the window! Very perplexing.

@tfausak
Contributor

tfausak commented Dec 6, 2017

The link only shows up if there are any revisions. For example, integer-gmp-1.0.0.1 has a link in the "Updated" section.

-r0 is the initial revision. Revisions start with the original at 0 and increase by 1 each time: -r0, -r1, -r2, and so on. I have no idea why they're shown as -r0 instead of simply 0.

You can read more about revisions in the hackage-trustees repository. I'm not a fan of them, but they didn't cause this particular problem.

@mgsloan
Contributor

mgsloan commented Dec 7, 2017

Stackage currently specifies specific revisions as well, so revisions shouldn't break reproducibility.

I'm also unsure of how this made it into the nightlies. Guessing the process there already used Cabal-2.0. Ideally the process would use the latest stable stack.

lexi-lambda added a commit to lexi-lambda/freer-simple that referenced this issue Dec 7, 2017
@snoyberg
Contributor

snoyberg commented Dec 7, 2017

The ChangeLog update included in #3304 explains what's happening here: https://github.com/commercialhaskell/stack/pull/3304/files#diff-e705c8fadf1193ab59443a5e6c8cbe8bR4.

GHC includes a number of libraries in its package database, including base, ghc, and integer-gmp. Stack prior to 1.6 (read: any currently released Stack) had a bug where it would try to get some metadata about these libraries from their cabal files instead of directly from the package database. Historically, this has never been a problem, which is why it's survived in Stack for so long. The reason is that, historically, GHC-shipped packages did not use bleeding-edge features in their cabal files.

When GHC 8.2.1 was released, the ghc.cabal file did something new: it used a feature of the newly released Cabal 2.0 library and required the new Cabal 2.0 file format. This occurred before Stack had a chance to upgrade to Cabal-the-library 2.0, and for that matter before cabal-install 2.0 was released. In other words: at the time the file was placed on Hackage, no officially released version of any common tool supported it.

For unrelated reasons, I'd already fixed this bug on master as part of a refactoring. Strangely enough, that refactoring had to do with problems with revisions. Thanks to the revision system, it's not possible to rely on cabal files on Hackage to tell you anything about GHC-installed packages, since we can't know for certain which revision was used to build the package. (The situation with integer-gmp is even worse in this regard, since the uploaded cabal file appears nowhere in GHC's Git repository. It seems to have been manually altered for the purposes of uploading to Hackage, for reasons that are not clear.)

Anyway, with GHC 8.2.1, we released an emergency patch release of Stack to work around this situation and simply ignore parse failures from ghc.cabal. We did not embark on a bigger fix because:

  1. A bigger fix would involve much more code change, introducing the chance for regressions
  2. We already had a fix on master, and knew that Stack 1.6 would be released before GHC 8.4

What we didn't anticipate was that a package that had been part of GHC's installed database would be manually modified and uploaded to Hackage after the fact. Before this upload, the missing integer-gmp.cabal file was simply ignored by Stack. Once it was uploaded, Stack (again, as a bug) tried to parse it, failed, and gave up.
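
The sequence described above can be modeled with a toy Python sketch. This is an illustration of the behavior as explained in this comment, not Stack's actual code: the function names, the dictionary-based "index" and "installed database", and the sample cabal text are all hypothetical.

```python
class NoParse(Exception):
    """Stands in for the parse failure an older cabal parser reports."""


def parse_cabal_file(text, supports_caret=False):
    # Toy parser: rejects the ^>= operator unless Cabal 2.0 syntax is supported.
    if "^>=" in text and not supports_caret:
        raise NoParse("build-depends")
    return {"raw": text}


def metadata_pre_1_6(name, index, installed_db):
    # Buggy order (as described): consult the Hackage index first, even for
    # packages that ship with GHC. A missing index entry is silently ignored.
    if name in index:
        return parse_cabal_file(index[name])
    return installed_db.get(name)


def metadata_fixed(name, index, installed_db):
    # Fixed order: GHC-shipped packages are answered from the installed
    # package database, so their cabal files on Hackage are never parsed.
    if name in installed_db:
        return installed_db[name]
    if name in index:
        return parse_cabal_file(index[name])
    return None


# Hypothetical data: the installed database never changes, the index does.
installed = {"integer-gmp": {"version": "1.0.1.0"}}
index_before_upload = {}
index_after_upload = {"integer-gmp": "build-depends: ghc-prim ^>= 0.5.1.0"}
```

With `index_before_upload`, the pre-1.6 lookup falls through to the installed database and succeeds. With `index_after_upload`, the same lookup raises `NoParse`, while `metadata_fixed` is unaffected. A stale cached index behaves like `index_before_upload`, which would explain why cached CI builds kept passing.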

Finally: the general case of Stack being unable to parse some snapshots is not a bug, it is guaranteed to happen. At some point in the future, Stackage will allow Cabal 2.0-formatted cabal files into snapshots, and then by design Stack 1.5 and earlier will be unable to parse those files. That's unfortunate, but expected. What's unexpected in this case was that (1) these cabal files slipped into a snapshot through the back door (GHC's package database) so quickly, before Stack 1.6 was out the door, and (2) that actions taken post-GHC release (a new upload of integer-gmp.cabal) could affect existing snapshots.

So in sum:

  1. There's a bug in Stack, triggered by new behavior not seen before by GHC
  2. That bug affects reproducibility, because an upload to Hackage in the future (or a revision for that matter) can break existing build plans
  3. This bug is fixed on master fully (AFAICT, we've added an integration test to check for regressions)
  4. Instead of putting out another emergency Stack 1.5 patch for integer-gmp.cabal, we're going to get Stack 1.6 out the door ASAP

I hope that clarifies. This is definitely an unfortunate situation, and I know it's screwed up people's development. If it's any consolation, Cabal 2.0 related issues have taken up a larger portion of my life than I ever would have hoped for.

@steshaw

steshaw commented Dec 7, 2017

Hi @snoyberg, thanks for the detailed explanation. I was digging around and also noticed that integer-gmp.cabal in the ghc repository doesn't have any version bounds for ghc-prim. I figure it must be a normal part of the GHC release process, though, to add version bounds to the core libraries' dependencies when they are uploaded to hackage (because I noticed that earlier versions all have some bounds on ghc-prim, and other core libraries like bytestring have version bounds on dependencies on hackage). This doesn't seem to be consistent across all the core libraries though. Unfortunately, this part of the release process isn't well documented.

I'm still curious why I got an initial green build. This build happened after the upload of integer-gmp-1.0.1.0 to hackage. I assume the package wasn't yet present on stack's package index mirror (not sure if I'm using the correct terminology here). However, I don't know where to check the timestamp on that synchronisation.

BTW, this bug hasn't been a hassle for our development here. We were just testing our code against GHC 8.2.2 for hopeful upgrade of our master branch at some point soon (once a Stackage LTS is out). GHC 8.2.2 does compile our Yesod-based app a bit quicker so we're looking forward to it 😄 . I've also tested our build with stack upgrade --git on CircleCI (without caches) and can confirm that it builds fine.

@snoyberg
Contributor

snoyberg commented Dec 7, 2017

Most likely the reason it only popped up locally is that, on CI, it was using a cached version of the 01-index.tar.gz file that did not have the new integer-gmp.cabal file. Stack will automatically update, but does so lazily.

lambdageek added a commit to lambdageek/centrinel that referenced this issue Dec 7, 2017
lexi-lambda added a commit to lexi-lambda/freer-simple that referenced this issue Dec 7, 2017
@mgsloan
Contributor

mgsloan commented Dec 12, 2017

See haskell-infra/hackage-trustees#120 - happily, this is now resolved, as a revision to integer-gmp has been made - https://hackage.haskell.org/package/integer-gmp-1.0.1.0/revisions/ .

@mgsloan mgsloan closed this as completed Dec 12, 2017
stites pushed a commit to stites/hasktorch that referenced this issue Sep 9, 2018