Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use cache urls from opam config #337

Merged
merged 2 commits into from
Nov 9, 2022

Conversation

hannesm
Copy link
Contributor

@hannesm hannesm commented Oct 14, 2022

fixes #336 (it now respects the archive-mirrors in ~/.opam/config -- thanks to @kit-ty-kate)

@Leonidas-from-XIV
Copy link
Member

Hi @hannesm, thanks for the PR. If I understand it correctly, pull_tree will get the tarballs from a different URL in that case using some kind of mapping scheme from the URL to the tarball (I have a hard time finding documentation on archive-mirrors). Can you add an entry to the changelog to explain the change from a user POV? That will also make the CI happy.

I wonder: can you use the archive-mirrors setting to redirect the download to a local folder? That could be extremely useful for the tests, because then we could actually test pull without having to hit the internet, by setting up a mirror and using that to pull the tarballs.

@hannesm
Copy link
Contributor Author

hannesm commented Nov 9, 2022

If I understand it correctly, pull_tree will get the tarballs from a different URL in that case using some kind of mapping scheme from the URL to the tarball

Yes and no, so it'll use the cache offered by opam instead of "only the main source of the tarball". Since the tarballs should have checksums assigned, it shouldn't matter where you get the data from. :) (but in CI systems and in remote locations, using a local mirror // something that doesn't overload github's restrictions is a fine way to save bandwidth)

(I have a hard time finding documentation on archive-mirrors).

So, archive-mirrors can be defined in your opam config (apart from some issues in the 2.1 series, see ocaml/opam#4411 ocaml/opam#4830), and/or in the repo file of the repository (see for documentation
https://opam.ocaml.org/doc/Manual.html#repofield-archive-mirrors https://opam.ocaml.org/doc/Manual.html#configfield-archive-mirrors). Opam first tries to download from the archive-mirror (e.g. opam.ocaml.org sets it to cache, so the default cache for the default repository is https://opam.ocaml.org/cache) by using the URL <hash-algo>/<first 2 chars of the hex-encoded hash>/<full hex-encoded hash>.

I wonder: can you use the archive-mirrors setting to redirect the download to a local folder?

I don't know, try it out.

@Leonidas-from-XIV
Copy link
Member

Well, it sounds like a reasonable change (I first was worried that it would rewrite the URLs in the lockfile but as the cache URL switcheroo happens on pull and not lock it still yields reproduceable lockfiles), so if you add a changelog entry I think it is ok to merge.

I'll evaluate whether local URLs can be used.

@Leonidas-from-XIV
Copy link
Member

I'll evaluate whether local URLs can be used.

I tested with a local (file system) OPAM repo with

opam-version: "2.0"
archive-mirrors: "cache/"

coupled with a tarball in the location you described and it worked right out of the box. Perfect, this will allow us to test pull in a sandbox environment by just preparing a cache with tarballs.

@hannesm
Copy link
Contributor Author

hannesm commented Nov 9, 2022

Great to hear that @Leonidas-from-XIV. I pushed a changes entry.

@Leonidas-from-XIV Leonidas-from-XIV merged commit 94b1087 into tarides:main Nov 9, 2022
@Leonidas-from-XIV
Copy link
Member

Thanks for the contribution!

@hannesm
Copy link
Contributor Author

hannesm commented Nov 10, 2022

Sorry to bother you, but it looks like this now leads to opam-monorepo that does not compile anymore, the reason is a bad rebase / conflict between #317 being merged (especially 78000f4 "Use an explicit cache to avoid using the global cache").

With the current HEAD and the commit before my change, I understand the codebase even less. Would you mind to clarify why "use an explicit cache" is the desired behaviour? Who's going to support that argument (with a Fpath.t)? Why does opam.mli now contain pull_tree_with_cache and pull_tree, which of the functions is actually used and why? I guess I just do not understand why not to use the opam cache...

@Leonidas-from-XIV
Copy link
Member

This commit is to optimize the retrieval of repositories. Originally the way to use custom repos was to opam repo add (and let opam deal with caching it) but people tended to forget it (especially as opam-overlays is often needed) and in some cases people just added it for the run of opam-monorepo, then immediately removing it (I believe that this was the case for mirage at some point).

So x-opam-monorepo-opam-repositories was added which would ignore the repos in the OPAM state (thus improving reproducibility, since the documentation which repo the package info from came from was in the lockfiles) and explicitly specify them from that option. But since we don't want to re-download potentially big git repos like opam-repository every time you run opam-monorepo lock this commit introduced a local cache for these repos that was not part of OPAM.

Leonidas-from-XIV added a commit to Leonidas-from-XIV/opam-monorepo that referenced this pull request Nov 10, 2022
Merging tarides#337 lead to a case where the merging didn't yield a working
solution.

This PR fixes it by changing the code to:

- Retrieve the cache urls from the `global_state` as tarides#337 does
- Do not pass cache urls if we're not dealing with an OPAM state
@Leonidas-from-XIV
Copy link
Member

I have created #345 to fix the main branch for now.

@hannesm
Copy link
Contributor Author

hannesm commented Nov 10, 2022

Thanks. I still don't quite understand which code path will be taken at what point and why. My concern is mainly that local caches are ignored, although they're present in ~/.opam/config. Given that the workflow of mirage indeed is to opam repo add when opam-monorepo is being called (and opam repo remove afterwards), I guess that will be ok.

Good news, I tested your PR with mirage 4.2.1 and it uses the archive-mirrors cache that I have in my ~/.opam/config.

@Leonidas-from-XIV
Copy link
Member

The local cache will be ignored when locking when you specify x-opam-monorepo-opam-repositories when locking, as it will clone the git repos into its own cache directory.

Generally there are two modes: "use the OPAM state" and "configure stuff in opam-monorepo", whereas we try to push people to the latter, since that will yield more reproducible lock files, since the OPAM state can't be easily controlled whereas you can specify a revision of the git repos in x-opam-monorepo-opam-repositories.

samoht added a commit to Leonidas-from-XIV/opam-monorepo that referenced this pull request Nov 11, 2022
Merging tarides#337 lead to a case where the merging didn't yield a working
solution.

This PR fixes it by changing the code to:

- Retrieve the cache urls from the `global_state` as tarides#337 does
- Do not pass cache urls if we're not dealing with an OPAM state
Leonidas-from-XIV added a commit to Leonidas-from-XIV/opam-repository that referenced this pull request Nov 21, 2022
CHANGES:

### Added

- Add support for specifying remote URLs in `x-opam-monorepo-repositories`
  (tarides/opam-monorepo#284, tarides/opam-monorepo#317, @Leonidas-from-XIV)

### Fixed

- Enable locking of packages with depexts even with uninitialized system
  package manager state (tarides/opam-monorepo#322, @Leonidas-from-XIV)
- Fix a bug where `pull` would crash if the lock file contained no package to
  vendor (tarides/opam-monorepo#321, @NathanReb)
- Display a better error message when the depext command fails when getting the
  status of the packages (tarides/opam-monorepo#258, tarides/opam-monorepo#323, @RyanGibb, @Julow)
- Take `archive-mirrors` from the global opam configuration into account to
  allow more local caches (tarides/opam-monorepo#337, @hannesm)
- Log at WARN level when opam-monorepo chooses a source for a package that
  doesn't match the package's version (tarides/opam-monorepo#352, @reynir)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The repository "archive-mirrors" is not respected in a pull, thus downloads are always from the Internet.
2 participants