-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bundler design document #565
Conversation
999eefb
to
b06b948
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of review (made it to the Checksums section) 🙂
My general observation is that it's kind of hard to derive implementation guidance from the doc because it's not always clear which parts are just general commentary on RubyGems/Bundler and which parts describe the implementation decisions
67d7eca
to
9b22d17
Compare
7089777
to
9064ba5
Compare
docs/design/rubygems.md
Outdated
### Checksums | ||
|
||
Checksum validation is enabled by default. | ||
It can be disabled with the `BUNDLE_DISABLE_CHECKSUM_VALIDATION` environment variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I understand it correctly, bundler does do checksum validation even if there are no CHECKSUMS in the lockfile, but it just validates that the downloaded archive matches the checksum reported by the server (similar to how PyPI reports the expected checksums for archives and we use that to filter out non-matching wheels)
I realize now that we lose even that small validation in Cachito, where we just don't validate any checksums at all. Which is not ideal, but we don't have to fix it right away in cachi2. A follow-up story to implement this checksum validation would be nice though.
I noticed that it is necessary to add This would force the user to modify his Dockerfile, which is still something we don't require with Cachi2. I wonder if there's a way to work around that. |
I did a small test trying to build Porta with this PoC here (while disabling the Yarn deps 🙈). The prefetch was successful, but the build failed because it was trying to reach out to the Internet to grab the git dependencies. Were you able to successfully prefetch and build any project that has git deps? I think we should enhance our cachi2-rubygems repo to include at least one git dep. |
I know; that was copied from cachito. Git dependencies must have this format (I did not implement it in the PoC): EDIT:
I tried to set |
The deprecation warning worries me a little. Also, the official docs seem to indicate that this is a required setting. In any case, I think we can add it as an option in the design doc and leave it up for discussion. I wonder if turning all dependencies into path dependencies would work? This is a much more invasive change, and I would not like to turn to it, though. I also noticed that |
Oh, I also did a small test with a git dependency and I was able to build the image. The thing that was not clear to me immediatelly was that the repository files need to be unpacked in a folder following the naming format you mentinoned. Since what the prefetch does is to download a tarball of the repo+ref, it took me some time to figure it out. Maybe we can add a note in the design doc about this? |
c6401e8
to
7c40dc7
Compare
So I dropped PoC from this PR, and with @brunoapimentel, we figured out that using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add steps for how to locally try out the proposed solution?
There have been some fairly significant changes since the last state and I would like to test them
You can play around with my repo: |
@slimreaper35 & @brunoapimentel; Seems like we should also have a #### Out of scope
- plugins
- pre-compiled binaries
- checksum verification
- dev dependencies
- prefetching Bundler section at the end of the doc, to complete the summary. |
docs/design/rubygems.md
Outdated
- **Gemfile.lock**: A file that locks the versions of gems that are installed for your project. Bundler uses this file | ||
to ensure that the correct versions of gems are installed consistently across different environments. | ||
|
||
- **RubyGems**: General package manager for Ruby. Manages installation, updating, and removal of gems globally on your |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: is it correct to call this RubyGem support if only intend to talk to bundler
at best (with good chunk of logic done in our code directly)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we can simply change this to Bundler. What we are proposing does not work with gem
(the RubyGems CLI) standalone.
@slimreaper35 @a-ovchinnikov If you both agree, I can make the changes right away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will rename rubygems
to bundler
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I'll update the design doc as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MartinBasti FYI for the cachi2 OSBS compatibility layer
docs/design/rubygems.md
Outdated
## II. Overview of the current implementation in Cachito | ||
|
||
[cachito/workers/pkg_mangers/rubygems.py](https://github.com/containerbuildsystem/cachito/blob/master/cachito/workers/pkg_managers/rubygems.py) | ||
|
||
Most work is already done by parsing the `Gemfile.lock` file, which pins all dependencies to exact versions. The only | ||
supported source for gem dependencies to be fetched from is <https://rubygems.org>. Git dependencies are specified | ||
using a repo URL and pinned to a commit hash. Path dependencies are specified using a local path. | ||
|
||
To avoid arbitrary code execution, Bundler **is not used** to download dependencies. Instead, as stated above, Cachito | ||
parses `Gemfile.lock` file directly and download the gems from <https://rubygems.org>. | ||
|
||
**Note**: parsing `Gemfile.lock` is done via [gemlock-parser](https://github.com/containerbuildsystem/gemlock-parser), | ||
which is vendored from | ||
[scancode-toolkit](https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/gemfile_lock.py). | ||
|
||
Source code for "official" Bundler lockfile parsing in Ruby: | ||
<https://github.com/rubygems/rubygems/blob/master/bundler/lib/bundler/lockfile_parser.rb> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're adding this to cachi2 and yes, much of the logic will be ported, but we don't need to mention cachito here, we're trying hard to drop all references to it ultimately to just mentioning in the main README that "origin of this project was cachito, link here..." but I don't think this should be in the design doc since we probably do end up making different decisions and the implementation will also be different in many cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this specific section brings too much value besides linking the main reference we had, the implementation done in Cachito. So I'd be fine with droping it, but I still think there's value to link previous work, specially considering that there was much research done back then.
Why do you think it's bad to reference Cachito? I think it's ok to say we are taking a different path with Cachi2, and the work done before is merely a reference and a starting point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This,
there was much research done back then
and the fact that we often borrow starting code from Cachito, make it a good idea to at least provide links.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original project is to be sunset in the future with this one being its replacement. I don't see value in putting many references to the old project we don't really want to maintain live connections to, so while I'm okay stating that parts of the code were borrowed, linking at too many places doesn't IMO achieve what we're ultimately trying to do, because the code, yet borrowed, has to be often significantly adapted to this very code base, so providing links to code which will be 75% reworked is IMO misleading and confusing. It should only ever be nothing more than a kudos/credit to where we got inspired and that's it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I guess we disagree on this point. I do agree that this section as it was did not bring too much value, so I moved the content to a "References" section in the end of the document, and made it shorter to just convey the credits and links to the original project.
ca13a30
to
7ad1e79
Compare
New push:
|
Drop the gitignore commit since it is already in a separate PR. |
This is the first commit that starts implementation of bundler package manager introduced in: containerbuildsystem#565 By adding a package manager to dev section, we expose it from the user for a period of time until the package manager is fully supported. Signed-off-by: Michal Šoltis <[email protected]>
/retest |
Design document that covers the planned implementation of Bundler¹ support in Cachi2. Note that this design doc draws inspiration for the implementation done in Cachi2, but takes a different approach when it comes to providing the content for performing the hermetic builds and for the SBOM data generation. Since design documents are meant to help all contributors to understand the implementation and logic behind any feature, we have decided to add them as part of the repository's documentation, starting from this one. ¹ https://bundler.io/ Signed-off-by: Michal Šoltis <[email protected]> Co-authored-by: Bruno Pimentel <[email protected]>
20eb5f0
This is the first commit that starts implementation of bundler package manager introduced in: containerbuildsystem#565 By adding a package manager to dev section, we expose it from the user for a period of time until the package manager is fully supported. Signed-off-by: Michal Šoltis <[email protected]>
This is the first commit that starts implementation of bundler package manager introduced in: #565 By adding a package manager to dev section, we expose it from the user for a period of time until the package manager is fully supported. Signed-off-by: Michal Šoltis <[email protected]>
Maintainers will complete the following section
Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:
/ok-to-test
(as is the standard for Pipelines as Code)