Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for source-based packages #1972

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

NicholasBHubbard
Copy link
Contributor

This is a copy of #1967 that I created for the purpose of allowing both myself and @jordansissel to be able to write to this PR's branch.

@NicholasBHubbard
Copy link
Contributor Author

Hey @jordansissel, I updated the prototype to build SRPM's for Ruby gems along with CPAN distributions, and was able to to do so in a way that can be further expanded to all package types.

I added instance variables @build_procedure, @install_procedure, and @source_archive to the FPM::Package class. I then modified the FPM::Package#convert method to copy these instance variables from the source package object to the target package object. This means that these variables are now available when a package's output method is invoked. Now any kind of package can support being targeted to a source-based package as long as they set these new instance variables before the convert method is run.

The problem with this approach is that the build and install procedures need to be hard-coded, which means packages need to determine the one and only way to be built and installed. I believe though that we could add flags for the input package types that allow the user to override these variables so you can set the build and install procedures to arbitrary values. How does this sound to you?

Let me know what you think!

P.S.
I invited you as a contributor to my fork, which allows you to write to this branch.

@NicholasBHubbard
Copy link
Contributor Author

Hey @jordansissel, not sure why the new CI is failing, how can we can help?

@jordansissel
Copy link
Owner

Thanks for your continued work on this! I haven’t made time to review any updates this week.

As for CI, you can ignore it for now. It’s new and I haven’t even tried using that yet since a recent PR re-enabled it. I can turn it off if causes too much noise.

@wbraswell
Copy link
Contributor

Howdy @jordansissel , can you please provide feedback on this prototype design so @NicholasBHubbard and I can continue to move forward with development? Thanks in advance! :-)

@jordansissel
Copy link
Owner

jordansissel commented Dec 18, 2022 via email

@jordansissel
Copy link
Owner

The world has gifted my family with covid, so it’s gonna be a bit longer before I find time and energy to provide feedback.

@NicholasBHubbard
Copy link
Contributor Author

I added a prototype for creating source based Debian packages (SDEB's). @wbraswell and I would like to have both SRPM and SDEB support implemented before merging this PR. By requiring ourselves to implement multiple package types, we must focus on adding architecture into FPM that will allow us to implement various source based package types, instead of hardcoding support for just one type.

This SDEB prototype uses architecure similar to the SRPM prototype, where input packages communicate data to the output package via the FPM::Package#convert method. I wrote the prototype to use Debian's dh program for the
debian/rules file. Input packages that can be made into SDEB's can set a dh_args instance variable to control what arguments are passed to dh. Note that only the cpan and gem input types are supported.

I think we should first finish implementing SRPM's before moving on to SDEB's, but hopefully this SDEB prototype gives us a nice jumping off point in the future.

@wbraswell
Copy link
Contributor

Ping @jordansissel

@jordansissel
Copy link
Owner

Working on it -- I spent some time reading the code last night and will test it soon and get y'all feedback.

@wbraswell
Copy link
Contributor

Hi @jordansissel, how would you like to proceed?

@jordansissel
Copy link
Owner

jordansissel commented Mar 2, 2023 via email

@jordansissel
Copy link
Owner

I'm focusing this review on the "make it work" part of fpm's project philosphy which mean we can focus on things like code behavior and user interface, and I will try to avoid commentary on "make it right" aspects (test coverage, handling edge cases, etc) or "make it fast" (code maintenance hazards, execution speed, etc).

First: How much difference should be expected between rpms built by fpm as compared to ones produced by 'fpm -> srpm -> rpmbuild --rebuild' (or other SRPM build systems)?

Second: I haven't been successful yet using this PR. I'm testing it on a relatively fresh Fedora 37 container installing only packages I found are necessary for fpm/ruby/python/perl/etc as well as rpmbuild-related tools. I made some effort to get mock working, but there's some complications with mock and my container environment which prevent me from using mock tonight.

Third: Thinking towards "make it right", but no action required now, what should we use to test build behavior? mock? rpmbuild --rebuild? Others?

The following are package-specific things I found while using this PR:

Ruby gem

Probably missing BuildRequires: rubygems as, without this, rpmbuild errors about missing /usr/bin/gem

rpmbuild --rebuild also fails trying to use the source .gem file; this seems to be due to the cd rails-7.0.4.2 performed by %install automatically. Adding a cd ../ to the line before gem install .... can workaround this. It's possible my rpmbuild environment is atypical, but I've tried to use only a fresh system when testing this.

% rpmbuild --rebuild /tmp/rubygem-rails-7.0.4.2-1.src.rpm
....
+ export CXX
+ cd rails-7.0.4.2
+ gem install rails-7.0.4.2.gem
ERROR:  Could not find a valid gem 'rails-7.0.4.2.gem' (>= 0) in any repository
error: Bad exit status from /var/tmp/rpm-tmp.YcrOIk (%install)

Python

-s python seems to crash:

% bundle exec bin/fpm -s python --python-bin python3 -t srpm -p /tmp django`
...
/home/jls/projects/fpm/lib/fpm/package/srpm.rb:141:in `basename': no implicit conversion of nil into String (TypeError)
	from /home/jls/projects/fpm/lib/fpm/package/srpm.rb:141:in `converted_from'
	from /home/jls/projects/fpm/lib/fpm/package.rb:235:in `convert'

Perl cpan

-s cpan seems to require root to build as files are written to /usr/local --

% bundle exec bin/fpm -s cpan -t srpm -p /tmp Regexp::Common
Created package {:path=>"/tmp/perl-Regexp-Common-2017060201-1.src.rpm"}

% rpmbuild --rebuild /tmp/perl-Regexp-Common-2017060201-1.src.rpm
...
Manifying 32 pod documents
Manifying 1 pod document
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
ERROR: Can't create '/usr/local/share/man/man3'
Do not have write permissions on '/usr/local/share/man/man3'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 at -e line 1.
make: *** [Makefile:888: pure_site_install] Error 13
error: Bad exit status from /var/tmp/rpm-tmp.rGERpa (%install)

But even running rpmbuild as root, as an experiment, it still fails (for me, on a fresh-ish Fedora 37):

% sudo rpmbuild --rebuild /tmp/perl-Regexp-Common-2017060201-1.src.rpm

Processing files: perl-Regexp-Common-debugsource-2017060201-1.aarch64
error: Empty %files file /root/rpmbuild/BUILD/Regexp-Common-2017060201/debugsourcefiles.list

And if we bypass this by adding --nodebuginfo, rpmbuild succeeds, but has no files.

% sudo rpmbuild --nodebuginfo --rebuild /tmp/perl-Regexp-Common-2017060201-1.src.rpm
...
% sudo rpm -qlp /root/rpmbuild/RPMS/aarch64/perl-Regexp-Common-2017060201-1.aarch64.rpm
/

The architecture target for the package is aarch64 when it should be noarch for Regexp::Common (pure perl).

@NicholasBHubbard
Copy link
Contributor Author

Hey @jordansissel, thanks for the response!

The purpose of the SRPM and SDEB prototypes are to display a general architectural design for implementing source-based packages. It is important to @wbraswell and I that we come up with a design that will allow more source-based package types to be supported by FPM in the future.

The bugs you brought up are important and need to be addressed, however I would first like to get your feedback on the general design of the prototypes.

To implement (buildable) source-based packages, the input package needs to communicate its build instructions, install instructions, and source code to the output package.

The way I accomplished this is by adding the instance variables @source_archive, @build_procedure, and @install_procedure to the FPM::Package class. For the package input types I added support for (cpan and gem), from their input method I set these instance variables to relevant values. I then modified the FPM::Package#convert method to copy these new instance variables from the input package type to the output package type. These instance variables are then used by both FPM::Package::SRPM and FPM::Package::SDEB to create (currently buggy) buildable source packages.

What downsides do you see to this approach? Do you see any fundamental flaws? Is there a different way we could do this that would make it easier to add more source-based package types in the future?

@jordansissel
Copy link
Owner

@NicholasBHubbard Alright we can start with that direction (the internal architecture).

At a high level, adding "source" package support is a good time to rethink how fpm "input" operations work specifically to separate what often is three separate steps combined into the one "input" step: acquisition of the package itself, loading the package metadata, and performing the installation step. The 'gem.rb' does this very directly with the input method doing very little other than invoking those three steps.

Maybe: Split "install" steps away from the input method

Source packages may not benefit from the installation step as I would expect that source packages are expected to perform that installation step in a separate tool such as rpmbuild, mock, etc. For example, npm input will perform the installation step, as does gem and others, even though source packages do not benefit from this and it's kind of wasted effort. This waste may not matter for small packages with short installation times, but for complex packages that require long build times, such as rubygems' nokogiri which historically had very long install times due to compiling a long list of things.

Given this, I wonder if it's worth changing the internals to allow targets (like srpm) to skip the 'build/install' step.

Perhaps separating these things into two categories, like "preparation" and "installation", where preparation does the download and metadata loading, and installation does the build/install steps.

I wonder if the installation part could also have different results depending on the target package. For 'rpm' it would install the files to a staging environment (like it does today), but for 'srpm' it could present a set of installation instructions in shell, like what you have today as accessors/instance variables? Unfortunately this probably gets made more difficult by the wide variety of source package systems out there. Shell commands are valuable for RPM SPEC files and Debian package building, but are less useful or entirely unusable in build-from-source systems like Arch, Nix, and FreeBSD. Even Debian is a slight stretch given its heavy use of make and a fleet of Debian-specific utilities used in a typical Debian package.

If we do separate the "install" step outside of the input method, this will create some extra problems to be solved. I don't think they'll be too difficult, though. For example, the npm input queries metadata after installing.

What's common in source package systems?

fpm's original design focused on common aspects among packages -- name, version, summary, dependencies, a set of files, etc. If I think about this approach towards source packages, I find less things in common across package formats. RPM SPEC has an explicit %files section, but Debian and FreeBSD's source packages do not. In source packages, it feels like the simple "set of files" is replaced by domain-specific language and concepts that are unique to each source package system. I wonder how difficult it will be to maintain such a conversion scheme because of such wide differences in source package systems.

This feels like some complexity that we should isolate to a separate area of code and not in accessor attributes in each class. For example, instead of gem knowing what dh_args values are needed, perhaps the source deb output target should be responsible for that knowledge. FPM already does this in certain places where converted_from method allows a target package format to guide what transformations or other specific knowledge are needed in order to produce a good package. For example, rpm.rb's converted_from knows how to transform rubygem concepts into rpm concepts, or deb.rb's focusing debian-specific naming schemes for perl modules.

Following this, things like gem.rb's build_procedure, install_procedure, and dh_args would all move into their respective target package converted_from methods.

Thoughts?

@wbraswell
Copy link
Contributor

@jordansissel
Wow great feedback, thank you sir!
We are reviewing now and will respond soon. :-)

@NicholasBHubbard
Copy link
Contributor Author

Thanks for the ideas Jordan, they sound reasonable.

I am working on updating the prototypes to use these ideas, which hopefully I can submit for you to review in the next few days.

@NicholasBHubbard
Copy link
Contributor Author

NicholasBHubbard commented Aug 30, 2023

Hello @jordansissel and @wbraswell! I've recently found some time to work on this source-based package stuff again. I created a new PR #2024. This PR starts to implement the fundamental architectural change that we discussed with the build/install procedures being extracted out of the input() method. I created a separate PR for this as I am thinking it is a good intermediate step that we could merge before we add real support for source-based package types. This PR updates the internals of FPM::Package, FPM::Package::CPAN, and FPM::Package::Gem to separate out the build/install procedures out of the input() method. This PR breaks FPM for all input types other than cpan and gem.

@jordansissel suggested we separate input() into two methods called prepare() and install(). This is essentially what I did, however I did not rename input() to prepare(), and I called install() build_and_install(). For CPAN and Gem, I was able to just copy and paste the end of their input() methods into the new build_and_install() method. However I also had to make local variables used in this build/install part of the code into instance variables in order to transfer their values from input() to build_and_install(). Arbitrarily changing local variables into instance variables is (disgustingly) hacky ... it could turn into butchering the whole codebase. Im very open to suggestions on this.

I also added a method to the FPM::Package class called shell_code(), which should return
a two element hash of the form {:build_procedure => build_shell_code, :install_procedure => install_shell_code }. Input types that wish to be converted to source-based packages are required to implement this method. Then from the FPM::Package::convert() method, I added the following code before we transfer the instance vars from the input to output package object:

# source packages dont need to actually build the source, instead they need
# the shell code for the build and install procedures.
if pkg.source_pkg?()
  h = shell_code() # h is a two elem hash ...
  @build_procedure = h[:build_procedure]
  @install_procedure = h[:install_procedure]
  ivars += [:@build_procedure, :@install_procedure]
else # binary packages need to be built and installed to the staging_path
  build_and_install
end

So now we detect if we are converting to a source-based package and if so we transfer the shell-code instance variables @build_procedure and @install_procedure to the output package. Otherwise we just perform the build and installation.

I would appreciate your guys feedback on this. If you think that this is a good pattern then we can go ahead and start modifying all the other input package types to work this way. Unfortunately with this change the gem tests failed (the cpan ones still pass). I believe this is because the tests make assertions about system state after invoking the input()
method. It is likely other packages tests will fail as well, so we will have to update their tests as well.

By the way, Im not sure how we could support source-based packages that dont use shell-code. I think for now we should just focus on SRPM and SDEB to try to keep things moving forward.

@wbraswell
Copy link
Contributor

Nice work @NicholasBHubbard !
It all seems good at first glance, although I am not a Ruby programmer so we definitely need @jordansissel to review.
I agree that we should start with SRPM and SDEB only for now, then expand in the future as needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants