Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2.0: Vendor re2 and its dependencies and drop support for older Ruby versions #85

Merged
merged 54 commits into from
Sep 13, 2023

Conversation

mudge
Copy link
Owner

@mudge mudge commented Aug 11, 2023

I'm raising this so we can look at all of @stanhu's changes at once and get this merged into main, replacing the current version 1 of the gem.

Once we're happy with the changes, I propose we release a prerelease gem (e.g. 2.0.0.beta1) so we can smoke test with real users on various platforms and Ruby versions before pushing ahead to a full 2.0.0 release.

To do

  • Document native gem dependencies
  • Document opting out of the native gem
  • Update re2 to 2023-09-01

stanhu added 30 commits July 15, 2023 08:04
If `--disable-system-libraries` is specified, this change will
download and build abseil and re2. `cmake` and a C++17 compiler is
required for this to work.

This makes it possible to ensure all the required dependencies are
contained within the C extension rather than depend on system
libraries, which may break the extension when they are updated.

The building of these libraries uses mini_portile2 and techniques
borrowed from the nokogiri and ruby-magic gems.

Closes #61
mini_portile2 requires >= Ruby 2.3, so we need to drop support
for obsolete Ruby versions.
This increases the size of the gem from 32K to 2.4 MB, but this will
significantly reduce network transfer and avoid rate limits.
Running `rake gem:native` will attempt to cross-compile gems
for the following platforms:

-  aarch64-linux
-  arm64-darwin
-  x86_64-darwin
-  x86_64-linux

`rake gem:<platform>` also will compile one specific platform. For
 example: `rake gem:arm64-darwin`.
mini_portile2 provides `configure` script with the `--host` parameter,
and the script figures out the target compilers to use. With CMake,
we don't have that benefit so we need to set the C and C++ compilers
ourselves.

We use a simple heuristic to find the target compilers:

1. For macOS, we use `clang` and `clang++. Otherwise we use `gcc` and
   `gcc++`.
2. We search the PATH for `<host>-compiler`. Use that if we find it.
   Otherwise default to the base compilers.

To make CMake work with cross-compilation, a number of variables have
to be set:

- CMAKE_SYSTEM_PROCESSOR
- CMAKE_SYSTEM_NAME
- CMAKE_C_COMPILER
- CMAKE_CXX_COMPILER
Windows builds were previously failing because mini_portile2 adds an
unnecessary option to the `cmake` configure step.

This commit adds other targets:

-  arm-linux
-  arm64-darwin
-  x64-mingw-ucrt
-  x86-linux
-  x86-mingw32
This commit also adds an environment variable,
`RE2_USE_SYSTEM_LIBRARIES`, to enable system libraries.

Add some documentation about mini_portile2 workarounds and improve
CMake cross-compilation usage.
Older versions of Ruby have reached end-of-life. Shipping
precompiled gems means that we have to include libraries for all
supported versions.

Closes #68
This fixes an issue where the re2 headers could not be found when
building the gem locally. It appears `pkg_config` doesn't use
PKG_CONFIG_PATH unless the library has been registered with
`dir_config` in the first place.
add_vendored_libraries did not work properly.
This is needed to build the native gem.
This pulls in flavorjones/mini_portile#129.
When cross-compiling Windows targets with cmake in a Linux
environment, the MSYS generator may not be available. Supplying -G
MSYS will cause the build to fail.
This pulls in flavorjones/mini_portile#130.
We no longer need to write custom logic for finding the right
compilers and setting CMake variables for cross-compilation.
This ensures the generated gem has a unique version number, so define
`RE2::VERSION` in a separate file to make this possible.
Previously sence `rake spec` required the `compile` step,
`scripts/test-gem-install` failed because `extconf.rb` was
removed. Fix this by making the default task depend on the `compile`
task, but the `spec` task independent of the `compile` task.
This ensures that all precompiled gems are built properly and can be
run on their target operating systems.

Closes #69
This reduces the filename lengths since Windows can hit a limit of 250
characters (CMAKE_OBJECT_PATH_MAX).
These are required dependencies for abseil until a custom musl gem is
built:
#67 (comment)
On systems where pkgconf v1.9.3 is used for pkg-config, the output of
`pkg-config --static --libs` is incorrect, resulting in build failures
(pkgconf/pkgconf#268). This commit fixes
the issue by hard-coding the abseil libraries needed to link the
extension properly.
Rather than remove all files and conditionally guard against
version.rb load errors, just include the required files for testing
in the gem and test against the installed directory.
This avoids the visibility setting warnings in the linker
(https://github.com/mudge/re2/actions/runs/5631492710/job/15258506852)
and reduces the size of the binaries.
The script for installing the package did not actually execute
because `pwsh` was loaded instead of `bash`.
As of today, the rake-compiler-dock image for:

1. `mri-x64-mingw-ucrt` only supports Ruby 3.1 and 3.2.
2. `mri-x64-mingw32` only supports Ruby 2.4 - 3.0.
This speeds up building of all gems. rake-compiler by default will
attempt to launch N + 4 jobs, where N is the number of cores on the
system.
This makes it easy to build all gems locally by running
`scripts/build-gems`.
Due to pkgconf/pkgconf#268, we manually
worked around Windows link order issues by adding abseil libraries
manually. However, a few libraries were missed from that list, which
are added only in Windows builds:

* dbghelp
* advapi32

This commit adds these libraries to the right location, which fixes
`x86-mingw32` builds.

Note there is an existing issue with abseil including `-ldbghelp`
before `-labsl_symbolize`, which this commit corrects:
abseil/abseil-cpp#1497
@mudge
Copy link
Owner Author

mudge commented Sep 10, 2023

I’ll wait for the abseil workaround removal in #96 as I’m looking forward to that before publishing and announcing the new beta.

@stanhu
Copy link
Collaborator

stanhu commented Sep 10, 2023

@mudge I forgot that dropping the workaround requires upgrading abseil as well due to abseil/abseil-cpp#1497. 😄

@mudge
Copy link
Owner Author

mudge commented Sep 10, 2023

Darn! Oh well, let’s cut the next beta from this branch as-is then.

@stanhu
Copy link
Collaborator

stanhu commented Sep 10, 2023

@mudge Ok, do you want to handle it? I have to go now. https://github.com/mudge/re2/actions/runs/6137906858 has the gems.

@mudge
Copy link
Owner Author

mudge commented Sep 10, 2023

New beta2 gem now pushed to RubyGems.

@stanhu
Copy link
Collaborator

stanhu commented Sep 11, 2023

New beta2 gem now pushed to RubyGems.

Awesome, thanks! I have confirmed that all GitLab package builds, including arm64 and Raspberry Pi 32-bit builds from source, were successful with v2.0.0.beta2 (https://gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/-/pipelines/998940657).

@mudge
Copy link
Owner Author

mudge commented Sep 11, 2023

Is there anything else we'd like to get into this release, e.g. the extra tests from https://github.com/mudge/re2/milestone/1?

Otherwise, let's wait for any bug reports this week and get this merged?

@stanhu
Copy link
Collaborator

stanhu commented Sep 11, 2023

Is there anything else we'd like to get into this release

No, personally I'm happy where we are now. Thanks!

@stanhu
Copy link
Collaborator

stanhu commented Sep 12, 2023

I think this is ready to go! GitLab.com has been running with 2.0.0.beta2 in production for almost 24 hours with no issues.

@mudge
Copy link
Owner Author

mudge commented Sep 12, 2023

That’s great news, I’ll see about getting a release out and publicised tomorrow.

@stanhu
Copy link
Collaborator

stanhu commented Sep 12, 2023

In addition, I haven't heard any issues from any developers yet. Whenever Homebrew releases a new re2 update, I usually see a number of issues that usually resolves with bundle pristine re2. This is especially painful for UX designers who don't know anything about Ruby.

@mudge Thanks for creating this gem and being so responsive!

@mudge
Copy link
Owner Author

mudge commented Sep 12, 2023

The usability for most users is so much better with the native gems but figuring out exactly how to vendor re2, let alone precompile it for multiple platforms always seemed an insurmountable task to me.

As hopefully everyone can see from this diff, this has been a Herculean contribution from you and vastly improves the gem so thank you very much for your perseverance!

@mudge
Copy link
Owner Author

mudge commented Sep 13, 2023

Do you know how long it typically takes for users to upgrade their instances of GitLab? How long should we wait for any issues to come out of the woodwork before putting out a proper 2.0 release? I can see around 45,000 downloads (almost all x86_64-linux) from RubyGems so far.

@stanhu
Copy link
Collaborator

stanhu commented Sep 13, 2023

Do you know how long it typically takes for users to upgrade their instances of GitLab? How long should we wait for any issues to come out of the woodwork before putting out a proper 2.0 release? I can see around 45,000 downloads (almost all x86_64-linux) from RubyGems so far.

The next GitLab release will be on September 22, but I would just go ahead with a proper 2.0 release now since fundamentally little has changed in terms of functionality.

There are various ports such as FreeBSD that install from source, but @mfechner tested building v2.0.0beta2 with --enable-system-libraries and said it built fine: https://pkg.fechner.net/data/132amd64-gitlab/2023-09-13_14h19m36s/logs/rubygem-re2-2.0.0.beta2.log

stanhu and others added 11 commits September 13, 2023 15:58
Ruby's mkmf appears to list `RbConfig::CONFIG['exec_prefix']/lib` first
in the search path when building a C extension. If you have libre2
installed in that path, the linker will use that library instead of
the vendored library. This adds a test to ensure this case is covered.
Ruby's mkmf appears to list `RbConfig::CONFIG['exec_prefix']/lib`
first in the search path when building a C extension. If you have
libre2 installed in that path, the linker will use that library
instead of the vendored library. This causes an undefined symbol error
when using the extension.

```
/usr/local/bin/ruby: symbol lookup error: /usr/local/bundle/gems/re2-2.0.0.beta1/lib/re2.so: undefined symbol: _ZN3re23RE2C1ESt17basic_string_viewIcSt11char_traitsIcEERKNS0_7OptionsE
```

This commit does two things to ensure the final shared library is
compiled statically with the vendored libraries:

1. For -L<path> flags, ensure that any `ports` paths are prioritized just
   in case there are installed libraries that might take precedence.

2. For -l<lib> flags, convert the library to the static library with a
   full path and substitute the absolute static library.  For example,
  -lre2 maps to /path/to/ports/<arch>/libre2/<version>/lib/libre2.a.
The version constraint was present to handle older Ruby versions, but
now that Ruby 2.7 is required we can just require Rake 12.3.2+.
Note we're not also upgrading abseil to its latest version (20230802.0)
as it drops compatibility for macOS 10.13 and requires workarounds on
Windows (see abseil/abseil-cpp#1510).
To simplify the build pipeline, building gems will use whatever version
is specified in the gemspec (which, in turn, comes from
lib/re2/version.rb). Rather than dynamically changing this for certain
CI jobs, remove the task altogether even though this increases the risk
of downloading and publishing the "wrong" gem for maintainers.
Taking a [leaf from the SQLite3
gem](https://github.com/sparklemotion/sqlite3-ruby/blob/48b3e8dba618648598b777e73974d46b44091bad/INSTALLATION.md),
document how to opt out of the precompiled native gems by specifying the
Ruby platform when installing the gem.
As the default system Ruby on macOS Monterey is 2.6 and MiniPortile2
supports back to 2.3, restore support for that Ruby and compile the C
extension for it in the native gems.

The trade-off here is that we increase the size of the native gem in
order to support more Ruby versions. While 2.6 is technically EOL, the
rubygems.org stats still show usage of it in the past month:
https://ui.honeycomb.io/ruby-together/datasets/rubygems.org/result/HBSqTboW1yi

Note Ruby 2.6 triggers a register error with C++17 (it was fixed by
ruby/ruby@113bef6
in Ruby 2.7) so we disable that particular error during compilation.
@mudge mudge marked this pull request as ready for review September 13, 2023 16:47
@mudge mudge merged commit 7e499c2 into main Sep 13, 2023
@mudge mudge deleted the v2.0.x branch September 13, 2023 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants