Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds bootstrapping wheels for the build tool #238

Merged
merged 9 commits into from
Apr 5, 2021
Merged

Conversation

kushaldas
Copy link
Contributor

@kushaldas kushaldas commented Mar 31, 2021

How to test?

Remember to put your OpenPGP public key in the pubkeys/ directory as maintainer_name.pub. Right now we have the release key, and @kushaldas's key there.

In future we can create another shell script for the whole bootstrapping process using the commands below.

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install pip-tools
pip-compile --allow-unsafe --generate-hashes --output-file=requirements.txt requirements.in
python3 -m pip install -r requirements.txt
./scripts/build-sync-wheels --cache ./bootstrap -p $PWD
BOOTSTRAP=true ./scripts/sync-sha256sums
gpg --armor --output bootstrap-sha256sums.txt.asc --detach-sig  bootstrap-sha256sums.txt
BOOTSTRAP=true ./scripts/verify-sha256sum-signature
# Now update the build-requirements.txt
PKG_DIR=$PWD BOOTSTRAP=true ./scripts/update-requirements
  • python3 -m pytest -vv tests/test_reproducible_wheels.py

What is happening in the test?

In the test we are actually installing our bootstrapped wheels into the virtualenv, and then using the existing scripts to build the wheels. You can see the same in the readme of the branch https://github.com/freedomofpress/securedrop-debian-packaging/tree/build_with_build#0-enable-the-virtualenv

Points to remember

  • idna project is getting extracted with different permissions than our previous wheels. Means different sha256sum than our existing wheel.
  • All extension based wheels will have a different sha256sums than our existing wheels.

@kushaldas
Copy link
Contributor Author

Here is the difference for idna project.

✦ ❯ diffoscope localwheels/idna-2.7-py2.py3-none-any.whl /tmp/dtmpdir/idna-2.7-py2.py3-none-any.whl 
--- localwheels/idna-2.7-py2.py3-none-any.whl
+++ /tmp/dtmpdir/idna-2.7-py2.py3-none-any.whl
├── zipinfo /dev/stdin
│ @@ -3,13 +3,13 @@
│  -rw-r--r--  2.0 unx     3299 b- defN 11-Jun-29 20:23 idna/codec.py
│  -rw-r--r--  2.0 unx      232 b- defN 11-Jun-29 20:23 idna/compat.py
│  -rw-r--r--  2.0 unx    11858 b- defN 11-Jun-29 20:23 idna/core.py
│  -rw-r--r--  2.0 unx    39285 b- defN 11-Jun-29 20:23 idna/idnadata.py
│  -rw-r--r--  2.0 unx     1749 b- defN 11-Jun-29 20:23 idna/intranges.py
│  -rw-r--r--  2.0 unx       21 b- defN 11-Jun-29 20:23 idna/package_data.py
│  -rw-r--r--  2.0 unx   197803 b- defN 11-Jun-29 20:23 idna/uts46data.py
│ --rwxr-xr-x  2.0 unx     3947 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/LICENSE.rst
│ +-rwx------  2.0 unx     3947 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/LICENSE.rst
│  -rw-r--r--  2.0 unx     8866 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/METADATA
│  -rw-r--r--  2.0 unx      110 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/WHEEL
│ --rwxr-xr-x  2.0 unx        5 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/top_level.txt
│ +-rwx------  2.0 unx        5 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/top_level.txt
│  ?rw-rw-r--  2.0 unx      945 b- defN 11-Jun-29 20:23 idna-2.7.dist-info/RECORD
│  13 files, 268178 bytes uncompressed, 56675 bytes compressed:  78.9%

@kushaldas
Copy link
Contributor Author

kushaldas commented Mar 31, 2021

Just now noticed that the reprotest-wheels CI step is running on Ubuntu, but all of our wheels are for Python3.7 on Debian Buster.
https://github.com/freedomofpress/securedrop-debian-packaging/blob/e960cd988f4687d49580b80fb12262958330a423/.circleci/config.yml#L248 We will have to figure out how to use the same Buster here.

I will push another time to update this. The current test failure is happening due to mismatch of the sha256sum of the bootstrap wheels, which are not matching to what I pushed from my local system. I will dig more.

https://circleci.com/api/v1.1/project/github/freedomofpress/securedrop-debian-packaging/7713/output/102/0?file=true&allocation-id=60647230441cc96504afc970-0-build%2F389C71BD

@kushaldas kushaldas force-pushed the build_with_build branch 3 times, most recently from a05c2a5 to 4e7da57 Compare March 31, 2021 13:52
@kushaldas
Copy link
Contributor Author

kushaldas commented Mar 31, 2021

Q: why is the sha256sum of the wheel from my local system (which I pushed to git-lfs) is not the same as in the git checkout in CI?

✦ ❯ sha256sum bootstrap/pytoml-0.1.21-py2.py3-none-any.whl 
97e4f6bd5d1d2a32f82d5c6ec9bb90a1b60db3af81b6427ee9a5949021d56d7b  bootstrap/pytoml-0.1.21-py2.py3-none-any.whl

securedrop-debian-packaging on  build_with_build [?] via 🐍 v3.7.3 (.venv) 
✦ ❯ ssh -p 54782 35.229.44.114
The authenticity of host '[35.229.44.114]:54782 ([35.229.44.114]:54782)' can't be established.
RSA key fingerprint is SHA256:MSF+7h9ohV1i7XtAsB0i6SH7JKK2Vh7exkcxI8lWTcE.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[35.229.44.114]:54782' (RSA) to the list of known hosts.
circleci@default-226c44bb-d766-4fc8-b7db-ca0c5d54efdb:~$ cd project/  
circleci@default-226c44bb-d766-4fc8-b7db-ca0c5d54efdb:~/project$ sha256sum bootstrap/pytoml-0.1.21-py2.py3-none-any.whl 
396f0941f31f1c032a94e3963499e2bb3b6b06591e93461fe24a45c711940f9e  bootstrap/pytoml-0.1.21-py2.py3-none-any.whl

Thanks to @zenmonkeykstop now I know the answer, it is git-lfs :)

@zenmonkeykstop
Copy link
Contributor

with git-lfs installed first, most dependencies match, except for Cython:

...
Processing ./bootstrap/zipp-3.4.1-py3-none-any.whl
Processing ./bootstrap/pip-21.0.1-py3-none-any.whl
Processing ./bootstrap/setuptools-54.0.0-py3-none-any.whl
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    cython==0.29.22 from file:///home/circleci/project/bootstrap/Cython-0.29.22.tar.gz (from -r build-requirements.txt (line 2)):
        Expected sha256 8c9eec7e9de2a30861ca347d0a149cc1482de12fc765fa06c414930e8ce20d0a
             Got        df6b83c7a6d1d967ea89a2903e4a931377634a297459652e4551734c48195406

But! I think this is legitimate, the expected hash is for the .whl and the returned hash is actually correct for the tarball on a local checkout.

@zenmonkeykstop zenmonkeykstop force-pushed the build_with_build branch 3 times, most recently from 9d30621 to e137924 Compare March 31, 2021 23:00
@conorsch
Copy link
Contributor

conorsch commented Apr 1, 2021

Took an initial spin through. Solid work here. My questions at present are mostly around maintanance tasks.

Why not store the new wheels for build and its dependencies inside the localwheels/ directory? We already have a directory for wheels we've built ourselves, but we're not using it in this PR. Conditional if/else logic has been added to most scripts, to support targeting either bootstrap/ or localwheels/, but it seems they'd work just as well unchanged, if the wheels were added to localwheels/.

What is build-requirements.txt used for? It was added in this PR, but I don't see it referenced in any of the test plans, or in the scripts. As of this PR, this repo now has three (3) Python requirements files. Coupled with the detached sig logic re-confirming the same SHA256 checksums as in the various component repos' build-requirements files, that's a lot of machinery to maintain. If we can remove build-requirements.txt, then we're back down to two (2), same as on the main branch.

Based on prior discussion, mostly in #218, the benefits of these changes appear to be:

  • we can ensure that the localwheels/*.whl files are built reproducibly, even when pip upgrades introduce breaking changes
  • we can potentially build the packages offline, given that even the build dependencies are included (does --no-isolate preclude use of offline builds?)
  • we can make CI faster, by installing the build deps from wheels, rather than building from source (as you point out in pip-21.0.1 is breaking reproducible wheels #218 (comment))

Are there more benefits you'd like to call the team's attention to?

Regarding the failing CI, I haven't actually taken a look at that part yet, although @zenmonkeykstop has some pretty solid leads above. Thanks for preparing, @kushaldas, looking forward to a deeper review.

@kushaldas
Copy link
Contributor Author

@conorsch

Why not store the new wheels for build and its dependencies inside the localwheels/ directory? We already have a directory for wheels we've built ourselves, but we're not using it in this PR. Conditional if/else logic has been added to most scripts, to support targeting either bootstrap/ or localwheels/, but it seems they'd work just as well unchanged, if the wheels were added to localwheels/.

Because I am signing the wheels for the bootstrap process, and I did not want to rebuild and resign the sha256sums.txt
file in the same PR. We can later move all wheels in the same place, but at that time, we should regenerate all the wheels, and sign the list. Also, I was not sure if there were any common wheel between localwheels and bootstrapping.
Keeping them separate makes things easier to create the process.

What is build-requirements.txt used for? It was added in this PR, but I don't see it referenced in any of the test plans, or in the scripts. As of this PR, this repo now has three (3) Python requirements files. Coupled with the detached sig logic re-confirming the same SHA256 checksums as in the various component repos' build-requirements files, that's a lot of machinery to maintain. If we can remove build-requirements.txt, then we're back down to two (2), same as on the main branch.

We are using the build-requirements.txt file in the tests. This file contains the sha256sums of all the wheels required for the bootstrapping process. We have it mentioned in the README too in the steps. From now on when
we generate/rebuild our wheels, we will have to install the build tool in a virtualenv using this build-requirements.txt file. Just like we are using the similar file in Debian packaging, instead here we are using it for bootstrapping.

Based on prior discussion, mostly in #218, the benefits of these changes appear to be:
we can ensure that the localwheels/*.whl files are built reproducibly, even when pip upgrades introduce breaking changes
we can potentially build the packages offline, given that even the build dependencies are included (does --no-isolate preclude use of offline builds?)

We need to use --no-isolate to make sure that the packages we are building get thier proper build dependencies. By default the tool isolates the build environment.

we can make CI faster, by installing the build deps from wheels, rather than building from source (as you point out in #218 (comment))
Are there more benefits you'd like to call the team's attention to?

As mentioned by upstream pip author, there will be no code execution during the installation from the wheels.

@zenmonkeykstop

with git-lfs installed first, most dependencies match, except for Cython:

...
Processing ./bootstrap/zipp-3.4.1-py3-none-any.whl
Processing ./bootstrap/pip-21.0.1-py3-none-any.whl
Processing ./bootstrap/setuptools-54.0.0-py3-none-any.whl
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    cython==0.29.22 from file:///home/circleci/project/bootstrap/Cython-0.29.22.tar.gz (from -r build-requirements.txt (line 2)):
        Expected sha256 8c9eec7e9de2a30861ca347d0a149cc1482de12fc765fa06c414930e8ce20d0a
             Got        df6b83c7a6d1d967ea89a2903e4a931377634a297459652e4551734c48195406

But! I think this is legitimate, the expected hash is for the .whl and the returned hash is actually correct for the tarball on a local checkout.

This was happening as we are using Ubuntu Focal with Python3.8 on CI instead of the Debian Buster with Python3.7. For now I added a Cython wheel for the Python3.8 (built via same steps on a Focal box). This unblocks us for the CI.

@conorsch conorsch changed the title Adds bootsrapping wheels for the build tool Adds bootstrapping wheels for the build tool Apr 1, 2021
kushaldas and others added 8 commits April 1, 2021 17:16
This now works.
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r build-requirements.txt
pytest -vvs tests/test_reproducible_wheels.py

https://github.com/pypa/build
Required for new-style wheel builds, using the "build" invocation.
The "vary" and "variations" options can be confusing. We want to specify
a blocklist for tests NOT to run, running all tests not explicitly
excluded.
This will help us to reuse our `build-sync-wheels` script
to update our own bootstrapping wheels in future.
```
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install pip-tools
pip-compile --allow-unsafe --generate-hashes --output-file=requirements.txt requirements.in
python3 -m pip install -r requirements.txt
./scripts/build-sync-wheels --cache ./bootstrap -p $PWD
BOOTSTRAP=true ./scripts/sync-sha256sums
gpg --armor --output bootstrap-sha256sums.txt.asc --detach-sig  bootstrap-sha256sums.txt
BOOTSTRAP=true ./scripts/verify-sha256sum-signature
PKG_DIR=$PWD BOOTSTRAP=true ./scripts/update-requirements
```

This PR also updates the CI steps for the reproducible wheels test.
We were previously installing git-lfs from a tarball, because Debian 9
didn't have it in the distro repos. Debian 10 does, so we'll just use
that.

Updates the lfs dirs to include the new "bootstrap" location.
Removes all use of Ubuntu VM images.
The initial installation of pip-tools we can compare hashes with what's
on PyPI. Thereafter, installation will be trusting the pinned reqs file
during initial bootstrapping, and the custom-built wheel going forward.
@conorsch
Copy link
Contributor

conorsch commented Apr 2, 2021

Made some small adjustments to the CI logic:

  • No more Ubuntu VMs, uses Debian Buster containers throughout
  • Removed py3.8 cython wheel, which was added to satisfy Ubuntu reqs
  • Added pip-tools to the bootstrapping reqs file, so tarball hash is verified
  • Removes tarball git-lfs installation, since Debian Buster has it in the repos

Please have a look, @kushaldas. If you agree we're in good shape, then please go ahead and merge. The next steps as I see them are:

  1. Rebuild all wheels for all deps. Based on local testing, this will only modify ~3 deps, those with .so files shipped in the wheels. Builds are not currently blocked, so this isn't critical, but we can follow up shortly after merging to keep things consistent.
  2. Consolidate the bootstrap/*.whl files into localwheels/*.whl. Let's wait and observe what the maintenance burden is with the setup as presented here—we may decide to keep it how it is.

Is there anything else you can point to, that we should handle shortly after merging?
2.

@conorsch conorsch self-requested a review April 2, 2021 01:20
Copy link
Contributor

@conorsch conorsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving. Requesting a visual spot-check from @kushaldas, since I pushed a few changes on top.

@kushaldas
Copy link
Contributor Author

@conorsch I followed the steps, and everything looks okay for the bootstrapping part of #218. Thank you for the review and the fixes :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants