-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tests wheel reproducibility against version control #243
Conversation
CI is not passing: https://circleci.com/gh/freedomofpress/securedrop-debian-packaging/8074 likely due to #231. Marking as draft until resolved. |
yeah, #231 is acting sus, however, i can't repro. I get the same reproducible wheel on different machines with different build roots :/ |
We were already checking that building a wheel twice yields the same checksum both times, but we weren't confirming that the checksum matched what's already committed to version control. Let's do that. Still leaving the 'reprotest' check enabled because it supports multiple methods for checking reproducible, which should help us shake out problems like #231.
1842e85
to
9a0cee5
Compare
Will take a closer look at CI now that #246 is merged. At a glance, the CI run times are a good 10m longer or so now. Perhaps we should skip the old |
Now we are using the Debian provided Python to create the virtualenv. Means the wheels will also be created using the system python, and will have the exact same sha256sums. By default the CircleCI images are based on the docker Python images and they install python from building the source.
While trying to identify the differences between the wheels generated via the CI and our wheels,
If I look at the Dockerfile for Now that image shows that it installs Python executable build from source via the https://python.org But, all of our wheels are bulit via the system python provided by Debian Buster. I now added another commit, where we are using the Debian provided Python. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a glance, the CI run times are a good 10m longer or so now. Perhaps we should skip the old reprotest test for wheels and only use this new test, which effectively verifies reproducibility by comparing to vc.
Yes, I think removing the old reprowheels test is okay. We are verifying against the lfs
checked in version here in this new code.
@conorsch after you mark this PR as ready for review, we can do the formal review and merge this.
tests/test_reproducible_wheels.py
Outdated
repo_url = f"https://github.com/freedomofpress/{repo_name}" | ||
build_cmd = f"./scripts/build-sync-wheels -p {repo_url} --clobber".split() | ||
subprocess.check_call(build_cmd) | ||
check_cmd = "git diff --exit-code".split() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is simple.
The version-control comparison effectively verifies that our wheels are byte-for-byte reproducible, and also ensures that we're tracking 100% of dependencies in version-control already, since the
Ready for final review. See the most recent commits, @kushaldas, which disable the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good. Approved.
In future we may want to have a night repro test for the wheels.
subprocess.check_call("git diff --exit-code".split()) | ||
# Check for new, untracked files. Test will fail if working copy is dirty | ||
# in any way, so mostly useful in CI. | ||
assert subprocess.check_output("git status --porcelain".split()) == b"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should keep this command in mind. Thank you @conorsch.
We were already checking that building a wheel twice yields the same
checksum both times, but we weren't confirming that the checksum matched
what's already committed to version control. Let's do that.
Still leaving the 'reprotest' check enabled because it supports multiple
methods for checking reproducible, which should help us shake out
problems like #231.
Closes #241.
Testing
pytest -k test_wheel_builds_match_version_control tests
locally and confirm passingecho this-is-not-a-wheel > localwheels/MarkupSafe-1.1.1-cp37-cp37m-linux_x86_64.whl && gc -m 'TEMPORARY: broken wheel' localwheels/
, then re-run step 2, and confirm the test fails.Step 3 ensures that if a wheel build emits a different checksum from what's already version controlled, CI will fail on it.