Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: update structure of wheel storage #3591

Closed
wants to merge 2 commits into from
Closed

ci: update structure of wheel storage #3591

wants to merge 2 commits into from

Conversation

raunakab
Copy link
Contributor

Overview

Previously, wheels were stored in AWS S3 as: s3://bucket/builds/$COMMIT_HASH/$WHEEL_NAME. The wheel name would be some name following the PEP 427 standard. You can have multiple wheels stored inside of /builds/$COMMIT_HASH.

Having multiple wheels stored inside of the /builds/$COMMIT_HASH meant that you would need to employ some additional logic to figure out:

  • the wheel name for the platform on which you want to run on
  • which version of python you want to run with
  • which ABI you want
  • etc..

I.e., I'm running on Ubuntu 22.04 with glibc-v2.34, on an x86 machine. Therefore, I need to find a wheel inside of /builds/$COMMIT_HASH that has x86 and manylinux_2_34 in it.). This logic can be cumbersome.

This PR proposes a new structure: s3://bucket-name/builds/$COMMIT_HASH/$GHA_RUN_ID_$GHA_RUN_ATTEMPT/$WHEEL_NAME. Inside of /builds/$COMMIT_HASH/$GHA_RUN_ID_$GHU_RUN_ATTEMPT, there will always be at most 1 wheel. Using this new structure, you will not need to know the wheel-name; if you know the $COMMIT_HASH, the $GHA_RUN_ID and the $GHA_RUN_ATTEMPT, you will always be able to locate the wheel (without having to perform any wheel-name logic).

Note

This is a non-breaking change. Other workflows only grab the URL of a wheel, they don't observe the structure. Therefore, updating the structure should not affect the other workflows.

@raunakab raunakab requested a review from jaychia December 17, 2024 20:10
@github-actions github-actions bot added the ci label Dec 17, 2024
@raunakab raunakab marked this pull request as ready for review December 17, 2024 20:13
@raunakab
Copy link
Contributor Author

@jaychia This is just a proposition. The new structure may be a bit too "centered" around GitHub Actions.

Maybe the old structure is nice enough?

Copy link

codspeed-hq bot commented Dec 17, 2024

CodSpeed Performance Report

Merging #3591 will degrade performances by 34.62%

Comparing ci/build-commit (52345aa) with main (47f5897)

Summary

⚡ 1 improvements
❌ 1 regressions
✅ 25 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main ci/build-commit Change
test_iter_rows_first_row[100 Small Files] 215 ms 159.9 ms +34.42%
test_show[100 Small Files] 15.5 ms 23.7 ms -34.62%

Copy link
Contributor

@jaychia jaychia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me, I'm guessing this is a prerequisite to doing the if wheel !exists { build_wheel() } else { skip() } logic from run-cluster?

I feel like there should just be a canonical way to represent the wheel names that we're missing though. I.e. we shouldn't need to invent our own naming convention

@raunakab
Copy link
Contributor Author

Closing because not necessary right now.

@raunakab raunakab closed this Jan 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants