ci: Add ability to array-ify args and run multiple jobs #3584

raunakab · 2024-12-16T21:41:48Z

Overview

Previously, the run-cluster workflow ran only a single ray-job-submission. This PR lets it run an arbitrary array of job submissions: the entrypoint_args input param now accepts an array, which the workflow splits into its individual argument strings, submitting one job for each.

Example Usage

gh workflow run run-cluster.yaml \
    --ref $current_branch \
    -f working_dir="." \
    -f daft_wheel_url="https://github-actions-artifacts-bucket.s3.us-west-2.amazonaws.com/builds/54428e3738e96764af60cfdd8a0e4a41717ec9f9/getdaft-0.3.0.dev0-cp38-abi3-manylinux_2_31_x86_64.whl" \
    -f entrypoint_script="benchmarking/tpcds/ray_entrypoint.py" \
    -f entrypoint_args="[\"--tpcds-gen-folder='gendata' --question='1'\", \"--tpcds-gen-folder='gendata' --question='2'\"]"

The above invocation runs TPC-DS queries 1 and 2.
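To illustrate the splitting step described in the overview, here is a minimal sketch of how an entrypoint_args value could be expanded into one command per job. It is not the code in job_runner.py: the function names `split_entrypoint_args` and `build_commands` are hypothetical, and the fallback that treats a non-array value as a single argument string is an assumption made so that the older single-job form keeps working.

```python
import json
import shlex


def split_entrypoint_args(raw: str) -> list[str]:
    """Return one argument string per job.

    Hypothetical helper: accepts either a JSON array of strings or a
    single plain argument string (assumed fallback for the old usage).
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Not JSON at all, so treat the whole value as one job's arguments.
        return [raw]
    if isinstance(parsed, list):
        if not all(isinstance(item, str) for item in parsed):
            # Raise an error rather than assert, so the check survives `python -O`.
            raise ValueError("every element of entrypoint_args must be a string")
        return parsed
    # Valid JSON but not an array (e.g. a bare number): keep the raw string.
    return [raw]


def build_commands(entrypoint_script: str, raw_args: str) -> list[list[str]]:
    """Build one argv list per job from the script path and the raw args."""
    return [
        ["python", entrypoint_script, *shlex.split(args)]
        for args in split_entrypoint_args(raw_args)
    ]


if __name__ == "__main__":
    raw = json.dumps(
        [
            "--tpcds-gen-folder='gendata' --question='1'",
            "--tpcds-gen-folder='gendata' --question='2'",
        ]
    )
    for command in build_commands("benchmarking/tpcds/ray_entrypoint.py", raw):
        print(command)
```

Using `shlex.split` means the inner single quotes from the example invocation are consumed the way a shell would consume them, so `--question='1'` reaches the entrypoint as `--question=1`.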

.github/ci-scripts/job_runner.py

raunakab · 2024-12-16T21:46:59Z

There are some hardcoded values in the runner script (which runs on the runner node, not the ray-head node).

However, we don't have a mechanism to pass values along to the runner script. I'm thinking of creating one named runner_scripts_args. Thoughts @jaychia?
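As a rough sketch of what such a mechanism might look like on the script side, the runner script could parse a single string of options with argparse. Everything here is an assumption for illustration: the thread does not say which values are hardcoded, so the option names `--timeout-s` and `--dashboard-address` and their defaults are hypothetical placeholders, and `parse_runner_args` is not a function in the repository.

```python
import argparse
import shlex


def parse_runner_args(raw: str) -> argparse.Namespace:
    """Parse a runner_scripts_args-style string into named values.

    All option names and defaults are hypothetical stand-ins for
    whatever is hardcoded in the runner script today.
    """
    parser = argparse.ArgumentParser(prog="job_runner")
    parser.add_argument("--timeout-s", type=int, default=3600)
    parser.add_argument("--dashboard-address", default="http://localhost:8265")
    # shlex.split handles quoting the same way a POSIX shell would.
    return parser.parse_args(shlex.split(raw))


if __name__ == "__main__":
    args = parse_runner_args("--timeout-s 600")
    print(args.timeout_s, args.dashboard_address)
```

An empty string falls back to the defaults, so existing invocations that omit the new input would behave as before under this design.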

codspeed-hq · 2024-12-16T22:03:23Z

CodSpeed Performance Report

Merging #3584 will degrade performances by 37.6%

Comparing ci/run-cluster (5b5a9f9) with main (e148248)

Summary

❌ 1 regressions
✅ 26 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

	Benchmark	`main`	`ci/run-cluster`	Change
❌	`test_iter_rows_first_row[100 Small Files]`	142.9 ms	229.1 ms	-37.6%
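The -37.6% figure may look small next to the jump from 142.9 ms to 229.1 ms. The two timings are consistent with a change computed as baseline time divided by new time, minus one, rather than as the increase in wall time. This is an inference from the numbers in the table, not a statement of CodSpeed's documented formula, and the function name below is hypothetical.

```python
def relative_change(baseline_ms: float, candidate_ms: float) -> float:
    """Speed-style change: baseline / candidate - 1 (negative means slower)."""
    return baseline_ms / candidate_ms - 1.0


main_ms = 142.9      # `main` column from the report
branch_ms = 229.1    # `ci/run-cluster` column from the report

change = relative_change(main_ms, branch_ms)
slowdown = branch_ms / main_ms - 1.0  # increase in wall time, for comparison

print(f"reported-style change: {change:.1%}")   # -37.6%
print(f"wall-time increase:    {slowdown:.1%}")  # 60.3%
```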

jaychia

Should not be generating TPC-DS data for every run-cluster command.

Not sure what's going on there. Also, you're generating this on the runner, but we need the data to be available in S3?

benchmarking/test.py

benchmarking/tpcds/ray_entrypoint.py

.github/workflows/run-cluster.yaml

.github/ci-scripts/job_runner.py

raunakab · 2024-12-16T22:29:07Z

Example of a successful run:
https://github.com/Eventual-Inc/Daft/actions/runs/12361707341

codecov · 2024-12-16T23:09:10Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.82%. Comparing base (6c21917) to head (ff89642).
Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3584      +/-   ##
==========================================
+ Coverage   77.79%   77.82%   +0.02%     
==========================================
  Files         716      716              
  Lines       87991    88243     +252     
==========================================
+ Hits        68455    68673     +218     
- Misses      19536    19570      +34

see 3 files with indirect coverage changes

.github/workflows/run-cluster.yaml

jaychia

Can we point to a successful job run as well?

.github/ci-scripts/job_runner.py

.github/workflows/run-cluster.yaml

raunakab · 2024-12-17T19:45:14Z

Example of a successful run (run on just 1 argument, --question=3 --scale-factor=100):
https://github.com/Eventual-Inc/Daft/actions/runs/12366475502

raunakab · 2024-12-17T20:12:27Z

Example of a successful run (run on multiple arguments, ["--question=1 --scale-factor=100", "--question=3 --scale-factor=100"]):
https://github.com/Eventual-Inc/Daft/actions/runs/12380369256

github-actions bot added the ci label Dec 16, 2024

Array-ify the run-cluster script

735b16c

raunakab force-pushed the ci/run-cluster branch from 019d354 to 735b16c December 16, 2024 21:43

raunakab requested a review from jaychia December 16, 2024 21:43

raunakab commented Dec 16, 2024

.github/ci-scripts/job_runner.py (outdated, resolved)

.github/ci-scripts/job_runner.py (outdated, resolved)

jaychia requested changes Dec 16, 2024

raunakab added 3 commits December 16, 2024 14:37

Remove test file

05c51ab

Change to raising errors instead of performing assertions

a744cf2

Remove debug prints

96ec531

raunakab added 6 commits December 16, 2024 15:37

Add catalog generation from s3 instead of from local

46ebcab

Remove data-gen

b3da1a3

Remove extra variable

e936e9f

Change up catalog registration

5c14913

Generate catalog off of s3 urls instead

b96aed2

Remove the removal of the daft dir

dbe5f8c

raunakab requested a review from jaychia December 17, 2024 00:36

raunakab added 2 commits December 16, 2024 16:42

Add scale-factor argument

54e2269

Add default scale-factor size

8ed2077

raunakab commented Dec 17, 2024

.github/workflows/run-cluster.yaml (outdated, resolved)

Remove duckdb dep

674ca8f

jaychia approved these changes Dec 17, 2024

.github/ci-scripts/job_runner.py (resolved)

.github/workflows/run-cluster.yaml (resolved)

Add inline metadata to job_runner script

debb5b8

raunakab added 3 commits December 17, 2024 11:45

Edit description

92f227c

Remove trailing slash

5b5a9f9

Merge branch 'main' into ci/run-cluster

ff89642

raunakab enabled auto-merge (squash) December 17, 2024 20:12

raunakab merged commit b7ea62b into main Dec 17, 2024
40 of 41 checks passed

raunakab deleted the ci/run-cluster branch December 17, 2024 20:16
