Add Acceleration Patcher and MultiPack Plugin #67

Merged: 21 commits, Aug 23, 2024

fabianlim (Contributor) commented Aug 15, 2024

NOTE: this PR is to be merged after #66.

This PR adds the AccelerationPatcher and the multipack plugin.

  • Introduces the AccelerationPatcher into the framework to patch the acceleration object.
  • Currently the AccelerationPatcher only has a replace rule, but in the future it could support other things, such as performing customized distributed loss reductions.
  • Currently the multipack plugin has to be used together with the padding-free plugin, because the non-padding-free use case is not yet supported.
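
To make the "replace rule" idea concrete, here is a minimal sketch of how such a patcher could work. This is not the code added in this PR: `ToyPatcher`, `ReplaceRule`, `ToyAccelerator`, and the `collate_fn` attribute are all hypothetical names chosen for illustration. The only assumption is that a replace rule swaps one component of the target object for a new one built from the old one.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ReplaceRule:
    """Hypothetical rule: swap one named attribute of the target object."""
    attr: str
    builder: Callable[[Any], Any]  # receives the old value, returns the new one


@dataclass
class ToyPatcher:
    """Hypothetical stand-in for a patcher; not the framework's real class."""
    rules: Dict[str, ReplaceRule] = field(default_factory=dict)

    def replace(self, attr: str, builder: Callable[[Any], Any]) -> None:
        # Allow at most one replace rule per attribute, so two plugins
        # cannot silently overwrite each other's replacement.
        if attr in self.rules:
            raise ValueError(f"a replace rule for '{attr}' already exists")
        self.rules[attr] = ReplaceRule(attr, builder)

    def patch(self, target: Any) -> Any:
        # Apply every registered rule to the target object in place.
        for rule in self.rules.values():
            old = getattr(target, rule.attr)
            setattr(target, rule.attr, rule.builder(old))
        return target


class ToyAccelerator:
    """Hypothetical object to be patched; it only holds a collate function."""
    def __init__(self) -> None:
        self.collate_fn = lambda batch: list(batch)


patcher = ToyPatcher()
# Replace the collate function with one that orders samples by length.
patcher.replace("collate_fn", lambda old: (lambda batch: sorted(old(batch), key=len)))
acc = patcher.patch(ToyAccelerator())
```

Keeping the rules in a registry and applying them in one `patch` call is one possible design; it leaves room for additional rule types later, such as the customized loss reductions mentioned above.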

Benchmarks

Mistral7B on Flan Subset (6000 samples)

NOTE: scenarios-flan.yaml is not published, as it links to an internally prepared Flan dataset.

Run bash scripts/run_benchmarks.sh "2 4 8" "16 32 64" benchmark_outputs scenarios-flan.yaml "padding-free"

  • Verified the token length variance is higher in random sampled batches vs multipack batches
  • PF + Multipack has up to 34% improvement in training runtime compared to PF alone

| Framework Config | Num Devices | Per Device Batch Size | Throughput (toks/sec) | % Throughput Improvement |
| --- | --- | --- | --- | --- |
| aadp-padding-free | 2 | 8 | 2101 | base |
| aadp-padding-free + multipack | 2 | 8 | 2470 | 17.5 |
| aadp-padding-free | 4 | 8 | 1868 | base |
| aadp-padding-free + multipack | 4 | 8 | 2498 | 33.7 |
| aadp-padding-free | 8 | 8 | 1706 | base |
| aadp-padding-free + multipack | 8 | 8 | 2509 | 47 |
| aadp-padding-free | 2 | 16 | 2533 | base |
| aadp-padding-free + multipack | 2 | 16 | 2802 | 10.6 |
| aadp-padding-free | 4 | 16 | 2304 | base |
| aadp-padding-free + multipack | 4 | 16 | 2810 | 21.9 |
| aadp-padding-free | 8 | 16 | 2010 | base |
| aadp-padding-free + multipack | 8 | 16 | 2782 | 38.4 |
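
The token-length variance observation above can be illustrated with a toy packing example. This is only a sketch: it uses first-fit-decreasing bin packing as a stand-in, and the PR description does not say which packing algorithm the multipack plugin actually uses. The function names, the sample lengths, and the capacity of 10 tokens are all made up for illustration.

```python
from statistics import pvariance
from typing import List


def pack_first_fit_decreasing(lengths: List[int], capacity: int) -> List[List[int]]:
    """Group sample indices into bins whose total token count is <= capacity."""
    # Visit samples from longest to shortest.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    bins: List[List[int]] = []
    totals: List[int] = []
    for i in order:
        if lengths[i] > capacity:
            raise ValueError(f"sample {i} exceeds the capacity of {capacity} tokens")
        # Put the sample into the first bin that still has room for it.
        for b, total in enumerate(totals):
            if total + lengths[i] <= capacity:
                bins[b].append(i)
                totals[b] += lengths[i]
                break
        else:
            # No existing bin fits, so open a new one.
            bins.append([i])
            totals.append(lengths[i])
    return bins


def tokens_per_batch(lengths: List[int], batches: List[List[int]]) -> List[int]:
    """Total number of tokens in each batch."""
    return [sum(lengths[i] for i in batch) for batch in batches]


lengths = [9, 8, 7, 6, 4, 3, 2, 1]          # made-up token lengths
naive = [[0, 1], [2, 3], [4, 5], [6, 7]]    # fixed batches of two samples, in order
packed = pack_first_fit_decreasing(lengths, capacity=10)

naive_var = pvariance(tokens_per_batch(lengths, naive))
packed_var = pvariance(tokens_per_batch(lengths, packed))
```

In this toy case every packed batch holds exactly 10 tokens, while the in-order batches range from 3 to 17 tokens. More even token counts per batch mean that devices finish their steps at closer to the same time, which is one plausible reason the gain in the table grows with the number of devices.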

Mistral7B on OrcaMath Subset (8000 samples)
Run bash scripts/run_benchmarks.sh "2 4 8" "16 32 64 128" benchmark_outputs scenarios-orca.yaml "padding-free"

| Framework Config | Num Devices | Per Device Batch Size | Throughput (toks/sec) | % Throughput Improvement |
| --- | --- | --- | --- | --- |
| aadp-padding-free | 2 | 8 | 2199 | base |
| aadp-padding-free + multipack | 2 | 8 | 2294 | 4.3 |
| aadp-padding-free | 4 | 8 | 2102 | base |
| aadp-padding-free + multipack | 4 | 8 | 2329 | 10.7 |
| aadp-padding-free | 8 | 8 | 1991 | base |
| aadp-padding-free + multipack | 8 | 8 | 2294 | 15.2 |
| aadp-padding-free | 2 | 16 | 2546 | base |
| aadp-padding-free + multipack | 2 | 16 | 2701 | 6 |
| aadp-padding-free | 4 | 16 | 2472 | base |
| aadp-padding-free + multipack | 4 | 16 | 2698 | 9 |
| aadp-padding-free | 8 | 16 | 2433 | base |
| aadp-padding-free + multipack | 8 | 16 | 2726 | 12 |
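
For reference, the percentage column in both benchmark tables is consistent with the relative throughput gain over the padding-free baseline at the same device count and batch size. The sketch below recomputes it from the published throughputs. The helper name `throughput_gain_pct` is made up here, and the recomputed values match the tables only to within a fraction of a percentage point, presumably because the published throughputs are rounded.

```python
def throughput_gain_pct(base: float, new: float) -> float:
    """Relative throughput gain of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


# (baseline toks/sec, multipack toks/sec, reported %) taken from the tables above.
flan_rows = [(2101, 2470, 17.5), (1868, 2498, 33.7), (1706, 2509, 47.0)]
orca_rows = [(2199, 2294, 4.3), (2102, 2329, 10.7), (1991, 2294, 15.2)]

for base, new, reported in flan_rows + orca_rows:
    gain = throughput_gain_pct(base, new)
    print(f"{base} -> {new}: recomputed {gain:.2f}%, reported {reported}%")
```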

fabianlim (Contributor, Author) commented

@achew010 we have merged #66 now, so I will rebase this PR.

fabianlim and others added 20 commits August 23, 2024 16:21
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Signed-off-by: 1000850000 user <[email protected]>
Co-authored-by: Yu Chin Fabian Lim <[email protected]>
fabianlim (Contributor, Author) commented Aug 23, 2024

We have integrated the changes from PR #71.

Merging, although the orca dataset is a bit small for the tests.

fabianlim merged commit 4224c66 into main on Aug 23, 2024
6 checks passed
fabianlim deleted the accel-patcher branch on Aug 23, 2024, 08:45