
Rectify Missing Dataloader Preparation Call in PaddingFree Plugin Method #63

Merged

Conversation

@achew010 achew010 commented Aug 6, 2024

Description

This PR fixes a missing dataloader preparation call in the PaddingFree Plugin that caused the number of optimization steps in distributed experiments to be computed incorrectly.
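
For context, here is a minimal sketch of the issue using Hugging Face Accelerate (illustrative only, not the plugin's actual code; the dataset and collator below are placeholders). Until the custom dataloader is passed through `accelerator.prepare()`, its length reflects the full unsharded dataset, so the per-device optimization step count is over-estimated in multi-GPU runs.

```python
import math
from accelerate import Accelerator
from torch.utils.data import DataLoader

# Placeholder dataset and collator, standing in for the real
# tokenized dataset and the padding-free collator.
train_dataset = list(range(20_000))

def collate_fn(batch):
    return batch

accelerator = Accelerator()
dataloader = DataLoader(train_dataset, batch_size=4, collate_fn=collate_fn)

# Without this call, len(dataloader) counts batches over the whole
# dataset; with it, Accelerate shards the dataloader across processes,
# so the length (and hence the computed number of optimization steps)
# is per device.
dataloader = accelerator.prepare(dataloader)

grad_accum = 2
steps_per_epoch = math.ceil(len(dataloader) / grad_accum)
print(f"optimization steps per epoch (this device): {steps_per_epoch}")
```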

With this fix, the relative performance improvement of padding-free over padded data processing approximately matches across both implementations:

  • FMS-Acceleration PaddingFree plugin: 24%
  • Natively supported padding-free in Transformers: 26%

ORCA Math 20K Subset - 1 Epoch

Transformers

| Model | DataProcess | Grad Accum | Per Device Batch Size | Num Devices | Time | Throughput (tokens/s/device) |
|---|---|---|---|---|---|---|
| Mistral-7B | Padded | 2 | 4 | 8 | 812 | 1216 |
| Mistral-7B | Transformers Native PaddingFree | 2 | 4 | 8 | 596 | 1658 |

FMS-Acceleration Padding-Free Plugin

| Model | DataProcess | Grad Accum | Per Device Batch Size | Num Devices | Time | Throughput (tokens/s/device) |
|---|---|---|---|---|---|---|
| Mistral-7B | Padded | 2 | 4 | 8 | 818 | 1208 |
| Mistral-7B | PaddingFree Plugin | 2 | 4 | 8 | 621 | 1568 |

@achew010 achew010 requested a review from fabianlim as a code owner August 6, 2024 09:26
@achew010 achew010 force-pushed the fix-padding-free-collator branch from 9d1cb39 to 8a3e8bf on August 6, 2024 09:33
fabianlim commented Aug 6, 2024

What's the difference between "Padded-Longest" and "Transformers Padded"? If there is no difference, please edit the description to make it consistent.

@fabianlim fabianlim merged commit b159600 into foundation-model-stack:main Aug 6, 2024
6 checks passed