Make example generation work when the model is torch.compile-d #78

IggShaman · 2024-08-26T14:02:10Z

Andrej mentions this issue a couple of times in the video. Apparently, once a model is compiled, its input and output tensor sizes are fixed. Training uses inputs of shape (B, T), currently configured as (64, 1024), while example generation uses input tensors of shapes (4, 8) ... (4, 32).
A simple workaround is to pad the example-generation tensors to the (B, T) shape and ignore the extra rows. The same workaround applies to the hellaswag eval.
Unfortunately, this increases the computation time linearly in both B and T, i.e. in proportion to B * T overall.
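
For illustration, here is a minimal sketch of the padding workaround described above. It is not the code from this PR: the helper names (`pad_to_compiled_shape`, `last_token_logits`, `padding_overhead`) and the `pad_id` value are hypothetical, and it assumes a causal decoder that returns logits of shape (B, T, vocab_size), so that right-padding does not influence earlier positions.

```python
import torch


def pad_to_compiled_shape(xgen: torch.Tensor, B: int, T: int, pad_id: int = 0):
    """Right-pad a (b, t) batch of token ids to the fixed (B, T) shape.

    Returns the padded tensor and the original (b, t), so the caller can
    slice the real rows and the last real position out of the model output.
    pad_id is an arbitrary filler (assumption); padded outputs are discarded.
    """
    b, t = xgen.shape
    if b > B or t > T:
        raise ValueError(f"input {(b, t)} does not fit into {(B, T)}")
    padded = torch.full((B, T), pad_id, dtype=xgen.dtype, device=xgen.device)
    padded[:b, :t] = xgen
    return padded, (b, t)


def last_token_logits(logits: torch.Tensor, b: int, t: int) -> torch.Tensor:
    """Pick the logits at the last real token of each of the b real rows.

    Assumes causal attention: position t - 1 only sees positions up to
    t - 1, so the padding to its right does not change its logits.
    """
    return logits[:b, t - 1, :]


def padding_overhead(b: int, t: int, B: int, T: int) -> float:
    """Ratio of padded to unpadded token count, a rough cost multiplier."""
    return (B * T) / (b * t)
```

In a generation loop this would be used roughly as `padded, (b, t) = pad_to_compiled_shape(xgen, B, T)`, then the model is called on `padded` and `last_token_logits(logits, b, t)` feeds the sampling step. With the shapes quoted above, `padding_overhead(4, 32, 64, 1024)` is 512, which shows how much extra work the padding costs per generation step.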

IggShaman mentioned this pull request Aug 26, 2024

torch.compile-d models do not work with example generation and hellaswag eval #79

Open

IggShaman force-pushed the igshevchenko/make_model_compile_compatible_with_example_generation branch from dc054e5 to b50b3da on August 26, 2024 14:03

Make example generation work when the model is torch.compile-d

ea398de

IggShaman force-pushed the igshevchenko/make_model_compile_compatible_with_example_generation branch from b50b3da to ea398de on August 26, 2024 14:10
