Ensure Consistency Between GPTConfig.block_size and Sequence Length T #72
First and foremost, I want to express my appreciation for this tutorial. It's incredibly insightful and well-structured.
I'm submitting this PR because I noticed a potential issue: `GPTConfig.block_size` is not enforced to match the sequence length `T`. If I understand correctly, this discrepancy could lead to unexpected model behavior during inference if `T` is lower than `GPTConfig.block_size`. (Note that an assertion error is already raised when `T` exceeds `GPTConfig.block_size`, as seen here.) Thank you for considering this change. Please let me know if any further adjustments are needed.