Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tensor parallelism related mappings #15360
Add tensor parallelism related mappings #15360
Changes from all commits
ec54d81
e79ff07
0d50c47
04089dc
7353d3e
bec6a84
dc7e74f
6086772
320a6e0
ded2fd7
5e9b0ef
c671435
cb724a9
844d27c
fb4af90
60ba049
b6ea797
958003b
2f3230d
b9e9ace
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a docstring here explaining what this class does.
Also, the name is unclear to someone who is not familiar with all the model parallelism jargon. Let's expand to
TensorParallelismInfo
as we usually use descriptive names in Transformers.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fits in one line (our char limit is 119).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark as above about naming. Let's go for the full
Column
here, there is no point sparing three characters :-)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here
TensorParallelismMapping
. We also need a docstring to explain what this does and how to expand it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second test should not be necessary. We don't have any
PretrainedModel
without the capital T (except for bart, but the class without the capital is deprecated and we have a class with the capital).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sgugger Can you let me know which version this was applied from? The users of the lower transformers version still need this in the OSLO.
To add a little more, the transformers will also have this mapping, but the OSLO will have it internally as well. This is because this mapping class does not exist for users of lower versions of the transformers. Users with lower versions will use the mapping inside OSLO and users with higher versions will use the mapping from transformers. So this check is not required in transformers, but still required in OSLO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We prefer going for a test and raising an error in Transformers:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for correction! I think
if cls is None
is more correct. I'll apply your suggestion.