-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor param_dict to config #1008
Conversation
deepspeed/__init__.py
Outdated
@@ -58,11 +58,12 @@ def initialize(args=None, | |||
mpu=None, | |||
dist_init_required=None, | |||
collate_fn=None, | |||
config_params=None): | |||
deepspeed_config=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I think we have had a bit of a miscommunication in #983
This would be a breaking change.
I think it'd be OK for transformers
since we constantly raise the minimal requirement for deepspeed, but is this the case for other DeepSpeed users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a feeling, lots of concepts were discussed :)
If we like the change but mind BW compat, I can re-add the config_params
parameter, setting it to deepspeed_config
(optionally I'll print a warning message, if we like). cc @tjruwase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feeling is that they haven't used deepspeed_
prefix in the first place, because it's all deepspeed and thus doesn't require a self-reference ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha that's a great point! maybe config_params
should be kept as is then? Just internally we've renamed?
I'd prefer just config
if I had to choose though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, config params repeats itself twice, but it was pointing to the specific version of the config.
I agree config
would be the cleanest, if it is multi-functional - i.e. any format.
Well, since the deepspeed team trusts us to propose to mess with their API, I hope it won't be overstepping if you made it config
and left config_params
as a back-compat deprecated arg name.
But personally I'm totally fine with config_params
since it's only one place where it's used.
Internally there is no back-compat issue I guess, so if it makes the code more intuitive then it's probably a good choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so with that said, config
will be the new name, and config_params
will remain as BW compat, let's see if @tjruwase is cool with it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks guys for driving this conversation. Yes, we have avoided using deepspeed_
because of the self-reference. I like the proposal of config
for new name and config_params
for BW compat. I will also ping the rest of the folks here for awareness of the changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also a fan of config
for the new name!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me as well :)
Thanks everyone for pitching in :) should be good to go, let me know if there is anything that needs addressing! |
@@ -122,6 +126,7 @@ def initialize(args=None, | |||
mpu=mpu, | |||
dist_init_required=dist_init_required, | |||
collate_fn=collate_fn, | |||
config=config, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do DeepSpeedEngine
and PipelineEngine
still need config_params
though? Are those public APIs or internal? If they are internal then perhaps it's enough to alias config_params
to config
in initialize
and adjust those 2 classes to receive just config
?
PipelineEngine
doesn't need to be changed I see, as it uses super-class to get the config, so just DeepSpeedEngine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was contemplating this, if preferred I can move the logic to do this into the initialize
function instead of doing it internally! Wanted to keep initialize
as clean as possible, but didn't know which would be preferred
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, this is perfectly fine then. This is good. Thank you!
As discussed in #983, allows
deepspeed_config
to be a file path or a dictionary when passed into theInit
function orinitialize
function. This will be a breaking change for people usingconfig_params
however should allow cleaner parsing in the future of all the parameters.cc @stas00 @tjruwase