-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python] preserve None in _choose_param_value() #5289
Conversation
if main_param_name in params.keys(): | ||
for param in aliases: | ||
params.pop(param, None) | ||
return params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this the same as lines 414-417?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If main_param_name in params.keys():
is true in this block, it means "a value was provided via an alias". This needs to be here to clear out the other aliases.
That's different from lines 414-417, which is like "a value was provided via the main parameter name".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Maybe after not finding the main_param_name
we could do the following:
aliases_in_params = list(params.keys() & aliases)
if aliases_in_params:
# if any alias was found, set the value of the first one
params[main_param_name] = params[aliases_in_params[0]]
# remove remaining aliases
for alias in aliases_in_params[1:]:
params.pop(alias, None)
return params
I'm approving since this is just a suggestion. Thanks for the explanation!
if found_value is None and val is not None: | ||
found_value = val | ||
# if main param name was not found, search for an alias | ||
for param in aliases: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably for a different PR but shouldn't we try to match the behavior of the C++ side with respect to aliases and try to sort them using the same criteria than here?
LightGBM/include/LightGBM/config.h
Lines 1141 to 1144 in fc0c8fd
if (alias_set != tmp_map.end()) { // alias already set | |
// set priority by length & alphabetically to ensure reproducible behavior | |
if (alias_set->second.size() < pair.first.size() || | |
(alias_set->second.size() == pair.first.size() && alias_set->second < pair.first)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wow, I actually did not know the C++ side did that!
Yes probably would be a good idea to consider that in a future PR. I could see "a different alias got chosen because the order of resolution changed" being a really subtle reproducibility problem that costs users some time, but would want to weigh that against the performance burden of adding more sorting size-checking code to the parameter-resolving stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@StrikerRUS @jmoralez I just created #5304 to track the work of making alias-resolution consistent and reproducible.
Please add any other details there that you think I might have missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for refactoring this!
I'm +1 for consistent alias choosing method with cpp code.
Co-authored-by: Nikita Titov <[email protected]>
…M into fix/choose-param-value
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Addresses https://github.com/microsoft/LightGBM/pull/5105/files#r889683078.
_choose_param_value()
in the Python package is used to ensure that parameters with multiple possible values are resolved with the following precedence rules:params
params
As noted in the linked comment, that function currently does not preserve a literal
None
in some situations. This PR fixes that.