[python-package] fix mypy errors about early stopping rounds #5795

Merged (2 commits) on Mar 30, 2023

jameslamb
Collaborator

Contributes to #3867.

Fixes the following mypy errors.

engine.py:214: error: Argument "stopping_rounds" to "early_stopping" has incompatible type "Optional[Any]"; expected "int"  [arg-type]
engine.py:705: error: Argument "stopping_rounds" to "early_stopping" has incompatible type "Optional[Any]"; expected "int"  [arg-type]

mypy is rightly complaining that there's no guarantee that params["early_stopping_round"] will contain an integer here:

if "early_stopping_round" in params:
    callbacks_set.add(
        callback.early_stopping(
            stopping_rounds=params["early_stopping_round"],

However, we do protect against that on the C++ side.

from sklearn.datasets import make_regression
import lightgbm as lgb

X, y = make_regression()
dtrain = lgb.Dataset(X, label=y)
bst = lgb.train(
    params={"early_stopping_rounds": "too-many"},
    train_set=dtrain,
)

# lightgbm.basic.LightGBMError: Parameter early_stopping_round should be of type int, got "too-many"

So this PR proposes:

  • silencing those errors with # type: ignore comments
  • adding a unit test to enforce that lightgbm continues raising that informative error
    • git grep 'should be of type' showed that no such test currently exists
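
As a rough illustration of the first bullet, the sketch below uses a local stand-in (`early_stopping` here is a placeholder, not the real `lightgbm.callback.early_stopping`) and assumes, purely for demonstration, that `params` is annotated as `Dict[str, Optional[int]]`. Without the comment, mypy would report an `arg-type` error on the call; with it, the error is suppressed. The comment has no effect at runtime.

```python
from typing import Dict, Optional


def early_stopping(stopping_rounds: int) -> str:
    # Hypothetical stand-in for the real callback factory; only the "int"
    # annotation on stopping_rounds matters for this illustration.
    return f"early stopping after {stopping_rounds} rounds"


def build_callback(params: Dict[str, Optional[int]]) -> str:
    # params["early_stopping_round"] has type Optional[int], so mypy flags the
    # call below as arg-type unless the error is explicitly ignored.
    return early_stopping(
        stopping_rounds=params["early_stopping_round"],  # type: ignore[arg-type]
    )


print(build_callback({"early_stopping_round": 5}))
```

Scoping the comment to a single error code (`arg-type`) keeps other kinds of errors on the same line visible to mypy.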

We could, alternatively, address these errors by doing something like the following on the Python side:

from typing import Any, Dict


def _integer_value_from_params(name: str, params: Dict[str, Any]) -> int:
    """
    Extract an integer value from ``params``, or raise an error if it can't be cast to integer.

    This method intentionally does not try to resolve parameter aliases with e.g.
    ``_choose_param_value()``, to avoid unnecessary duplicate processing.
    """
    val = params[name]
    if isinstance(val, (int, float)):
        return int(val)
    else:
        # report the parameter name, not the whole params dict
        raise ValueError(f"Could not cast value '{val}' for parameter '{name}' to integer.")

# ...
    stopping_rounds = _integer_value_from_params("early_stopping_round", params)

But I think it's preferable to do that validation only on the C++ side, so that all LightGBM wrappers benefit from it and to limit duplication of logic between the Python side and C++ side.
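
To make that trade-off a bit more concrete, here is a small sketch of how the Python-side helper proposed above would treat a few inputs. The helper is restated so the snippet runs on its own. `describe()` and the sample values are assumptions made for this illustration; they are not taken from LightGBM.

```python
from typing import Any, Dict


def _integer_value_from_params(name: str, params: Dict[str, Any]) -> int:
    # Restated from the proposal above so this snippet is self-contained.
    val = params[name]
    if isinstance(val, (int, float)):
        return int(val)
    raise ValueError(f"Could not cast value '{val}' for parameter '{name}' to integer.")


def describe(value: Any) -> str:
    # Hypothetical helper for this illustration: return the result or the error text.
    name = "early_stopping_round"
    try:
        return repr(_integer_value_from_params(name, {name: value}))
    except ValueError as err:
        return f"ValueError: {err}"


print(describe(10))          # an int passes through unchanged
print(describe(2.7))         # a float is truncated by int(), not rejected
print(describe(True))        # bool is a subclass of int, so it is accepted
print(describe("too-many"))  # a non-numeric string raises ValueError
```

The truncated float and the accepted bool are examples of choices a Python-side check would need to make deliberately. Every such choice is one more piece of logic that has to be kept consistent with the C++ validation.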

@jameslamb jameslamb merged commit 9fce6b8 into master Mar 30, 2023
@jameslamb jameslamb deleted the ci/mypy-param-types branch March 30, 2023 02:48
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023