Parameter routing #50
It'd be nice if the build function could look something like this:

```python
def build_keras_model(hidden_dim=10, activation="sigmoid"):
    ...
    return model  # uncompiled model; compile happens inside BaseWrapper?

est = KerasClassifier(
    model=build_keras_model,
    model__hidden_dim=20,
    model__activation="relu",
    optimizer="sgd",
    optimizer__lr=0.01,
    optimizer__momentum=0.9,
    loss="categorical_crossentropy",
    batch_size=256,
    validation_frac=0.2,
)
```

This almost exactly mirrors the Skorch API.
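As a sketch of how a wrapper might consume that double-underscore namespace, the routing could be as simple as a prefix split. The helper name `split_by_prefix` is hypothetical, not actual SciKeras or Skorch API:

```python
def split_by_prefix(kwargs, prefixes):
    """Group `prefix__name` kwargs by destination, stripping the prefix.

    Hypothetical helper illustrating the interface above; the real
    routing logic in SciKeras/Skorch may differ.
    """
    routed = {prefix: {} for prefix in prefixes}
    rest = {}
    for key, value in kwargs.items():
        for prefix in prefixes:
            if key.startswith(prefix + "__"):
                # e.g. `model__hidden_dim` -> routed["model"]["hidden_dim"]
                routed[prefix][key[len(prefix) + 2:]] = value
                break
        else:
            rest[key] = value  # plain params like `loss` or `batch_size`
    return routed, rest
```

For example, `split_by_prefix({"model__hidden_dim": 20, "optimizer__lr": 0.01, "loss": "mse"}, ["model", "optimizer"])` separates the routed groups from the plain `loss` parameter.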
I agree that […]. To be clear (if I understood their docs correctly), […]
I think it would be nice to have that flexibility. But I also want to support the existing interface (i.e. compiling inside […]).
Copied from #47:

A quite common pattern in scikit-learn estimators is to parse/modify a parameter inside `fit`:

```python
def fit(X, y, ...):
    self._random_state = check_random_state(self.random_state)
```

where downstream the estimator will only use the validated `self._random_state`. I think it would be nice to allow users to do something like:

```python
class MyClassifier(KerasClassifier):
    def hook_params(self):  # in the base class this would just be a pass
        if self.batch_size is None:
            self.batch_size_ = 32
        else:
            self.batch_size_ = self.batch_size
```

But the trick will then be routing that parameter so that it reaches the build function:

```python
def build_fn(batch_size=32):
    print(f"batch_size={batch_size}")
    ...

class MyClassifier(KerasClassifier):
    def hook_params(self):  # in the base class this would just be a pass
        if self.batch_size is None:
            self.batch_size_ = 32
        else:
            self.batch_size_ = self.batch_size
        self.batch_size__ = 64

estimator = MyClassifier()
estimator.fit(X, y)  # always prints `batch_size=64`
```
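If it helps, the override precedence implied by that example (`batch_size__` forced value, then the validated `batch_size_`, then the build function's default) could be sketched with a hypothetical helper. The attribute-naming convention is taken from the comment above, not from SciKeras itself:

```python
def resolve_param(wrapper, name, default):
    # Hypothetical precedence: `name__` (forced override) beats `name_`
    # (validated in hook_params), which beats the build function's default.
    forced = getattr(wrapper, name + "__", None)
    if forced is not None:
        return forced
    validated = getattr(wrapper, name + "_", None)
    if validated is not None:
        return validated
    return default
```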
👍 that sounds like a good solution. I expect SciKeras's API to break; it's still a pretty young library.
Is there a way to check if a Keras model is compiled? I think it should be separate from this PR too.
That's what it looks like. It looks like […]
Just to wrap up this discussion: if there isn't a way to check this directly, I'm sure we can build something to do that check pretty easily. I think all that […]
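One heuristic for that check, assuming tf.keras semantics where `Model.optimizer` stays `None` until `compile()` is called, might be:

```python
def is_compiled(model):
    # Heuristic sketch: tf.keras sets `model.optimizer` during compile(),
    # so an uncompiled model should report None here. Not an official API.
    return getattr(model, "optimizer", None) is not None
```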
With all of the huge improvements within the last couple of weeks, I think we are in a place to tackle parameter routing. I would like to make this the next priority for this library.
These 7 parameters are accepted by `BaseWrapper` but not used: […]

Which of these parameters can be set after model compilation? It appears […]. It might make more sense to have […]. I'd still like to see the API in #50 (comment).
Of those, the following correspond to […]
I think the only issue with what you are proposing is that it will not be backwards compatible. I think what we can do is have a separate function that calls […]
Can models be compiled twice? As a user, I would expect that changing BaseWrapper's parameters influences the behavior. I think there should be a deprecation period. At the very least, there should be a warning if an uncompiled model is returned. |
But how do we distinguish modified parameters from defaults? The user could set `loss="mse"` in `build_fn` and we would overwrite that with `loss=None` from `__init__`.

I am also against _only_ allowing compilation by the wrappers since it raises many other issues, like being able to pass a loss function that was calculated in `build_fn` to `_compile_model` (or whatever it gets called).
The only use case that my proposal of "compile only if uncompiled" does not cover is where the user is using a […]
I think I'd rather have one clear and obvious way to use the library: "There should be one-- and preferably only one --obvious way to do it" (PEP 20). I'm not seeing the need to support both returning compiled and uncompiled models (at least not after any deprecation period; see below).

**Returning an uncompiled model**

I think basic usage would be covered, certainly for my needs and customization of […]. The advanced usage in #50 (comment) would require inheritance and overwriting a `compile` method:

```python
class CustomLoss(KerasClassifier):
    def compile(self):
        loss = custom_fn(**self.meta_, **self.get_params())
        return model.compile(loss=loss, optimizer=self.optimizer_, callbacks=self.callbacks)
```

I'd show at least one example with a use case in mind.

**Returning a compiled model**

In this case, all compile parameters are passed to […]. I think most of the documentation should mention this example:

```python
def build_fn(n_hidden=10, loss="mse", **kwargs):
    model = ...
    model.compile(loss=loss, **kwargs)
    return model
```

I'd prominently show passing […].

I prefer the first method, returning an uncompiled model. It'd easily work without subclassing for the simple use cases, and it better fits with the object-oriented approach. For advanced use cases, it allows accessing other attributes of […].

I think a deprecation period should include the "compile if uncompiled" behavior as mentioned, with a warning raised if the model is compiled.
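The deprecation behavior described above ("compile if uncompiled", warn otherwise) could be sketched roughly like this; the helper name and the `model.optimizer is None` check are assumptions, not SciKeras internals:

```python
import warnings

def ensure_compiled(model, compile_kwargs):
    # Sketch of the proposed transition behavior: compile the model with the
    # wrapper's parameters only if build_fn returned it uncompiled; otherwise
    # warn that returning a compiled model is deprecated.
    if getattr(model, "optimizer", None) is None:
        model.compile(**compile_kwargs)
    else:
        warnings.warn(
            "build_fn returned a compiled model; returning an uncompiled "
            "model and letting the wrapper compile it is preferred",
            DeprecationWarning,
        )
    return model
```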
Thank you for the detailed overview! I think that I need to mull over this for a bit, but I would like to share that my use case is a bit different. My original goal for even trying to use […]. I hope the use case helps you understand where my hesitation comes from.
I'm pretty ambivalent on the choices in #50 (comment), and can see the argument for returning a compiled model. When an uncompiled model is returned and the user wants to override […]. The complete interface I'm envisioning is this:

```python
def model_build_fn(n_hidden=10, meta=None, params=None, optimizer="sgd", **kwargs):
    model = ...
    model.compile(optimizer=optimizer, **kwargs)
    return model
```

I think […]
Thank you for the concrete proposal @stsievert. I like the idea of passing dictionaries instead of parameters directly, but I'm hesitant to give special treatment to […]. Just to confirm, in your example […]?

How about an interface like this?

```python
class BaseWrapper(BaseEstimator):
    def __init__(
        self,
        build_fn=None,
        *,
        random_state=None,
        optimizer="rmsprop",
        ...,
        **kwargs
    ):
        self.optimizer = optimizer
        ...
        vars(self).update(**kwargs)

    def route_params(self, destination_prefix: str, params: Dict[str, Any]):
        # here use `destination_prefix` to override `optimizer` with
        # `compile_optimizer` if the destination_prefix is `compile_`,
        # and apply any other parameter routing/overriding logic we want
        ...

def model_build_fn(meta=None, params=None):
    print("build got", params["optimizer"])
    print("build got", params.get("compile_optimizer", None))
    print("build got", params["buildparam"])
    print("build got", meta["n_features_in_"])
    ...
    return model

estimator = KerasRegressor(
    model_build_fn,
    compile__optimizer="sgd",   # override the optimizer param for `compile`
    __optimizer="adam",         # override the global optimizer param by prepending __
    build__buildparam="test",   # route a new param to `build_fn`
)
estimator.fit(...)
# prints "build got adam"
# prints "build got None"
# prints "build got test"
# prints "build got 2" (or whatever estimator.n_features_in_ is)
estimator.model_.optimizer == "sgd"  # True
```

This way we also don't have to introspect into […]. If the user tries to access a parameter that they did not pass to the wrapper's initializer (ex: they forgot to put in the […]).
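If I read the example right, the precedence for any one destination would be `compile__optimizer` > `__optimizer` > `optimizer`. A hypothetical lookup helper (not the proposed `route_params` itself) could express that as:

```python
def resolve_for_destination(params, destination, name):
    # Precedence sketched from the example above:
    # `<destination>__name` beats the global `__name` override,
    # which beats the plain `name` parameter.
    for key in (destination + "__" + name, "__" + name, name):
        if key in params:
            return params[key]
    return None
```

With `params = {"optimizer": "rmsprop", "__optimizer": "adam", "compile__optimizer": "sgd"}`, the `compile` destination resolves to `"sgd"` while `build` resolves to `"adam"`, matching the prints in the example.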
Then let's fix that. Here's a revised interface with some documentation:

```python
def model_build_fn(n_hidden=10, meta=None, params=None, compile=None):
    """
    n_hidden : int
        User-defined parameter for the number of hidden units.
    meta : dict
        Describes the number of input features, outputs, classifier type, etc.
    params : dict
        The parameters of `BaseWrapper`, also available through `BaseWrapper.get_params`.
    compile : dict
        Valid parameters for `model.compile`. This means `model.compile(**compile)` can be
        used. This dictionary includes keys for `loss` and `optimizer`, which are
        instantiated Keras loss functions and optimizers only if `optimizer__foo` is
        passed (e.g., `optimizer__momentum` for `optimizer="sgd"`). Otherwise, the
        `loss` and `optimizer` values are strings.
    """
    model = build_model(n_hidden)
    model.compile(**compile)
    return model
```

I think this interface is simple and easy to use, and it also supports advanced usage: […]

I think I'd be okay reverting #22 to make the documentation simpler.
This does look better! Would this replace #66, or would it be in addition to it?

Do you mean that you would be okay reverting #22 so that there is only one way to declare tunable parameters for […]? And just because you brought up […]
Maybe "compile if uncompiled" is the best choice; it allows the user the most flexibility and provides a good default option (returning an uncompiled model and relying on BaseWrapper's parameters). It has pretty clear documentation.
No. I think tunable parameters should still be able to be used without passing them to the initializer. But that might require some customization of […]:

```python
from skorch import NeuralNetClassifier
import torch
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification

class Mod(torch.nn.Module):
    def __init__(self, features=80, hidden=3):
        super().__init__()
        self.l1 = torch.nn.Linear(features, hidden)
        self.l2 = torch.nn.Linear(hidden, 1)

    def forward(self, x):
        return torch.sign(self.l2(self.l1(x)))

model = NeuralNetClassifier(Mod, max_epochs=1, criterion=torch.nn.MSELoss)
# `hidden` or `module__hidden` is not a model parameter
assert all("hidden" not in param for param in model.get_params())
params = {"module__hidden": [3, 4, 5]}
search = GridSearchCV(model, params)
X, y = make_classification(n_features=80)
X, y = X.astype("float32"), y.astype("float32")
search.fit(X, y)
```
Can you help me understand what use case is not satisfied by having to declare the parameters in the initializer? I know we discussed this back in #22 and #18, so I am sorry if this feels like going backwards. My understanding is that the issue (#18) came from a combination of convenience and lack of documentation. Is the interface in #67 more inconvenient? I think #67 is certainly easier to document and debug.

For your example above, #67 would look like:

```python
from scikeras import KerasClassifier
from tensorflow import keras
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification

def build_fn(meta_params, build_params, compile_params):
    features = meta_params["features"]
    hidden = build_params["hidden"]
    ...

model = KerasClassifier(build_fn, max_epochs=1, hidden=(100,))
# `hidden` is a model parameter but `build__hidden` is not
params = {"build__hidden": [3, 4, 5]}
search = GridSearchCV(model, params)
X, y = make_classification(n_features=80)
X, y = X.astype("float32"), y.astype("float32")
search.fit(X, y)
```

Is this the "disconnect" you mentioned above? All of this said, it would not be impossible to mix #67 with introspection into the arguments of […]
I'll walk through one use case. Let's say I want to fine-tune a SciKeras model that a colleague has sent me. They relied on defaults for their activation function, so they didn't set […]:

```python
import pickle

from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification

with open("email-attachment.pkl", "rb") as f:
    model = pickle.load(f)

params = {"model__activation": ["relu", "prelu", "leaky_relu", "relu6", "selu", "celu"]}
search = GridSearchCV(model, params)
X, y = make_classification(n_features=80)
X, y = X.astype("float32"), y.astype("float32")
search.fit(X, y)  # will fail because `model__activation` is not a valid parameter
```

Skorch can handle this case.
Yes, I understand that; that is for the "routed" parameters. I agree that it is a good use case and we can work around it. It does not require introspection because we can determine the destination from the prefix […]. What I was more specifically asking about was the requirement to be able to do:

```python
def build_fn(meta_params, build_params, compile_params, hidden=(100,)):
    features = meta_params["features"]
    ...

model = KerasClassifier(build_fn, max_epochs=1)
```

vs.

```python
def build_fn(meta_params, build_params, compile_params):
    features = meta_params["features"]
    hidden = build_params["hidden"]
    ...

model = KerasClassifier(build_fn, max_epochs=1, hidden=(100,))
```
👍

I think the introspection in #22 should be removed. I do not think it's necessary, and it is also confusing for more advanced usage like […].

I think you're asking about forcing the signature of […]
Yes, exactly. I would like to force that signature and require users to declare all non-routed defaults to […]
I don't see why forcing the user to conform to a particular signature is necessary, or how it makes SciKeras easier to use. I think the model-building function should be able to be used independently outside of SciKeras. I think […]

I would support forcing a particular signature only if inheritance/subclassing were used. But then it'd be possible to access all of […].

But I think most of this is beside the point; what I'd want is for something like this to work:

```python
def build_model(hidden=10, activation="relu"):
    print(hidden, activation)
    ...
    return model

m1 = SciKeras(model=build_model, model__hidden=20)
X, y = ...
m1.fit(X, y)  # prints "20 relu"
m2 = m1.set_params(model__activation="prelu")
m2.fit(X, y)  # prints "20 prelu"
```

This would work with something like this implementation:

```python
def fit(self, X, y):
    model_kwargs = {k[len("model__"):]: v for k, v in self.get_params().items() if k.startswith("model__")}
    compile_kwargs = {k[len("compile__"):]: v for k, v in self.get_params().items() if k.startswith("compile__")}
    meta_kwargs = {k[len("meta__"):]: v for k, v in self.get_params().items() if k.startswith("meta__")}
    # only send compile/meta kwargs if the build_fn signature accepts them
    model_ = build_fn(compile=compile_kwargs, meta=meta_kwargs, **model_kwargs)
    ...
```
I see. So instead of forcing […]
Exactly. If the user typos a parameter name, Python will raise an exception: `TypeError: build_model() got an unexpected keyword argument ...`
I think we should introspect. That's what Dask does: for example, `map_blocks` will send a […]
This sounds reasonable. We would introspect/check for two specific parameters, which I think is easy to understand if anyone has to debug. I'll implement this in #67. |
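For what it's worth, that kind of signature check can be done with the standard library's `inspect` module. This is a generic sketch with a hypothetical helper name, not the actual #67 implementation:

```python
import inspect

def filter_supported_kwargs(func, kwargs):
    # Only keep kwargs that `func`'s signature can accept, so optional
    # parameters like `meta` or `compile` are passed only when declared.
    sig = inspect.signature(func)
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in sig.parameters.values()):
        return dict(kwargs)  # a **kwargs catch-all accepts everything
    return {k: v for k, v in kwargs.items() if k in sig.parameters}
```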
Implemented in #67. No problems. I'm marking that issue as closing this one (if/when it gets merged). |
As per #37 (comment), it might be nice to have some parameter routing and/or renaming for `build_fn`. I'd support this interface: […]

This mirrors the interface that Skorch has. They support overriding some keywords like `optimizer__lr` with `lr`, or `iterator_valid__batch_size` and `iterator_train__batch_size` with `batch_size`: https://skorch.readthedocs.io/en/stable/user/neuralnet.html#batch-size