Nano: light weight hpo support in nano tensorflow #3712
nano pytorch hpo support will be described in another Issue. |
Implementation Notes

The Objective to optimize:

```python
class Objective(object):
    def __init__(self, keras_model, model_creator, model_compile, **fit_kwargs):
        # make a copy of the original model so that the next trial can start fresh
        ...

    def __call__(self, trial):
        # the objective function for each trial
        if self.keras_model is None:
            self.keras_model = self.model_creator(trial)
        if self.model_compile is None:
            self.keras_model.compile(...)
        else:
            self.model_compile(trial, self.keras_model)
        new_fit_args = ...   # replace hp args with trial.suggest_xxx
        target_metric = ...  # validate the metric settings (e.g. use the first metric if more than one is specified)
        hist = self.keras_model.fit(**new_fit_args)
        score = max(hist.history[target_metric])
        return score
```

Changes in bigdl.nano.tf.keras.Sequential and bigdl.nano.tf.keras.Model:

```python
class bigdl.nano.tf.keras.Sequential:
    ...

    def tune(self, **fit_kwargs):  # plus tuning args such as n_trials, direction, ...
        # determine direction based on common metrics supported
        objective = Objective(keras_model, model_creator, model_compile, **fit_kwargs)
        self.study = optuna.create_study(direction=direction)
        self.study.optimize(objective, n_trials=100)
        trial = self.study.best_trial
        print_trial(trial)

    def fit(self, use_tune=False, ...):
        if use_tune:
            # fit with the params of the given trial_id
            ...
        else:
            # original fit routine
            ...
```
|
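The trial loop sketched in these notes can be made concrete with a minimal, framework-free example. `Trial`, `Objective`, and `optimize` mirror the Optuna-style names above, but this is a plain-Python sketch: the stub trial and the toy score function (which stands in for training a model and reading `max(hist.history[target_metric])`) are invented for illustration.

```python
import random

class Trial:
    """Stub of an Optuna-style trial: records the hparams it suggests."""
    def __init__(self, rng):
        self.rng = rng
        self.params = {}

    def suggest_float(self, name, low, high):
        self.params[name] = self.rng.uniform(low, high)
        return self.params[name]

class Objective:
    """Callable objective, as in the notes: one fresh evaluation per trial."""
    def __init__(self, score_fn):
        self.score_fn = score_fn

    def __call__(self, trial):
        lr = trial.suggest_float("learning_rate", 1e-4, 1e-1)
        # stands in for fitting the model and taking the best metric value
        return self.score_fn(lr)

def optimize(objective, n_trials=20, seed=0):
    """Tiny random-search driver standing in for study.optimize()."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        trial = Trial(rng)
        score = objective(trial)
        if best is None or score > best[0]:  # direction="maximize"
            best = (score, trial.params)
    return best

# Toy score: peaks when learning_rate is near 0.01.
best_score, best_params = optimize(Objective(lambda lr: -abs(lr - 0.01)))
```

The real implementation would delegate the loop to `optuna.create_study(...).optimize(...)`; the point here is only the shape of the `Objective.__call__(trial)` contract.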
Implementation Notes 2
|
@jason-dai revised design |
Is it possible to support the following API?

```python
model = nano.keras.Sequential()
    .add(nano.keras.layers.Dense(…))
    .add(…)
    .add(nano.keras.layers.Softmax(…))
model.compile()
model.tune()
model.fit()
```

```python
input = nano.keras.layers.Input(…)
dense = nano.keras.layers.Dense(…)
…
output = nano.keras.layers.Softmax(…)
model = nano.keras.Model(input, output, …)
model.compile()
model.tune()
model.fit()
```

```python
@nano.automl
class MyCifarResNet(CIFARResNetV1):
    def __init__(self, nstage1, nstage2):
        nstage3 = 9 - nstage1 - nstage2
        layers = [nstage1, nstage2, nstage3]
        channels = [16, 16, 32, 64]
        super().__init__(CIFARBasicBlockV1, layers=layers, channels=channels)

model = MyCifarResNet(nstage1=space.Int(2, 4), nstage2=space.Int(2, 4))
model.compile()
model.tune()
model.fit()
```
|
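One possible shape for a `@nano.automl`-style class decorator like the one in the last snippet, sketched in plain Python. All names here (`Space`, `automl`, `instantiate`) are hypothetical stand-ins, not the actual bigdl.nano API; the sketch only shows the defer-construction idea, where the real class is not built until a trial supplies concrete values.

```python
class Space:
    """Marker for a searchable hyperparameter, e.g. Space(2, 4)."""
    def __init__(self, low, high):
        self.low, self.high = low, high

def automl(cls):
    """Class decorator: defer construction when any ctor arg is a Space."""
    class Wrapper:
        def __init__(self, *args, **kwargs):
            self._cls, self._args, self._kwargs = cls, args, kwargs

        def _has_space(self):
            vals = list(self._args) + list(self._kwargs.values())
            return any(isinstance(v, Space) for v in vals)

        def instantiate(self, chosen):
            """Build the real object with sampled values (one trial)."""
            kwargs = {k: chosen.get(k, v) for k, v in self._kwargs.items()}
            return self._cls(*self._args, **kwargs)
    return Wrapper

@automl
class MyModel:
    def __init__(self, nstage1, nstage2):
        self.layers = [nstage1, nstage2, 9 - nstage1 - nstage2]

# Construction is recorded, not executed, until a trial picks values.
m = MyModel(nstage1=Space(2, 4), nstage2=Space(2, 4))
real = m.instantiate({"nstage1": 2, "nstage2": 3})
```

This also illustrates the eager-execution concern raised below: the decorated class no longer produces a real model at construction time.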
I think we can support this API.
|
Use |
Updated the above design according to your comments. In addition, we need to consider the case where people don't want to use AutoML at all. Enabling AutoML will automatically decorate all the nano.tf.keras layers and optimizers; there might be some overhead (construction of AutoObjects, graph traversing) or potential issues (e.g. delayed construction of the keras model inside AutoModel breaks eager execution, etc.).
The first option looks more natural to me. |
Maybe we can check the parameters when the user constructs the layer/model; if no search space is specified, we may directly create a Keras layer/model? |
Short answer: yes, we can, though the code may not be very clean, and there are still some extra operations to detect whether a search space is used in the arguments. Detailed answer:

```python
# Option 1: inheritance
class nano.tf.keras.Model(tf.keras.Model):
    def __init__(self, ...):
        super().__init__(...)
```

```python
# Option 2: composition
class nano.tf.keras.Model(object):
    def __init__(self, ...):
        self._internal_m: tf.keras.Model = ...

    def fit(self, ...):
        ...
        return self._internal_m.fit(...)

    def compile(self, ...):
        ...
```

In case the user doesn't use automl at all, inheritance seems the proper option. But for automl, composition is more convenient and less error prone (we should assume the automl Model's behavior is not exactly the same as the original keras model; for example, a user should be able to inspect the model right after Model.compile or build before fitting, while an automodel cannot be inspected before fit). A global option for disabling automl could switch between the two implementations easily. Alternatively, if we don't want such an option, we can use a mixture of the two, i.e. super() keeps the same behavior as keras, and the internal model is used for automl if a search space is found. |
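The "check the arguments, fall back to a plain layer" idea suggested above can be sketched as follows. All names (`SearchSpace`, `PlainDense`, `AutoDense`) are illustrative stand-ins rather than the actual bigdl.nano API, and `PlainDense` stands in for `tf.keras.layers.Dense`.

```python
class SearchSpace:
    """Marker for a searchable argument."""
    def __init__(self, *candidates):
        self.candidates = candidates

class PlainDense:
    """Stand-in for tf.keras.layers.Dense: constructed eagerly."""
    def __init__(self, units):
        self.units = units

class AutoDense:
    """Lazy proxy: construction is delayed until a trial picks values."""
    def __init__(self, units):
        self.units_space = units

def Dense(units):
    """Factory: fall back to the plain layer when no search space is used."""
    if isinstance(units, SearchSpace):
        return AutoDense(units)
    return PlainDense(units)
```

With this factory, a user who never passes a search space pays no AutoML overhead and keeps normal eager behavior, at the cost of one `isinstance` check per constructor call.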
Looks like keras also provides a tuner: https://www.tensorflow.org/tutorials/keras/keras_tuner |
The implementation is merged, so closing this. It may be reopened later if new features are added. |
Overview
Refer to the hpo design for nano pytorch in issue #3925
Enable hyper param tuning in nano tensorflow. Make it as transparent as possible to users. General principles include:
API Design
model.search

- `search` does not return or save any tuned model; it just collects the statistics of all the trials, so `fit` is still needed after `search`. (Note: maybe we can provide the option to save checkpoints so that `fit` just gets the best checkpoint w/o training.)
- `search` automatically searches training-related hparams, e.g., batch size, learning rate, etc. The search space is by default inferred and adapted from the input and environment. It can also be explicitly specified in arguments, as the example shown below.
- `search` can be called several times in order to resume the tuning. Call `end_search` to specify which trial's parameters will be used for the following `fit`. If the trial id is not specified in `fit`, the best trial is selected by default. If `end_search` is not explicitly called, it will be called in `fit`.
- A train/val split is used in `search`, while in `fit` the user can use the full dataset (train+val used in search), or use a larger dataset, or more epochs.

model.search_summary

- `search_summary` returns a data frame containing the hparams used in each trial and the value of the target metric (and possibly speed or other attributes). It can serve as some sort of a leaderboard.

Usage
Basic usage w/o search space configurations: just add one line of code, `search`, before `fit`, without modification to the original code. `learning rate` and `batch size` will be automatically searched. If the search space is not explicitly set, it will be inferred automatically.
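To make the search/end_search/fit contract described above concrete, here is a toy, self-contained mock of the protocol. It is not the real bigdl.nano API; the class name, the single `lr` hparam, and the scoring function are invented purely to show the flow (resumable `search`, best-trial selection in `end_search`, and the implicit `end_search` inside `fit`).

```python
import random

class TunableModel:
    """Mock of the search -> end_search -> fit protocol (illustrative only)."""
    def __init__(self):
        self.trials = []          # (trial_id, hparams, score)
        self.best = None

    def search(self, n_trials=5, seed=0):
        # resumable: each call appends new trials to the existing statistics
        rng = random.Random(seed + len(self.trials))
        for _ in range(n_trials):
            hp = {"lr": rng.uniform(1e-4, 1e-1)}
            score = -abs(hp["lr"] - 0.01)  # toy target metric
            self.trials.append((len(self.trials), hp, score))

    def end_search(self, trial_id=None):
        # default: pick the best trial if no trial id is given
        if trial_id is None:
            trial_id = max(self.trials, key=lambda t: t[2])[0]
        self.best = self.trials[trial_id][1]

    def fit(self):
        # end_search is called implicitly if the user skipped it
        if self.best is None:
            self.end_search()
        return self.best          # real code would train with these hparams

m = TunableModel()
m.search(n_trials=3)
m.search(n_trials=3)  # resume the tuning
hp = m.fit()
```

A `search_summary` in this mock would simply tabulate `m.trials`, matching the leaderboard description above.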
Advanced Usage:
Case 1: Use Sequential to define a Searchable Model
Case 2: Use Functional API to define a Searchable Model
Case 3: Reuse a pre-defined model with a customized model
Additional Notes
@jason-dai