[API DESIGN REVIEW] Keras Input Tensor API #7102
Correct me if my summary is wrong, but it looks like the TL;DR is:

Existing option:

# tf yield ops that supply dataset images and labels
x_train_batch, y_train_batch = read_and_decode_recordinput(...)

# create a basic cnn
x_train_input = Input(tensor=x_train_batch)
x_train_out = cnn_layers(x_train_input)

model = Model(inputs=x_train_input, outputs=x_train_out)
loss = keras.losses.categorical_crossentropy(y_train_batch, x_train_out)
model.add_loss(loss)
model.compile(optimizer='rmsprop', loss=None)

Proposed option:

# tf yield ops that supply dataset images and labels
x_train_batch, y_train_batch = read_and_decode_recordinput(...)

# create a basic cnn
x_train_input = Input(tensor=x_train_batch)
x_train_out = cnn_layers(x_train_input)

# batch tensors are added to an Input layer
# Perhaps this aspect of API usage can be improved?
y_train_input = Input(tensor=y_train_batch)

# This is where the label op is supplied so a placeholder isn't created
x_out = Target(y_train_input)(x_train_out)

model = Model(inputs=x_train_input, outputs=x_out)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

I'll let others comment on pros/cons...
Unless you convince me otherwise, my take on it is that we should:
The above works for me! However, I have one last idea... What about re-compiling on demand when tensors are passed at fit time? Here is a working mnist_tfrecord.py implementation:

# tf yield ops that supply dataset images and labels
x_train_batch, y_train_batch = read_and_decode_recordinput(...)

x_train_in = Input(tensor=x_train_batch)
x_train_out = cnn_layers(x_train_in)
y_train_in = Input(tensor=y_train_batch, name='y_labels')

train_model = Model(inputs=[x_train_in], outputs=[x_train_out])
train_model.compile(optimizer='rmsprop',
                    loss='categorical_crossentropy',
                    metrics=['accuracy'])
train_model.fit(None, y_train_in,
                batch_size=batch_size,
                epochs=epochs)

Working Model() implementation:

class Model(Container):

    # In this version, `compile()` always saves its arguments in case
    # they're needed later. I think the overhead should be acceptable:
    def compile(self, *args, **kwargs):
        # ...snip docs...
        self._saved_kwargs = kwargs
        self._saved_args = args
        # ...snip rest of compile()...

    # If any fit() parameters are tensors, we dynamically recompile:
    def fit(self, x=None, y=None, ...):
        # ...snip docs and legacy support check...
        # Proof of concept, will need some tweaks.
        # expect_other_types=True means is_keras_tensor
        # will return False on a numpy array, not throw an exception.
        if K.is_keras_tensor(y, expect_other_types=True):
            self.target_configuration = [y]
            y = None
            self._compile(*self._saved_args, **self._saved_kwargs)
        elif y is not None:
            recompile = False
            self.target_configuration = []
            for i, yi in enumerate(y):
                if K.is_keras_tensor(yi, expect_other_types=True):
                    self.target_configuration.append(yi)
                    y[i] = None
                    recompile = True
                else:
                    self.target_configuration.append(None)
            if recompile:
                self._compile(*self._saved_args, **self._saved_kwargs)
        # ...snip rest of fit()...

What do you think? I like this proposal much better than my previous ones, so I've merged it into my largest PR, #6928, replacing the previous behavior.
I like the recompiling on-demand idea. I don't think tensorflow will have transparent graph surgery any time soon. Some questions:
@TimZaman The current implementation is the intermediate step where y can be specified at fit time; specifying both x and y there is not yet possible, since it requires transparent graph editing.

Item 2 is actually the answer to item 1: the x input is defined by the input tensor, so there is no reason to define it again. Additionally, in this case there are no placeholders, because a placeholder requires feeding data through Python, with the implied performance hit.

Item 3 is due to TensorFlow's design. Since the input tensor is how input is supplied, you must save the current graph's weights, create a new graph with different input tensors (or with placeholders if you go the pure-Python route), and load the saved weights before making predictions. Eventually, when graph editing is implemented with a clear and usable API, placeholders could be swapped for input tensors after the graph is created. For now, however, creating an Input layer without a tensor parameter will create a placeholder, so we must supply the input tensor when the Input layer is created.

Was my explanation clear?
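[Editor's note: as a minimal sketch of the train-then-rebuild workflow described above, reusing names from the earlier snippets; the weight file name, the (28, 28, 1) input shape, and x_test_numpy are placeholders for illustration, not part of the proposal.]

from keras.layers import Input
from keras.models import Model

# Train against the in-graph input tensor (as in the snippets above).
x_train_in = Input(tensor=x_train_batch)
train_model = Model(inputs=x_train_in, outputs=cnn_layers(x_train_in))
# ... compile and fit as shown earlier ...

# Save the learned weights from the tensor-fed graph.
train_model.save_weights('saved_wt.h5')

# Rebuild the same architecture with a regular placeholder-backed Input,
# then load the saved weights so prediction can be fed numpy arrays.
x_pred_in = Input(shape=(28, 28, 1))
pred_model = Model(inputs=x_pred_in, outputs=cnn_layers(x_pred_in))
pred_model.load_weights('saved_wt.h5')
predictions = pred_model.predict(x_test_numpy)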
Update: #7113 has a proper input tensor example with external loss. |
Closing in favor of #7503, which has the finalized action plan. |
Keras Input Tensor API Design Proposal
Executive Summary
Feeding data into a Model directly from input tensors will run faster, support additional dataset formats, improve usability, and reduce backend lock-in.
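[Editor's note: a rough sketch of the contrast behind this claim, assuming the TensorFlow backend; make_input_tensors() is a hypothetical stand-in for a real in-graph input pipeline such as TFRecord readers, and is not part of the proposal.]

import keras
from keras.layers import Input, Dense
from keras.models import Model

# Placeholder-fed model: every batch is copied from numpy arrays in Python
# into the graph through feed_dict.
x_in = Input(shape=(784,))
out = Dense(10, activation='softmax')(x_in)
feed_model = Model(inputs=x_in, outputs=out)
feed_model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# feed_model.fit(x_train, y_train, ...)  # numpy arrays fed from Python

# Tensor-fed model: the input pipeline lives inside the graph, so batches
# never round-trip through Python.
x_batch, y_batch = make_input_tensors()  # hypothetical pipeline helper
x_in = Input(tensor=x_batch)
out = Dense(10, activation='softmax')(x_in)
tensor_model = Model(inputs=x_in, outputs=out)
tensor_model.add_loss(keras.losses.categorical_crossentropy(y_batch, out))
tensor_model.compile(optimizer='rmsprop', loss=None)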
I’d appreciate it if you would make suggestions and give feedback! This is currently a draft, and comments can be made directly on the Google Doc containing the full proposal.
As this is the very first Keras API design review, please be kind. :-)
Thanks for considering my proposal, and thanks to those who will become reviewers or contributors!
HELP WANTED
I need help from CNTK & Theano experts on how those backends might be affected by this proposal. Feel free to add comments directly; they can be discussed and incorporated.
REVIEW THE REVIEW OR CREATE A PROPOSAL
Please also either give feedback on the process itself or make your own proposal via the Keras API Design Review Template.
p.s. plaintext link:
https://docs.google.com/document/d/1tf2Nl7wor8rmWPUoxfClLuPLQGqvZryegD7K7-1tTe8/edit?usp=sharing