ValueError: Layer 'shifted_patch_tokenization_10' looks like it has unbuilt state, but Keras is not able to trace the layer call() in order to build it automatically #18763

Closed
pksX01 opened this issue Nov 11, 2023 · 4 comments
pksX01 commented Nov 11, 2023

While converting the keras.io example "Train a Vision Transformer on small datasets" to Keras 3 with the TensorFlow backend, I ran into the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-36-e0427a96856a> in <cell line: 83>()
     81 
     82 # Run experiments with the vanilla ViT
---> 83 vit = create_vit_classifier(vanilla=True)
     84 history = run_experiment(vit)
     85 

2 frames
<ipython-input-35-a479a304ecc6> in create_vit_classifier(vanilla)
      4     augmented = data_augmentation(inputs)
      5     # Create patches.
----> 6     (tokens, _) = ShiftedPatchTokenization(vanilla=vanilla)(augmented)
      7     # Encode patches.
      8     encoded_patches = PatchEncoder()(tokens)

/usr/local/lib/python3.10/dist-packages/keras_core/src/utils/traceback_utils.py in error_handler(*args, **kwargs)
    121             # To get the full stack trace, call:
    122             # `keras_core.config.disable_traceback_filtering()`
--> 123             raise e.with_traceback(filtered_tb) from None
    124         finally:
    125             del filtered_tb

/usr/local/lib/python3.10/dist-packages/keras_core/src/layers/layer.py in _maybe_build(self, call_spec)
   1175                     # Will let the actual eager call do state-building
   1176                     return
-> 1177                 raise ValueError(
   1178                     f"Layer '{self.name}' looks like it has unbuilt state, but "
   1179                     "Keras is not able to trace the layer `call()` in order to "

ValueError: Layer 'shifted_patch_tokenization_10' looks like it has unbuilt state, but Keras is not able to trace the layer `call()` in order to build it automatically. Possible causes:
1. The `call()` method of your layer may be crashing. Try to `__call__()` the layer eagerly on some test input first to see if it works. E.g. `x = np.random.random((3, 4)); y = layer(x)`
2. If the `call()` method is correct, then you may need to implement the `def build(self, input_shape)` method on your layer. It should create all variables used by the layer (e.g. by calling `layer.build()` on all its children layers).

I have tried with both keras-core and keras-nightly.

The error can be reproduced here:

  1. using keras-core
  2. using keras-nightly

Logging here as per #18468 and #18467.

Note: I also tried to debug by calling that layer on a test input (screenshot attached).

fchollet (Member) commented

This layer creates two child layers:

self.projection = layers.Dense(units=projection_dim)
self.layer_norm = layers.LayerNormalization(epsilon=LAYER_NORM_EPS)

Normally, the build() method of a layer should build its child layers. If it doesn't, the framework attempts to build them itself, but that may not succeed (as in this case).

If you're able to call the layer eagerly, then maybe you just need to add a build() method that builds the two child layers.
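A minimal sketch of what such a build() method could look like. TokenProjection is a hypothetical stand-in for the example's layer, not the actual ShiftedPatchTokenization code, and the shapes (144 patches, 108 values per flattened patch, projection to 64) are illustrative assumptions:

```python
import numpy as np
import keras
from keras import layers


class TokenProjection(layers.Layer):
    """Hypothetical stand-in for a layer that owns two child layers."""

    def __init__(self, projection_dim=64, **kwargs):
        super().__init__(**kwargs)
        self.layer_norm = layers.LayerNormalization(epsilon=1e-6)
        self.projection = layers.Dense(units=projection_dim)

    def build(self, input_shape):
        # input_shape is (batch, num_patches, flat_patch_dim).
        # Only the last dimension has to be known to create the weights.
        self.layer_norm.build(input_shape)
        # LayerNormalization keeps the shape, so Dense sees the same one.
        self.projection.build(input_shape)

    def call(self, inputs):
        return self.projection(self.layer_norm(inputs))


layer = TokenProjection()

# Symbolic call, as done when assembling a functional model.
inputs = keras.Input(shape=(144, 108))
outputs = layer(inputs)

# Eager call on concrete data with the same layer.
eager_out = layer(np.zeros((2, 144, 108), dtype="float32"))
```

Because the children are built explicitly from a known shape, the symbolic call does not depend on Keras tracing call() to discover the weights.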


pksX01 commented Nov 19, 2023

I do not get this error with the code below:

image = x_train[np.random.choice(range(x_train.shape[0]))]
resized_image = tf.image.resize(
    tf.convert_to_tensor([image]), size=(IMAGE_SIZE, IMAGE_SIZE)
)

# Vanilla patch maker: This takes an image and divides into
# patches as in the original ViT paper
(token, patch) = ShiftedPatchTokenization(vanilla=True)(resized_image / 255.0)
(token, patch) = (token[0], patch[0])

OR
(token, patch) = ShiftedPatchTokenization(vanilla=False)(resized_image / 255.0)

I only get the error when the following code is called:

inputs = layers.Input(shape=INPUT_SHAPE)
# Augment data.
augmented = data_augmentation(inputs)
# Create patches.
(tokens, _) = ShiftedPatchTokenization(vanilla=vanilla)(augmented)

Data augmentation is set up as follows:

data_augmentation = keras.Sequential(
    [
        layers.Normalization(),
        layers.Resizing(IMAGE_SIZE, IMAGE_SIZE),
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(factor=0.02),
        layers.RandomZoom(height_factor=0.2, width_factor=0.2),
    ],
    name="data_augmentation",
)
# Compute the mean and the variance of the training data for normalization.
data_augmentation.layers[0].adapt(x_train)

I suspect something is happening during data augmentation. Maybe the shapes are being changed, because I now get a different error:

/usr/local/lib/python3.10/dist-packages/keras/src/layers/layer.py:1216: UserWarning: Layer 'shifted_patch_tokenization_2' looks like it has unbuilt state, but Keras is not able to trace the layer `call()` in order to build it automatically. Possible causes:
1. The `call()` method of your layer may be crashing. Try to `__call__()` the layer eagerly on some test input first to see if it works. E.g. `x = np.random.random((3, 4)); y = layer(x)`
2. If the `call()` method is correct, then you may need to implement the `def build(self, input_shape)` method on your layer. It should create all variables used by the layer (e.g. by calling `layer.build()` on all its children layers).
Exception encoutered: ''Shapes used to initialize variables must be fully-defined (no `None` dimensions). Received: shape=(None, 64) for variable path='shifted_patch_tokenization_2/dense_2/kernel'''
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/keras/src/layers/layer.py:357: UserWarning: `build()` was called on layer 'shifted_patch_tokenization_2', however the layer does not have a `build()` method implemented and it looks like it has unbuilt state. This will cause the layer to be marked as built, despite not being actually built, which may cause failures down the line. Make sure to implement a proper `build()` method.
  warnings.warn(
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-17-e0427a96856a> in <cell line: 83>()
     81 
     82 # Run experiments with the vanilla ViT
---> 83 vit = create_vit_classifier(vanilla=True)
     84 history = run_experiment(vit)
     85 

2 frames
<ipython-input-11-51a443bf6d57> in call(self, images)
     91         else:
     92             # Linearly project the flat patches
---> 93             tokens = self.projection(flat_patches)
     94         return (tokens, patches)

RuntimeError: Exception encountered when calling ShiftedPatchTokenization.call().

Could not automatically infer the output shape / dtype of 'shifted_patch_tokenization_2' (of type ShiftedPatchTokenization). Either the `ShiftedPatchTokenization.call()` method is incorrect, or you need to implement the `ShiftedPatchTokenization.compute_output_spec() / compute_output_shape()` method. Error encountered:

Shapes used to initialize variables must be fully-defined (no `None` dimensions). Received: shape=(None, 64) for variable path='shifted_patch_tokenization_2/dense_2/kernel'

Arguments received by ShiftedPatchTokenization.call():
  • args=('<KerasTensor shape=(None, 72, 72, 3), dtype=float32, sparse=False, name=keras_tensor_11>',)
  • kwargs=<class 'inspect._empty'>
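One way to test the suspicion about data augmentation is to pass a symbolic input through the pipeline and inspect the resulting shape. The sketch below is an assumption-laden reduction of the pipeline above: it assumes 32x32x3 inputs and IMAGE_SIZE = 72 (consistent with the KerasTensor shape reported in the traceback), and it leaves out the Normalization layer so that no adapt() call on training data is needed:

```python
import keras
from keras import layers

# Reduced, hypothetical version of the augmentation pipeline.
augmentation = keras.Sequential(
    [
        layers.Resizing(72, 72),
        layers.RandomFlip("horizontal"),
    ],
    name="augmentation_check",
)

# Symbolic input: the batch dimension is None, all other dimensions are static.
symbolic_in = keras.Input(shape=(32, 32, 3))
symbolic_out = augmentation(symbolic_in)
augmented_shape = tuple(symbolic_out.shape)
```

If only the batch dimension of augmented_shape is None, the unknown dimension reported in the error is introduced later, inside the tokenization layer itself.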

fchollet (Member) commented

Shapes used to initialize variables must be fully-defined (no None dimensions). Received: shape=(None, 64) for variable path='shifted_patch_tokenization_2/dense_2/kernel'

So presumably a layer is creating variables based on input_shape and some of these dimensions have a None. When creating a variable the shape must be fully known.
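The following sketch illustrates the point with a bare Dense layer: building it succeeds when the last input dimension is known and fails when it is None. The shapes are illustrative assumptions, and catching ValueError reflects the behaviour I would expect from current Keras releases rather than a documented guarantee:

```python
from keras import layers

# Last dimension known: the kernel is created with shape (108, 64).
ok = layers.Dense(64)
ok.build((None, 144, 108))
kernel_shape = tuple(ok.kernel.shape)

# Last dimension unknown: the kernel would need shape (None, 64),
# which cannot be allocated.
bad = layers.Dense(64)
try:
    bad.build((None, 144, None))
    failed = False
except ValueError:
    failed = True
```

A None batch dimension is harmless here because the kernel shape depends only on the last input dimension.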

sachinprasadhs added the "stat:awaiting response from contributor" label and removed the "stat:awaiting keras-eng" (Awaiting response from Keras engineer) label on Nov 20, 2023

pksX01 commented Nov 30, 2023

The error disappeared after switching to the newly released Keras 3.0.0 and TensorFlow 2.15.0, so I think we can close this issue.
