Use Value dim shape for Attention compute_output_shape #19284

sampathweb · 2024-03-11T17:59:37Z

Fixes #19257 by using the value tensor's feature dimension in `compute_output_shape` of the Attention layer.
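The shape rule behind this fix can be sketched in plain Python. This is only an illustration, not the Keras implementation: `attention_output_shape` is a hypothetical helper, and it assumes the inputs are ordered `[query, value, key]` (key optional), as in the test discussed in this PR.

```python
def attention_output_shape(input_shape):
    """Hypothetical helper: output shape of dot-product attention.

    Assumes input_shape is [query_shape, value_shape] or
    [query_shape, value_shape, key_shape].
    """
    query_shape, value_shape = input_shape[0], input_shape[1]
    # Batch and query-length dims come from the query;
    # the feature dim comes from the value.
    return (*query_shape[:-1], value_shape[-1])


shapes = [(2, 8, 7), (2, 8, 5), (2, 8, 7)]  # Q, V, K
print(attention_output_shape(shapes))  # (2, 8, 5)
```

Returning the query shape unchanged would give `(2, 8, 7)` here, which only matches the real output when query and value happen to share a feature dimension.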

fchollet

Thanks for the PR! Unit tests are failing, I think some golden values need to be updated.

fchollet · 2024-03-11T18:16:44Z

keras/layers/attention/attention_test.py

+    def test_attention_compute_output_shape(self):
+        layer = layers.Attention()
+        input_shape = [(2, 8, 7), (2, 8, 5), (2, 8, 7)]  # Shapes of Q, V, K
+        self.assertAllEqual(layer.compute_output_shape(input_shape), (2, 8, 5))


Please call the layer on an input and read its shape, to ensure that the actual shape and the computed shape match.

Updated the test to check `compute_output_shape` against the actual `output.shape`.
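The idea of checking a computed shape against a real output can be sketched without any framework. The code below is a stand-in under stated assumptions, not the test from this PR: `dot_product_attention`, `shape_of` and `random_tensor` are hypothetical helpers operating on nested lists, and the attention is unscaled and unmasked. The actual Keras test would call the layer and compare `output.shape` with `layer.compute_output_shape(input_shape)`.

```python
import math
import random


def dot_product_attention(query, value, key):
    """Unscaled dot-product attention on nested lists (batch, length, dim)."""
    outputs = []
    for q_seq, v_seq, k_seq in zip(query, value, key):
        value_dim = len(v_seq[0])
        batch_out = []
        for q in q_seq:
            # Scores of this query position against every key position.
            scores = [sum(a * b for a, b in zip(q, k)) for k in k_seq]
            # Numerically stable softmax over key positions.
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            weights = [e / total for e in exps]
            # Weighted sum of value rows: one entry per value feature.
            row = [
                sum(w * v[d] for w, v in zip(weights, v_seq))
                for d in range(value_dim)
            ]
            batch_out.append(row)
        outputs.append(batch_out)
    return outputs


def shape_of(nested):
    """Shape of a rectangular nested list, read along the first elements."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def random_tensor(shape, rng):
    """Nested list of the given shape filled with random floats."""
    if len(shape) == 1:
        return [rng.random() for _ in range(shape[0])]
    return [random_tensor(shape[1:], rng) for _ in range(shape[0])]


rng = random.Random(0)
q_shape, v_shape, k_shape = (2, 8, 7), (2, 8, 5), (2, 8, 7)
query = random_tensor(q_shape, rng)
value = random_tensor(v_shape, rng)
key = random_tensor(k_shape, rng)

output = dot_product_attention(query, value, key)
# Computed shape: query dims except the last, then the value feature dim.
computed_shape = (*q_shape[:-1], v_shape[-1])
assert shape_of(output) == computed_shape
```

Comparing against a real output, rather than a hard-coded tuple, keeps the test honest if the layer's behaviour ever changes.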

sampathweb · 2024-03-18T22:03:55Z

keras/dtype_policies/dtype_policy.py

@@ -173,9 +173,6 @@ def _parse_name(self, name):
            return "float16", "float32"
        elif name == "mixed_bfloat16":
            return "bfloat16", "float32"
-        elif name == "uint8":


This branch is redundant. It's already handled in the try block, so I'm removing it.
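Why an explicit branch can be redundant is easier to see in a small sketch. Everything below is hypothetical and is not the Keras source: `KNOWN_DTYPES`, `standardize_dtype` and `parse_policy_name` are invented for illustration, and the first branch name is assumed from the diff context. What the real fallback returns for `uint8` is not shown in this excerpt; the sketch simply returns the same dtype for compute and variables.

```python
# Hypothetical set of plain dtype names accepted by the fallback.
KNOWN_DTYPES = {"float16", "bfloat16", "float32", "float64", "uint8", "int8"}


def standardize_dtype(name):
    """Hypothetical validator: returns the name if it is a known dtype."""
    if name not in KNOWN_DTYPES:
        raise ValueError(f"Unknown dtype: {name!r}")
    return name


def parse_policy_name(name):
    """Hypothetical parser returning (compute_dtype, variable_dtype)."""
    if name == "mixed_float16":
        return "float16", "float32"
    elif name == "mixed_bfloat16":
        return "bfloat16", "float32"
    try:
        # Any plain dtype name, including "uint8", is handled here,
        # so a dedicated `elif name == "uint8"` branch adds nothing.
        dtype = standardize_dtype(name)
        return dtype, dtype
    except ValueError:
        raise ValueError(f"Cannot convert {name!r} to a dtype policy.")


print(parse_policy_name("uint8"))
print(parse_policy_name("mixed_bfloat16"))
```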

fchollet

LGTM, thanks!

…nse`
Add qlora-like technique to `quantized_call` in `Dense`
Update `save_own_variables` and `load_own_variables`
Update `benchmark.py`
update version string.
Set dtype policy for uint8 (keras-team#19327)
* Set Quantization policy for uint8 to float
* Add uint8 to dtype_policies
Use Value dim shape for Attention compute_output_shape (keras-team#19284)
* Use Value dim shape for Attention compute_output_shape
* Fix attention layer compute output shape
* fix format
* check compute_output_shape with output
Update `quantized_call` in `EinsumDense` to support training with quantized weights

google-ml-butler bot added the size:XS label Mar 11, 2024

google-ml-butler bot assigned gbaned Mar 11, 2024

fchollet reviewed Mar 11, 2024

sampathweb added 2 commits March 18, 2024 14:06

Use Value dim shape for Attention compute_output_shape

49e6f8d

Fix attention layer compute output shape

ececf6e

sampathweb force-pushed the use-value-dim-shape-attention branch from 9b26297 to ececf6e Compare March 18, 2024 21:50

sampathweb added 2 commits March 18, 2024 14:51

fix format

7526ba2

check compute_output_shape with output

54f5956

sampathweb commented Mar 18, 2024

sampathweb added the kokoro:force-run label Mar 18, 2024

kokoro-team removed the kokoro:force-run label Mar 18, 2024

fchollet approved these changes Mar 18, 2024

google-ml-butler bot added the kokoro:force-run and ready to pull labels Mar 18, 2024

fchollet merged commit b2ef949 into keras-team:master Mar 18, 2024
7 of 9 checks passed

google-ml-butler bot removed the ready to pull and kokoro:force-run labels Mar 18, 2024

sampathweb deleted the use-value-dim-shape-attention branch March 18, 2024 22:28

kwchan7 mentioned this pull request Mar 19, 2024

Keras 3 Attention layer value tensor dimension #19257

Closed
