add scoring methods in Luong-style attention #15867
Conversation
Add scoring methods, particularly `concat`, for Luong-style attention.
Added the tests `test_shape_with_key_concat`, `test_shape_concat`, `test_calculate_scores_one_dim_with_scale_concat`, and `test_calculate_scores_multi_dim_concat`.
Thanks for the PR!
keras/layers/dense_attention.py
    @@ -242,6 +242,8 @@ class Attention(BaseDenseAttention):
          Defaults to `False`.
        dropout: Float between 0 and 1. Fraction of the units to drop for the
          attention scores. Defaults to 0.0.
        score: One of {'dot', 'concat'}. 'dot' refers to dot multiplication
          of query and key. 'concat' refers to hyperbolic tangent of query and key.
It's `tanh` of the sum. Why is this called `concat` in this case?
> It's `tanh` of the sum.

It's not exactly tanh of the sum. Key and query are concatenated, much like the score in the AdditiveAttention layer (Bahdanau-style attention).

> Why is this called `concat` in this case?

In the paper they call this score concat.
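For reference, the concat score as defined in Luong et al. (2015), "Effective Approaches to Attention-based Neural Machine Translation":

$$\mathrm{score}(h_t, \bar{h}_s) = v_a^\top \tanh\left(W_a\,[h_t; \bar{h}_s]\right)$$

where $[h_t; \bar{h}_s]$ is the concatenation of the target (query) and source (key) hidden states.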
Sounds good then, but please improve the argument description. Currently the description seems rather confusing.
Is this okay?
"score_mode: One of {'dot', 'concat'}. 'dot' refers to dot product of query and key. 'concat' refers to hyperbolic tangent of query and key concatenated."
Do we need to describe the parameter `score_type` too?
Use:
score_mode: Function to use to compute attention scores, one of `{"dot", "concat"}`.
`"dot"` refers to the dot product between the query and key vectors.
`"concat"` refers to the hyperbolic tangent of the concatenation of the query and key vectors.
keras/layers/dense_attention.py
    super(Attention, self).__init__(**kwargs)
    self.use_scale = use_scale
    self.score_type = score
    if self.score_type not in ['dot', 'concat']:
      logging.warning(f'Score type {self.score_type} is unknown, '
This should be a `ValueError`.
Made the requested changes.
Thanks for the update!
    if self.score_mode not in ['dot', 'concat']:
      raise ValueError(
          "Unknown score_mode. Acceptable values "
          "are: ['dot', 'concat']"
Add: `f"Received: score_mode={score_mode}"`. Also make sure you use `'` and `"` consistently.
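Putting the review comments together, a minimal sketch of the requested constructor validation (subclassing `tf.keras.layers.Layer` here to keep the example self-contained; the actual PR subclasses the internal `BaseDenseAttention`):

```python
import tensorflow as tf

class Attention(tf.keras.layers.Layer):
    """Sketch of the constructor with the requested score_mode check."""

    def __init__(self, use_scale=False, score_mode="dot", **kwargs):
        super(Attention, self).__init__(**kwargs)
        self.use_scale = use_scale
        self.score_mode = score_mode
        if self.score_mode not in ["dot", "concat"]:
            # A ValueError rather than a warning, with the received value
            # echoed back, and double quotes used consistently.
            raise ValueError(
                "Unknown score_mode. Acceptable values are: "
                f"['dot', 'concat']. Received: score_mode={score_mode}"
            )
```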
@fchollet can you please take a look?
Thanks for the contribution!
keras/layers/dense_attention_test.py
    dtype=np.float32)
    attention_layer = dense_attention.Attention(score_mode='concat')
    attention_layer.build(input_shape=([1, 2, 4], [1, 3, 4]))
    actual = attention_layer._calculate_scores(query=q, key=k)
Test is failing: https://source.cloud.google.com/results/invocations/acd8077d-49cf-4ad9-a164-2c37718c3a47/targets/keras%2Fgithub%2Fubuntu%2Fcpu%2Fpresubmit/log
You may need to do `actual = keras.backend.get_value(actual)`.
Is it because I am not initializing the parameter `attention_v`, only creating it when `score_mode='concat'`? Because `score_mode='dot'` is passing, and the `scale` parameter is also initialized to `None` at the start.
So what do you get when you replace it with `actual = keras.backend.get_value(actual)`?
To maintain uniformity with the other tests I haven't used this yet. The error message is:

    Node: 'ReadVariableOp'
    Could not find variable attention_v. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status error message=Container localhost does not exist. (Could not find resource: localhost/attention_v)
    [[{{node ReadVariableOp}}]]

So I initialized `attention_v` (now `concat_score_weight`) to `None`, just like `scale`.
The error is unrelated to this. It will likely go away if you use `get_value`.
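A hedged sketch of the failing test pattern with the suggested fix applied (the array values and shapes are invented for illustration; `dense_attention` is the module path used elsewhere in this thread):

```python
import numpy as np
from tensorflow import keras
from keras.layers import dense_attention

q = np.array([[[1.1], [2.1]]], dtype=np.float32)         # [1, 2, 1] query
k = np.array([[[1.6], [0.7], [0.3]]], dtype=np.float32)  # [1, 3, 1] key

attention_layer = dense_attention.Attention(score_mode='concat')
attention_layer.build(input_shape=([1, 2, 1], [1, 3, 1]))
scores = attention_layer._calculate_scores(query=q, key=k)
# Materialize the tensor before comparing it with numpy expectations,
# as suggested above.
actual = keras.backend.get_value(scores)
```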
Made the requested changes.
keras/layers/dense_attention_test.py
    attention_layer = dense_attention.Attention(score_mode='concat')
    attention_layer.build(input_shape=([1, 2, 4], [1, 3, 4]))
    actual = attention_layer._calculate_scores(query=q, key=k)
    attention_layer.attention_v = 1
I notice that it isn't clear what `attention_v` means. Can you rename it to a fully spelled out variable name?
Ah, so in `concat`, the score is the product of a learnable parameter (here `attention_v`) and the hyperbolic tangent of the concatenation of the query and key vectors. The v in the equation above is what I am calling `attention_v`; as usual, Wa is the scaling parameter, and ht and hs are the query and key respectively. A PyTorch implementation of the same can be found here, in code block 11.
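To make that concrete, here is a minimal sketch of a concat scoring function. It uses the same broadcast-sum trick as Keras's `AdditiveAttention`, which is equivalent to applying a weight matrix to the concatenation; the standalone function form and its name are mine, not the PR's exact code:

```python
import tensorflow as tf

def concat_scores(query, key, concat_score_weight):
    """Luong 'concat' scores over all query/key pairs.

    query: float tensor of shape [batch, Tq, dim].
    key:   float tensor of shape [batch, Tv, dim].
    concat_score_weight: scalar learnable parameter (the v above).
    Returns scores of shape [batch, Tq, Tv].
    """
    q = tf.expand_dims(query, axis=-2)  # [batch, Tq, 1, dim]
    k = tf.expand_dims(key, axis=-3)    # [batch, 1, Tv, dim]
    # The broadcast sum plays the role of Wa applied to the
    # concatenation [ht; hs]; reduce_sum collapses the feature axis.
    return concat_score_weight * tf.reduce_sum(tf.tanh(q + k), axis=-1)
```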
Rename it to something that isn't an abbreviation, like `attention_weights`. See the Keras API style guide comments on naming: https://github.com/keras-team/governance/blob/master/keras_api_design_guidelines.md#carefully-weigh-whether-a-new-feature-should-be-included
Renamed `attention_v` to `concat_score_weight`. Does that work?
Thanks for the update -- let's try merging again.
Luong-style attention uses three types of scoring methods, namely dot, general, and concat. These can be found on the 3rd page of the original paper and are explained here. An implementation of the layer can be found in this notebook.
Auto closes #15866
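For context, a hedged usage sketch of the feature once merged, assuming `score_mode` is exposed on `tf.keras.layers.Attention` as discussed above:

```python
import tensorflow as tf

query = tf.random.normal((2, 5, 8))  # [batch, Tq, dim]
value = tf.random.normal((2, 6, 8))  # [batch, Tv, dim]

# Luong-style attention with the new concat scoring method.
attention = tf.keras.layers.Attention(score_mode="concat")
output = attention([query, value])
print(output.shape)  # (2, 5, 8)
```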