Assume a padding mask of [F, F, F, T, T], where T marks a padded position. In the encoder, this mask is expanded as follows:
slf_attn_mask = mask.unsqueeze(1).expand(-1, max_len, -1)
This results in the following mask:
[F, F, F, T, T]
[F, F, F, T, T]
[F, F, F, T, T]
[F, F, F, T, T]
[F, F, F, T, T]
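For reference, here is a minimal sketch (my own reproduction, assuming a batch of one and max_len = 5) showing how that expansion produces the matrix above:

```python
import torch

# Reproduction of the expansion above, assuming a batch of one sequence of
# length 5 whose last two positions are padding (True = masked position).
mask = torch.tensor([[False, False, False, True, True]])   # shape (1, 5)
max_len = mask.size(1)

# The line from the encoder: repeat the key mask once per query position.
slf_attn_mask = mask.unsqueeze(1).expand(-1, max_len, -1)  # shape (1, 5, 5)

print(slf_attn_mask[0])
# Every row is [F, F, F, T, T], i.e. the padded queries in rows 3 and 4
# still attend to the non-padded keys.
```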
The expanded mask is then passed into the scaled dot-product attention module. However, I think this may not be correct: the fourth and fifth positions are padding, so they should not be computing attention over the other positions at all.
I think the correct version should be:
[F, F, F, T, T]
[F, F, F, T, T]
[F, F, F, T, T]
[T, T, T, T, T]
[T, T, T, T, T]
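One way such a two-sided mask could be built (a sketch of my suggestion, not a patch against the repository) is to OR the key mask with a query mask, so that the rows belonging to padded positions are fully masked as well:

```python
import torch

# Sketch of the two-sided mask above (True = masked position). OR-ing the
# key mask with a query mask blocks every cell where either the query or
# the key is a padded position.
mask = torch.tensor([[False, False, False, True, True]])    # shape (1, 5)

key_mask   = mask.unsqueeze(1)         # (1, 1, 5): blocks attention *to* padding
query_mask = mask.unsqueeze(2)         # (1, 5, 1): blocks attention *from* padding
slf_attn_mask = key_mask | query_mask  # broadcasts to (1, 5, 5)

print(slf_attn_mask[0])
# Rows 0-2 are [F, F, F, T, T]; rows 3 and 4 are all True, matching the
# matrix above.
```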
Could someone clarify whether this is a real issue or a misunderstanding on my part?