
Division by zero caused by mask operation #243

Open
Chenyang-1024 opened this issue Jul 28, 2024 · 1 comment

Chenyang-1024 commented Jul 28, 2024

If no pixel in the input image belongs to the q-th class, then when the mask for masked attention is generated, the entire row attn_mask[b, q, :] is True, and nn.MultiheadAttention converts every True entry to float('-inf'). When the masked logits are then passed through softmax(..., dim=-1) to compute the attention map, the all-'-inf' row makes the softmax normalizer zero, and the division by zero produces NaN. :(
I ran into this problem when applying masked attention to my semantic segmentation task. :(
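A minimal sketch of the failure mode described above (not the project's code; the tensor sizes here are made up for illustration). A 2D boolean attn_mask with one fully-True row stands in for a query whose class covers no pixel, and the output for that query comes back NaN on PyTorch builds where this issue reproduces:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for the sketch; a real decoder uses far larger ones.
embed_dim, num_heads, num_queries, num_pixels = 8, 2, 3, 5

mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
query = torch.randn(1, num_queries, embed_dim)   # decoder queries
memory = torch.randn(1, num_pixels, embed_dim)   # flattened image features

# Boolean mask, True = "do not attend". Row 0 masks out every pixel,
# i.e. no pixel was predicted to belong to query 0's class.
attn_mask = torch.zeros(num_queries, num_pixels, dtype=torch.bool)
attn_mask[0, :] = True

out, _ = mha(query, memory, memory, attn_mask=attn_mask)
print(torch.isnan(out).any())  # tensor(True): the all-masked row went NaN
```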

q1556450920 commented

Doesn't the code include this line?

attn_mask[torch.where(attn_mask.sum(-1) == attn_mask.shape[-1])] = False
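Continuing the sketch above, this guard finds every fully-masked row (row sum equal to the row length) and clears it, so each query attends to at least one position and the softmax row stays finite:

```python
# Wherever a query masks every pixel, unmask the whole row so the
# softmax sees at least one finite logit instead of all -inf.
attn_mask[torch.where(attn_mask.sum(-1) == attn_mask.shape[-1])] = False

out, _ = mha(query, memory, memory, attn_mask=attn_mask)
print(torch.isnan(out).any())  # tensor(False): no NaN after the guard
```

A fully-unmasked row makes that query attend uniformly over all positions rather than to its (empty) predicted region, which is the trade-off this workaround accepts in exchange for finite outputs.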
