
Division by zero caused by mask operation #243

Open
Chenyang-1024 opened this issue Jul 28, 2024 · 1 comment

Chenyang-1024 commented Jul 28, 2024

If no pixel in the input image belongs to the q-th class, then when the mask for masked attention is generated, the entire row attn_mask[b, q, :] is True, and nn.MultiheadAttention converts every True entry to float('-inf'). When the masked logits are then passed through softmax(..., dim=-1) to compute the attention map, the all-'-inf' row makes the softmax normalizer zero, and the division by zero produces NaN. :(
I ran into this problem when applying masked attention to my semantic segmentation task. :(
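A minimal sketch of the failure mode described above (not the project's code; the tensor sizes here are made up for illustration). A 2D boolean attn_mask with one fully-True row stands in for a query whose class covers no pixel, and the output for that query comes back NaN on PyTorch builds where this issue reproduces:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for the sketch; a real decoder uses far larger ones.
embed_dim, num_heads, num_queries, num_pixels = 8, 2, 3, 5

mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
query = torch.randn(1, num_queries, embed_dim)   # decoder queries
memory = torch.randn(1, num_pixels, embed_dim)   # flattened image features

# Boolean mask, True = "do not attend". Row 0 masks out every pixel,
# i.e. no pixel was predicted to belong to query 0's class.
attn_mask = torch.zeros(num_queries, num_pixels, dtype=torch.bool)
attn_mask[0, :] = True

out, _ = mha(query, memory, memory, attn_mask=attn_mask)
print(torch.isnan(out).any())  # tensor(True): the all-masked row went NaN
```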

q1556450920 commented

Doesn't the code include this line?

attn_mask[torch.where(attn_mask.sum(-1) == attn_mask.shape[-1])] = False
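Continuing the sketch above, this guard finds every fully-masked row (row sum equal to the row length) and clears it, so each query attends to at least one position and the softmax row stays finite:

```python
# Wherever a query masks every pixel, unmask the whole row so the
# softmax sees at least one finite logit instead of all -inf.
attn_mask[torch.where(attn_mask.sum(-1) == attn_mask.shape[-1])] = False

out, _ = mha(query, memory, memory, attn_mask=attn_mask)
print(torch.isnan(out).any())  # tensor(False): no NaN after the guard
```

A fully-unmasked row makes that query attend uniformly over all positions rather than to its (empty) predicted region, which is the trade-off this workaround accepts in exchange for finite outputs.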
