1. About h, w
The sentence "(Hh) × (Ww) matches the size of the output segmentation map and (h, w) is the relative stride of the initial token grid" in the paper indicates that h, w is the downsampling stride of the segmentation map. When reading the code, however, I am confused about how this works: by rearranging token_logits and applying a matrix multiplication, we get the final segmentation map, which is as large as the input image. So why do you set the extra parameters h and w, and how do h, w relate to the stride?
2. About F.unfold()
Official implementation code:
token_logits = F.unfold(token_logits, kernel_size=3, padding=1).reshape(B, -1, 9, H, W) # (B, C, 9, H, W)
Pseudocode in the paper:
# get neighbors for each cell
y = rar(y, "B N K -> B K H W")
nb = im2col(y, kernel_size=3, padding=1)
nb = rar(nb, "B (K n) (H W) -> B H W n K")
My other question is what F.unfold() does in the code. In the paper, you show the proxy head process using pseudocode and say that im2col (i.e. F.unfold()) is used to get the neighbors for each cell. I do not understand this step well either.
Looking forward to your reply!!! Thank you ~~~
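On the second question, here is a minimal sketch of what the neighbor-gathering step does. It is written in plain Python so it runs without PyTorch. The helper `unfold_3x3` is a hypothetical name, not part of the repository, and it only mimics the 3x3, zero-padded, stride-1 case for a single channel. For an input of shape (B, C, H, W), `F.unfold(x, kernel_size=3, padding=1)` returns a tensor of shape (B, C*9, H*W), so the `reshape(B, -1, 9, H, W)` in the official line yields, for every cell, its 9 neighbors per channel.

```python
def unfold_3x3(grid, pad_value=0):
    """Gather the 3x3 neighborhood of every cell of a 2D grid (one channel).

    Hypothetical helper for illustration only. Returns out[k][i][j], where
    k = 3 * di + dj enumerates the kernel offsets in row-major order, so
    k = 4 is the cell itself. Out-of-range neighbors get pad_value, which
    mirrors the zero padding of padding=1.
    """
    H, W = len(grid), len(grid[0])
    out = [[[pad_value] * W for _ in range(H)] for _ in range(9)]
    for di in range(3):
        for dj in range(3):
            k = 3 * di + dj
            for i in range(H):
                for j in range(W):
                    y, x = i + di - 1, j + dj - 1  # neighbor coordinates
                    if 0 <= y < H and 0 <= x < W:
                        out[k][i][j] = grid[y][x]
    return out


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
nb = unfold_3x3(grid)

# The 9 neighbors of the center cell (1, 1): the whole grid in row-major order.
center_neighbors = [nb[k][1][1] for k in range(9)]
# The 9 neighbors of the corner cell (0, 0): padded where the window leaves the grid.
corner_neighbors = [nb[k][0][0] for k in range(9)]
print(center_neighbors)
print(corner_neighbors)
```

Under this reading, the extra dimension of size 9 in (B, C, 9, H, W) simply indexes the 3x3 neighborhood around each grid cell, which seems to be what the paper's "get neighbors for each cell" comment refers to.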