How does h,w in the paper and F.unfold()function in the code work? #11

Open
stte0v0 opened this issue Nov 16, 2022 · 0 comments
stte0v0 commented Nov 16, 2022

1. About h, w
The sentence "(Hh) × (Ww) matches the size of the output segmentation map and (h, w) is the relative stride of the initial token grid" in the paper indicates that h and w are the downsampling strides of the segmentation map. But when reading the code, I am confused about how this works: by rearranging token_logits and applying a matrix multiplication, we get the final segmentation map, which is as large as the input image. So why do you set the extra parameters h and w, and how do h and w relate to the stride?

2. About F.unfold()
Official implementation code:
token_logits = F.unfold(token_logits, kernel_size=3, padding=1).reshape(B, -1, 9, H, W) # (B, C, 9, H, W)
Pseudocode in the paper:
# get neighbors for each cell
y = rar(y, "B N K -> B K H W")
nb = im2col(y, kernel_size=3, padding=1)
nb = rar(nb, "B (K n) (H W) -> B H W n K")
My other question is what F.unfold() does in the code. In the paper, you show the proxy head process using pseudocode and say that im2col (i.e. F.unfold()) is used to get the neighbors of each cell. I do not understand this well either.
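To make the question concrete, here is a minimal plain-Python sketch of what I understand im2col with kernel_size=3 and padding=1 to compute for a single channel: for every cell it collects the zero-padded 3×3 window centered on that cell. The helper name `neighbors_3x3` and the tiny 2×3 input are my own illustration, not code from the repository. As far as I know, F.unfold(x, kernel_size=3, padding=1) on a (B, C, H, W) tensor returns a (B, C*9, H*W) tensor with the same window ordering, which is why it can be reshaped to (B, C, 9, H, W).

```python
def neighbors_3x3(grid, pad_value=0.0):
    """Collect the zero-padded 3x3 neighborhood of every cell in a 2D grid.

    Returns a nested list of shape 9 x H x W. The index k = 3 * di + dj walks
    the window row by row, so k = 4 is the cell itself.
    (Hypothetical helper for illustration only.)
    """
    H, W = len(grid), len(grid[0])
    out = [[[pad_value] * W for _ in range(H)] for _ in range(9)]
    for di in range(3):
        for dj in range(3):
            k = 3 * di + dj
            for i in range(H):
                for j in range(W):
                    # padding=1 centers the 3x3 window on cell (i, j)
                    si, sj = i + di - 1, j + dj - 1
                    if 0 <= si < H and 0 <= sj < W:
                        out[k][i][j] = grid[si][sj]
    return out


# One channel of a tiny 2x3 map standing in for token_logits.
logits = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
nb = neighbors_3x3(logits)

# Neighborhood of the top-left cell (0, 0), read in window order.
top_left = [nb[k][0][0] for k in range(9)]
print(top_left)  # [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.0, 4.0, 5.0]
```

If this reading is right, slice k = 4 is just the original map, and the other eight slices are shifted copies of it, with zeros where the window falls outside the grid.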

Looking forward to your reply!!! Thank you ~~~

@stte0v0 stte0v0 changed the title What What does Nov 16, 2022
@stte0v0 stte0v0 changed the title What does How does h,w in the paper and F.unfold()function in the code work? Nov 16, 2022