Using GPU Gaussian blur at DarkPose unbiased decoding & megvii #332

Closed
HoBeom opened this issue Dec 4, 2020 · 5 comments

@HoBeom
Contributor

HoBeom commented Dec 4, 2020

def _gaussian_blur(heatmaps, kernel=11):

This can be processed on the GPU using a PyTorch module:
https://discuss.pytorch.org/t/gaussian-kernel-layer/37619

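import numpy as np
import scipy.ndimage
import torch
import torch.nn as nn


class GaussianLayer(nn.Module):
    """Blur a 3-channel image with a fixed 21x21 Gaussian kernel (sigma=3)."""

    def __init__(self):
        super(GaussianLayer, self).__init__()
        self.seq = nn.Sequential(
            # Reflection padding of 10 keeps the output the same size as the input.
            nn.ReflectionPad2d(10),
            # groups=3 makes this a depthwise conv: each channel is blurred on its own.
            nn.Conv2d(3, 3, 21, stride=1, padding=0, bias=False, groups=3)
        )
        self.weights_init()

    def forward(self, x):
        return self.seq(x)

    def weights_init(self):
        # Filtering a unit impulse yields the Gaussian kernel itself.
        n = np.zeros((21, 21))
        n[10, 10] = 1
        k = scipy.ndimage.gaussian_filter(n, sigma=3)
        for name, f in self.named_parameters():
            # The (21, 21) kernel broadcasts to the (3, 1, 21, 21) conv weight.
            f.data.copy_(torch.from_numpy(k))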

Or this one:
https://www.programmersought.com/article/17644345494/

import torch
import torch.nn as nn
import torch.nn.functional as F


class GaussianBlurConv(nn.Module):
    def __init__(self, channels=3):
        super(GaussianBlurConv, self).__init__()
        self.channels = channels
        kernel = [[0.00078633, 0.00655965, 0.01330373, 0.00655965, 0.00078633],
                  [0.00655965, 0.05472157, 0.11098164, 0.05472157, 0.00655965],
                  [0.01330373, 0.11098164, 0.22508352, 0.11098164, 0.01330373],
                  [0.00655965, 0.05472157, 0.11098164, 0.05472157, 0.00655965],
                  [0.00078633, 0.00655965, 0.01330373, 0.00655965, 0.00078633]]
        # Shape (1, 1, 5, 5), repeated to (channels, 1, 5, 5) for a depthwise conv.
        kernel = torch.FloatTensor(kernel).unsqueeze(0).unsqueeze(0)
        kernel = kernel.repeat(self.channels, 1, 1, 1)
        self.weight = nn.Parameter(data=kernel, requires_grad=False)

    def forward(self, x):
        # x is a single (C, H, W) image; the result has shape (1, C, H, W).
        x = F.conv2d(x.unsqueeze(0), self.weight, padding=2, groups=self.channels)
        return x

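
For heatmaps specifically, the same idea can be sketched as one batched depthwise convolution over an (N, K, H, W) tensor. This is only a rough sketch, not the mmpose implementation: the names `gaussian_kernel_2d` and `gaussian_blur_torch` are made up for illustration, the sigma is derived from the kernel size with the rule OpenCV documents for `sigma=0`, borders are zero-padded, and each heatmap is rescaled to keep its original maximum. All of those choices are assumptions about what the CPU version does.

```python
import torch
import torch.nn.functional as F


def gaussian_kernel_2d(kernel: int, sigma: float) -> torch.Tensor:
    """Return a normalized (kernel, kernel) Gaussian kernel (hypothetical helper)."""
    coords = torch.arange(kernel, dtype=torch.float32) - (kernel - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    # The 2D Gaussian is the outer product of two 1D Gaussians.
    return torch.outer(g, g)


def gaussian_blur_torch(heatmaps: torch.Tensor, kernel: int = 11) -> torch.Tensor:
    """Blur (N, K, H, W) heatmaps, keeping each heatmap's original maximum."""
    assert kernel % 2 == 1, "kernel size must be odd"
    num_joints = heatmaps.shape[1]
    # Assumption: same sigma rule OpenCV applies when sigma is given as 0.
    sigma = 0.3 * ((kernel - 1) * 0.5 - 1) + 0.8
    # Match the dtype and device of the input (CPU or GPU).
    weight = gaussian_kernel_2d(kernel, sigma).to(heatmaps)
    # One copy of the kernel per joint: weight shape (K, 1, kernel, kernel).
    weight = weight.expand(num_joints, 1, kernel, kernel).contiguous()
    origin_max = heatmaps.amax(dim=(2, 3), keepdim=True)
    # groups=K blurs every joint channel independently; zero padding keeps H and W.
    blurred = F.conv2d(heatmaps, weight, padding=kernel // 2, groups=num_joints)
    new_max = blurred.amax(dim=(2, 3), keepdim=True).clamp_min(1e-12)
    return blurred * origin_max / new_max
```

Calling it as `gaussian_blur_torch(heatmaps.cuda())` would run the whole batch on the GPU in a single call instead of looping over every sample and joint.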
@innerlee
Contributor

innerlee commented Dec 4, 2020

Yeah, many CPU ops can be moved to the GPU.
Ideally we should profile the pipeline and find all the bottlenecks. If they happen to be CPU ops, we can re-implement them on the GPU.

For profiling tools, see the last comment in #73.

@HoBeom
Contributor Author

HoBeom commented Dec 4, 2020

Thanks. I'll go over it using cProfile (#73),
but I'll need until next week.
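
A minimal sketch of what a cProfile pass over a post-processing step could look like. The helper `profile_call` and the stand-in function `slow_postprocess` are hypothetical and only illustrate the standard-library calls; they are not taken from #73 or from mmpose.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return its result and a text report (hypothetical helper)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args, **kwargs)
    profiler.disable()
    stream = io.StringIO()
    # Sort by cumulative time so the most expensive call chains come first.
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def slow_postprocess(n):
    """Stand-in for a CPU-bound decoding step (assumption, for illustration only)."""
    return sum(i * i for i in range(n))


result, report = profile_call(slow_postprocess, 100_000)
print(report)
```

Note that cProfile only measures time spent on the Python side. CUDA kernels launch asynchronously, so when timing GPU code a `torch.cuda.synchronize()` before reading the clock is usually needed for the numbers to be meaningful.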

@HoBeom
Contributor Author

HoBeom commented Dec 21, 2020

before gaussian blur
image
after using torch gpu
image

#378
Sorry for the accidental pull request (misclick 😄)

@HoBeom
Contributor Author

HoBeom commented Feb 15, 2021

It doesn't seem necessary. The performance improvement is very small, and it requires a lot of modifications.
Thank you for the comments. 👍 @innerlee

@HoBeom HoBeom closed this as completed Feb 15, 2021
rollingman1 pushed a commit to rollingman1/mmpose that referenced this issue Nov 5, 2021
@ykk648

ykk648 commented Aug 11, 2022

@HoBeom Hey, your GPU GaussianBlur version may not have performed well enough in your test, but for datasets like cocowholebody there are 133 keypoints that need GaussianBlur to be recovered from the heatmaps. I found your code helpful; it made a big model like HRNet48 dark reach real-time speed in batch mode. Thanks a lot!

HAOCHENYE pushed a commit to HAOCHENYE/mmpose that referenced this issue Jun 27, 2023