[feature request] ROI Pooling layers #477
Comments
I agree. I've started sketching the structure of it in https://github.com/pytorch/vision/tree/layers?files=1 . |
Any movement on this? |
Hey Wadim, |
Great, will have a look! Thanks :) |
@fmassa any updates on this? I'm sure a lot of people would benefit from having a master branch version of this available soon. |
It would be super convenient to have this installed automatically with torch/torchvision |
Having the master branch have cpu/cuda layers officially requires a few additional changes, like providing wheels with the compiled binaries for each supported architecture, and I'm not looking at this at the moment. |
Just wondering if the ROI pooling/align could theoretically be done in pure Pytorch (even if it will be slow?) |
Was thinking about the same... |
@kevinlu1211 it is possible to implement it using pure PyTorch, and performance is OK. |
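A minimal sketch of what a pure-PyTorch RoI max pooling could look like. It is not the implementation referenced in this thread. The function name `roi_pool_pure` and the box layout `(batch_index, x1, y1, x2, y2)` are assumptions made for illustration, and the Python loop over boxes makes it slow for large numbers of ROIs.

```python
import torch
import torch.nn.functional as F


def roi_pool_pure(features, rois, output_size, spatial_scale=1.0):
    """Naive RoI max pooling built only from core PyTorch ops.

    features: (N, C, H, W) feature map.
    rois: (K, 5) float tensor, rows of (batch_index, x1, y1, x2, y2)
          in input-image coordinates (assumed layout).
    output_size: (out_h, out_w) tuple.
    Returns a (K, C, out_h, out_w) tensor.
    """
    _, channels, height, width = features.shape
    pooled = []
    for roi in rois:
        batch_idx = int(roi[0])
        # Scale the box to feature-map coordinates and round to whole cells.
        x1, y1, x2, y2 = (roi[1:] * spatial_scale).round().long().tolist()
        # Clamp to the feature map and keep every crop at least 1x1.
        x1 = min(max(x1, 0), width - 1)
        y1 = min(max(y1, 0), height - 1)
        x2 = min(max(x2, x1), width - 1)
        y2 = min(max(y2, y1), height - 1)
        crop = features[batch_idx, :, y1:y2 + 1, x1:x2 + 1]
        # adaptive_max_pool2d splits the crop into a fixed grid of bins
        # and takes the maximum inside each bin.
        out = F.adaptive_max_pool2d(crop.unsqueeze(0), output_size)
        pooled.append(out.squeeze(0))
    if not pooled:
        return features.new_zeros((0, channels) + tuple(output_size))
    return torch.stack(pooled)


feats = torch.arange(16.0).reshape(1, 1, 4, 4)
boxes = torch.tensor([[0.0, 0.0, 0.0, 3.0, 3.0]])
print(roi_pool_pure(feats, boxes, (2, 2)))
```

Gradients flow through the slicing and pooling, so this can be used for prototyping, but the results are not guaranteed to match a compiled CPU/CUDA layer bin for bin.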
You are a life saver! I’m just writing a tutorial to explain mask rcnn
thanks a lot!
On Tue, 19 Jun 2018 at 11:44 pm, Francisco Massa wrote:
@kevinlu1211 it is possible to implement it using pure PyTorch, and performance is OK.
An (old, badly tested) implementation can be found in https://github.com/pytorch/examples/pull/21/files#diff-7573d025c4128229f8efa3ff042e09d1R38
|
@fmassa I find it surprising this is not higher priority. The fact that every other major deep learning framework supports ROI Pooling and there is no easy way to write a Pytorch version of Detectron for research purposes despite the deep integration between Pytorch and Caffe2 is bewildering. Is there some other way we can push this forward if you're too busy? I'm sure we can find volunteers to push this out the door as soon as possible. |
Come on @fmassa, make us all happy. If you don't have time, I'd gladly help! |
Yeah, captain @fmassa you have almost an army of volunteers that wait for your orders :) |
@fmassa I guess I've figured out how to get the CppExtension module to work for me and I should be able to finish this feature. I see you have TODOs to pull some common CUDA utilities out into a common file. Any other things you'd like to do before I make a PR? |
Hey guys, sorry for the delay here. So, there are a number of things that should be done in order to be able to put this in torchvision:
I've been making great progress on Detectron, and I've moved all those layers to the detectron repo for the moment. I'm hesitant about putting those layers in torchvision because of the aforementioned difficulties. What do you guys think? |
Is the last issue a constantly persisting one? All the others I do not perceive to be big problems for a WIP branch, really. But it would enable everyone to have a working, if temporary, in-house pytorch solution. |
@fmassa I believe I can take care of everything else except Wheel generation since I'm not familiar with the python packaging pipeline at FB. As @wadimkehl mentioned, is there a checkpoint we can use for CppExtension and ATen? I used the latest master of PyTorch as of yesterday and your branch compiles fine as is. |
@varunagrawal concerning wheels and packaging you can take a look at pytorch/builder @fmassa I think torchvision can have a scope to provide models/datasets/transforms for tasks like
IMHO, ROI Pooling is very specific to an architecture and if torchvision is not intended to merge inside itself the research on faster-rcnn-like nets, this can be avoided. Any thoughts? |
@vfdev-5 your suggestion would turn this into a chicken-and-egg problem, since we need ROI Pooling to implement even a basic RCNN model. Given that Detectron supports Faster RCNN, Caffe2 is now intrinsically linked to Pytorch, and two-stage detectors are still actively studied in research (e.g. Light Head RCNN) and industry, having a ROI pooling/align layer would be beneficial for torchvision overall. While I agree with your categorization for different tasks such as classification, segmentation and detection, doing so would require significant effort which the Pytorch team isn't able to provide given the priority of v1. Indeed, I have already spoken to @soumith about a separate repo for detection and segmentation related tasks and he's shown considerable interest. Until then, and given the large amount of interest in this issue, having the layers here for now would be sufficient. |
Sorry for the delay in replying. @wadimkehl @varunagrawal as of today, my branch doesn't compile anymore on latest PyTorch because of pytorch/pytorch#9435, and patches such as ngimel/pytorch@ae176af should be applied to This has been the case at least 3-4 times for me already, which means that supporting those extension layers officially in torchvision at the moment would be hard to maintain: if the user updates PyTorch, torchvision breaks; if the user updates torchvision but not PyTorch, it also breaks. Both need to be updated at the same time. This was a recurring issue with Lua-Torch, and I'd rather avoid it at the moment. About where to put the aforementioned layers, I'm not yet convinced what the right solution is. |
@fmassa the good news is I have forked your branch and already made all the fixes. As of 07/27/2018, the ROI Pooling layer compiles successfully on my branch and I have also added a whole bunch of tests to check for correctness. I can submit the PR and continue to maintain ROI Pooling (and ROI Align hopefully soon) until we get more stability from ATen and checkpoint at either PyTorch v0.5 or v1. |
Sure, if you send a PR to the |
That works! For now let's point people towards the |
Added support for ROIAlign with #630. |
FYI, we have released our implementation of {Faster, Mask} R-CNN in https://github.com/facebookresearch/maskrcnn-benchmark , which contains the implementations for ROI Pooling and ROI Align. It currently doesn't have all the nice improvements that @varunagrawal has pushed to the I suggest we move this discussion there for now. |
It would be wonderful if the (working) ROI Pooling code in the layers branch could be updated and merged into torchvision. I think I speak for myself and many other vision researchers in that this is an essential functionality, and having it supported in the current torchvision is far less of a hassle than continuing to build this repo from source using an outdated branch. |
@fmassa do you want to reopen this issue until we can get all the related PRs merged? I'll update the original Issue comments with the PR numbers to help keep track. |
@varunagrawal I'm going to be merging the |
@fmassa When will RoI pooling be available on the master branch? |
It already is with 0.3 |
Is anyone working on position sensitive ROI pooling similar to this one: https://github.com/tensorflow/models/blob/f9fe0fe97aee7964ac344ce38bafb20e977586dc/research/object_detection/utils/ops.py#L652? |
@LukasBommes there is an open PR adding it to torchvision, see #1259 |
Hi all, |
@MitraTj just add different RoIPool layers with different output sizes. |
Hi all, I would ask is there any implementation of an average version of ROI Pooling? |
@XuYunqiu there is |
@fmassa Thanks for your quick reply. Actually, I just want to get the mean value of each ROI. |
I've actually been considering adding average pooling as an option to the ROI operations. It's not hard and allows for some nice generalization. |
Exactly, it will be helpful. |
@XuYunqiu yes, that is going to be doing roughly what you are looking for |
But it might not work well for ROIs with a large area. I think the output of bilinear interpolation is only relevant to a quite local context around the sample location (i.e., the center of the ROI in my case). |
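If an exact mean over the whole box is what is needed, it can be computed directly without any interpolation. A minimal sketch in plain PyTorch, assuming integer, inclusive box corners that lie inside the feature map; the function name `roi_mean` and the box layout are hypothetical.

```python
import torch


def roi_mean(features, rois):
    """Exact per-channel mean inside each box.

    features: (N, C, H, W) feature map.
    rois: (K, 5) integer tensor, rows of (batch_index, x1, y1, x2, y2)
          in feature-map pixels with inclusive corners (assumed layout).
    Returns a (K, C) tensor.
    """
    means = []
    for batch_idx, x1, y1, x2, y2 in rois.long().tolist():
        # Slice out the box; the +1 makes the far corner inclusive.
        crop = features[batch_idx, :, y1:y2 + 1, x1:x2 + 1]
        # Average over the two spatial dimensions, keeping channels.
        means.append(crop.mean(dim=(1, 2)))
    return torch.stack(means)


feats = torch.arange(16.0).reshape(1, 1, 4, 4)
boxes = torch.tensor([[0, 0, 0, 3, 3],
                      [0, 0, 0, 1, 1]])
print(roi_mean(feats, boxes))
```

Unlike sampling a few interpolated points, every pixel in the box contributes equally, at the cost of a Python loop over boxes.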
@fmassa Hi, sorry to bother you again. Would you mind telling me which mode (average or max pool) is selected in |
I've gotten my answer from the source code. It seems that only vision/torchvision/csrc/cuda/ROIAlign_cuda.cu Line 108 in 76702a0
I really hope |
As of now, torchvision still only implements the max-pool version of RoI Pooling and the average-pool version of RoI Align. For convenience, I found that mmcv implements both modes for these two ops. |
It would be great to have support for various ROI Pooling operations as easy-to-add layers to facilitate research in object detection and semantic/instance segmentation.
Here is a live checklist:
General PRs: #626