This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Will maskrcnn-benchmark support torch.jit.trace or torch.jit.script mode in the near future? #27

Open
xxradon opened this issue Oct 26, 2018 · 19 comments
Labels
enhancement New feature or request

Comments

@xxradon

xxradon commented Oct 26, 2018

❓ Questions and Help

As we know, in PyTorch 1.0, Torch Script is a way to create serializable and optimizable models from PyTorch code. Any code written in Torch Script can be saved from a Python process and loaded in a process that has no Python dependency.
So will maskrcnn-benchmark support torch.jit.trace or torch.jit.script mode in the near future?
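
For readers unfamiliar with tracing, here is a minimal standalone sketch of the save/load round trip (a toy module, nothing from maskrcnn-benchmark):

    import torch

    class Scale(torch.nn.Module):
        def forward(self, x):
            return 2.0 * x + 1.0

    # trace records the operations executed on an example input
    traced = torch.jit.trace(Scale(), torch.ones(3))
    traced.save("scale.pt")              # serialized TorchScript archive
    loaded = torch.jit.load("scale.pt")  # loadable from C++ via torch::jit::load, too
    print(loaded(torch.zeros(3)))        # tensor([1., 1., 1.])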

@fmassa
Contributor

fmassa commented Oct 26, 2018

There is currently some Python functionality in this codebase that is not supported by torch.jit.script, but it will be supported in the future.

Currently, you can trace almost all the model, except the custom C++ layers. Once we add support for those missing C++ layers by registering them as torch ops, I believe tracing should work without issues for same-sized images.

I'll look into registering the C++ layers as torch ops.

@fmassa fmassa added the enhancement New feature or request label Oct 26, 2018
@soumith
Member

soumith commented Oct 26, 2018

@fmassa the C++ layers can be registered with the JIT, see @goldsborough's slides from DevCon

@fmassa
Contributor

fmassa commented Oct 26, 2018

Yes, but I believe that it currently requires some extra code that follows a different codepath.
I'll check with Peter about that.

@xxradon
Author

xxradon commented Oct 27, 2018

@fmassa Thanks for looking into this. If there is a way to trace Mask R-CNN, or any useful information, please let us know.

@fmassa
Contributor

fmassa commented Oct 27, 2018

I'll look into adding tracing support for the custom ops early this week, it should not be hard. I'll update on the issue once it's done

@t-vi

t-vi commented Oct 27, 2018

Awesome!

@hadim
Contributor

hadim commented Nov 1, 2018

I am also interested in this feature!

@Eric-Zhang1990

@fmassa Have you added tracing support for the custom ops? Thanks.

@t-vi

t-vi commented Nov 5, 2018

So I did look at this in some depth and continue to do so; here is a bit of a progress report for discussion. I'm also happy to share a branch with my code, but the code is even more "stream of consciousness" than this write-up.

Goal and plan

My goal is to be able to run detection on single images of a fixed size (known during tracing) in C++, staying as close as possible to the "load traced model in C++" example.

My first step is to get something that

  • has scripted/traced paths for every "processing step", and
  • manages to reproduce the output on the image it was traced on (so I postpone variations that occur during detection for different scores, but I try not to screw things up too much).
  • I took the "do whatever works and clean up later" approach, so it's really, really messy right now.
  • I expect there will be JIT bits to think about on the PyTorch/JIT side, too.

My findings so far

C++ bits / Custom Ops

  • The mere addition of custom ops support (for inference) for the C++ ops seems easy:
    • Change int -> int64_t, float -> double (I'm not 100% certain it's needed),
    • link to libtorch.so and libcaffe2.so (it's probably silly to use the extension mechanism, but that is "clean up later").
    • Add registry in vision.cpp
    #include <torch/script.h>
    ...
    static auto registry =
      torch::jit::RegisterOperators()
        .op("maskrcnn_benchmark::nms", &nms)
        .op("maskrcnn_benchmark::roi_align_forward(Tensor input, Tensor rois, float spatial_scale, int pooled_height, int pooled_width, int sampling_ratio) -> Tensor",
            &ROIAlign_forward);
  • Using them: In layers/nms.py:
    import torch

    nms = torch.ops.maskrcnn_benchmark.nms

Easy. However, I could not trace the resulting nms.
(torch.jit.trace(lambda x, y: maskrcnn_benchmark.layers.nms(x, y, 2), (torch.randn(5, 5), torch.randn(5, 5))) gives an error; it should not.)
This can be worked around (for a fixed threshold, but that's OK, I think) by a double torchscript wrapper:

    @torch.jit.script
    def nms_fixed_thresh1(dets, scores, th: float=coco_demo.model.rpn.box_selector_test.nms_thresh):
        return maskrcnn_benchmark.layers.nms(dets, scores, th)

    @torch.jit.script
    def nms_fixed_thresh(dets, scores):
        return nms_fixed_thresh1(dets, scores)

Now we can trace nms_fixed_thresh in place of the lambda above. @goldsborough will want to know. :)

A similar wrapping trick was needed for roi align forward, I put that in the layer (where all the constants are parameters, so it's natural).

I did change some lists to tuples to make the jit happier.

Custom bookkeeping types (boxlist oh oh)

The JIT isn't very fond of the BoxList things. Where it works, a minimal fix is to "unpack" the parameters of functions, assuming that all Tensors are arguments and all others are constants. That works reasonably well when operating on the same input again in traced mode. It remains to be seen whether we run into generalization problems. To facilitate that, I added two methods to bounding_box:

    # note: _get_tensors/_set_tensors only work if the keys don't change in between!
    def _get_tensors(self):
        return (self.bbox,)+tuple(f for f in (self.get_field(field) for field in sorted(self.fields())) if isinstance(f, torch.Tensor))

    def _set_tensors(self, ts):
        self.bbox = ts[0]
        # only Tensor fields consumed a slot in _get_tensors, so keep a
        # separate running index instead of enumerating over all fields
        i = 1
        for f in sorted(self.fields()):
            if isinstance(self.extra_fields[f], torch.Tensor):
                self.extra_fields[f] = ts[i]
                i += 1

and there is some wrapper code.
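
The pack/unpack round trip can be illustrated with a pure-Python toy (ToyBoxList is a hypothetical stand-in for BoxList, with NumPy arrays in place of Tensors):

    import numpy as np

    class ToyBoxList:
        """Toy stand-in for BoxList: a bbox array plus named extra fields."""
        def __init__(self, bbox, **fields):
            self.bbox = bbox
            self.extra_fields = dict(fields)

        def fields(self):
            return list(self.extra_fields)

        # pack: bbox first, then array-valued fields in sorted key order
        def _get_tensors(self):
            return (self.bbox,) + tuple(
                v for v in (self.extra_fields[k] for k in sorted(self.fields()))
                if isinstance(v, np.ndarray))

        # unpack: only array fields consume a slot, matching _get_tensors
        def _set_tensors(self, ts):
            self.bbox = ts[0]
            i = 1
            for k in sorted(self.fields()):
                if isinstance(self.extra_fields[k], np.ndarray):
                    self.extra_fields[k] = ts[i]
                    i += 1

    b = ToyBoxList(np.zeros((2, 4)), scores=np.ones(2), label="person")
    packed = b._get_tensors()          # "label" is not an array, so it is skipped
    b._set_tensors(tuple(t + 1 for t in packed))
    assert np.allclose(b.bbox, 1) and np.allclose(b.extra_fields["scores"], 2)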

Some things that don't work well with tracing/scripting

The box_coder uses

            pred_boxes = torch.zeros_like(rel_codes)
            # x1
            pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w
            # y1
            pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h
            # x2 (note: "- 1" is correct; don't be fooled by the asymmetry)
            pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w - 1
            # y2 (note: "- 1" is correct; don't be fooled by the asymmetry)
            pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h - 1

The jit doesn't love that, so I used:

            pred_boxes = torch.stack([pred_ctr_x - 0.5 * pred_w,
                                      pred_ctr_y - 0.5 * pred_h,
                                      pred_ctr_x + 0.5 * pred_w - 1,
                                      pred_ctr_y + 0.5 * pred_h - 1], 2).view(*rel_codes.shape)
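
The two formulations interleave the same values; here is a quick NumPy check of the equivalence (the sizes N and C are made up for the demo):

    import numpy as np

    N, C = 3, 2  # boxes and classes per box
    rng = np.random.default_rng(0)
    ctr_x, ctr_y = rng.random((N, C)), rng.random((N, C))
    w, h = rng.random((N, C)), rng.random((N, C))

    # original: strided indexed assignment (what the JIT dislikes)
    boxes_a = np.zeros((N, 4 * C))
    boxes_a[:, 0::4] = ctr_x - 0.5 * w
    boxes_a[:, 1::4] = ctr_y - 0.5 * h
    boxes_a[:, 2::4] = ctr_x + 0.5 * w - 1
    boxes_a[:, 3::4] = ctr_y + 0.5 * h - 1

    # rewrite: stack along a new trailing axis, then flatten back
    boxes_b = np.stack([ctr_x - 0.5 * w,
                        ctr_y - 0.5 * h,
                        ctr_x + 0.5 * w - 1,
                        ctr_y + 0.5 * h - 1], axis=2).reshape(N, 4 * C)

    assert np.allclose(boxes_a, boxes_b)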

Similarly, the pooling over several levels in the roi_heads.box.feature_extractor.pooler forward seems problematic for tracing. As indexed assignment isn't supported in Torch Script, I wrote a new custom op to replace

    for level, (per_level_feature, pooler) in enumerate(zip(x, self.poolers)):
        idx_in_level = torch.nonzero(levels == level).squeeze(1)
        rois_per_level = rois[idx_in_level]
        result[idx_in_level] = pooler(per_level_feature, rois_per_level)
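
For illustration, the same routing pattern can be expressed without indexed assignment by concatenating per-level outputs and undoing the permutation with argsort (a NumPy sketch of the idea, not the custom op actually written; fake_pooler is a hypothetical stand-in):

    import numpy as np

    # hypothetical per-level pooling: here just a tagged copy of the rois
    def fake_pooler(level, rois):
        return rois + level * 100.0

    levels = np.array([0, 1, 0, 2, 1])
    rois = np.arange(5, dtype=np.float64)

    # original pattern: indexed assignment into a preallocated result
    result = np.empty_like(rois)
    for level in range(3):
        idx = np.nonzero(levels == level)[0]
        result[idx] = fake_pooler(level, rois[idx])

    # index-free alternative: concatenate per-level outputs, then undo
    # the level-sorted permutation -- no in-place indexed writes
    order = np.argsort(levels, kind="stable")
    pooled = np.concatenate([fake_pooler(level, rois[levels == level])
                             for level in range(3)])
    result2 = pooled[np.argsort(order)]

    assert np.allclose(result, result2)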

There is still a problem (a riddle?) around script wanting to receive a tensor list as a list of tensors rather than a tuple, while tracing doesn't accept lists; I will have to sort that out. Maybe one could convince the JIT people to allow passing tuples of tensors where the JIT wants lists of tensors.

Mask composition

This uses PIL at the moment; it'll be replaced. @fmassa has a GPU version, but I will do a CPU version and a custom op for it.

Things that work at the moment

  • The backbone,
  • the rpn_head,
  • from what I understand, the generated anchors depend only on the image size and are fixed for our purposes, so I didn't worry about them,
  • the post_processing forward for a single feature map; I think it should not be too hard to make the entire postprocessing work, too.

So I'm now at the roi heads (as you can see above), the box head first.

@fmassa
Contributor

fmassa commented Nov 5, 2018

That's awesome progress @t-vi !

We were aware of the problems that BoxList would bring to the JIT; that's something we discussed with @zdevito and team, and we want to support it in the future (but I think that an approach similar to namedtuples, which we were considering, might not be enough as is).
But can't torch.jit.trace work with the BoxList objects? I thought it would work...

Indexing with the JIT doesn't work very well yet (but we are improving support for it), so the approach you followed for the box_coder seems good to me. I'd hope that we could avoid a custom op for the pooler, but for that we need better support for mutability and indexing in script, which is planned and being worked on, I believe.

I didn't quite understand the problem with tracing the constant parameters, but I suppose this is a bug in upstream PyTorch?

Thanks a lot for all your help!

@t-vi

t-vi commented Nov 5, 2018

So I filed the two JIT observations as issues with PyTorch (see above).

@Zehaos

Zehaos commented Dec 15, 2018

It seems that pytorch/pytorch#13564 has been fixed.

@t-vi

t-vi commented Dec 15, 2018

Yes, and we managed to do tracing in #138. There is a "regression" in 1.0 that invalidates the merge_levels script, so you'd currently need to replace it with a (very straightforward) custom op.

@jordanxlj

jordanxlj commented Dec 25, 2018

@t-vi, I hit a core-dump bug when I executed trace_model.py from your patch. The backtrace is below:
    (gdb) bt
    #0 0x00007f4e28193eeb in mkldnn::impl::scales_t::set(int, int, float const*) ()
       from /home/user/code/maskrcnn-benchmark/maskrcnn_benchmark/libmkldnn.so.0
    #1 0x00007f4e281989e2 in mkldnn_primitive_desc_create_v2 () from /home/user/code/maskrcnn-benchmark/maskrcnn_benchmark/libmkldnn.so.0
    #2 0x00007f4e2ea1a9bc in mkldnn::convolution_forward::primitive_desc::primitive_desc(mkldnn::convolution_forward::desc const&, mkldnn::engine const&) () from /home/user/code/maskrcnn-benchmark/maskrcnn_benchmark/libcaffe2.so
    #3 0x00007f4e2ea16516 in at::native::mkldnn_convolution(at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long) () from /home/user/code/maskrcnn-benchmark/maskrcnn_benchmark/libcaffe2.so
    #4 0x00007f4e2ebb718c in at::TypeDefault::mkldnn_convolution(at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long) const () from /home/user/code/maskrcnn-benchmark/maskrcnn_benchmark/libcaffe2.so
    #5 0x00007f4e2d8c7c55 in torch::autograd::VariableType::mkldnn_convolution(at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long) const ()

@nicolasCruzW21

Hello, any progress on this? I am also very interested.
Thank you!

@tuboxin

tuboxin commented May 3, 2019

Hi, any progress on this? Anyone who managed to do this?

@imranparuk

I'm also interested in knowing about the progress on this.

@ulyssesxxxi

Hi @t-vi @fmassa, I'm interested in using TVM to run one of the maskrcnn-benchmark models (specifically e2e_mask_rcnn_X-152-32x8d-FPN-IN5k_1.44x_caffe2), but during the conversion, it fails on torch.jit.trace() (because of BoxList). Any updates on this? Thx in advance.

@dashesy

dashesy commented Sep 17, 2019

I get a warning that torch::jit::RegisterOperators is deprecated; what is the new way to register a C++ extension for tracing?
