[RFC] Adding Torch to MHLO conversion #999

ZihengJiang · 2022-06-30T16:49:31Z

Hello everyone, we are from the AML (Applied Machine Learning) team at Bytedance. While the TorchMLIR project provides a set of MLIR dialect for Torch, which bridges PyTorch and MLIR ecosystems, in our development, we also have observed interest in supporting the MHLO dialect as a lowering target, which is still missing in the project.
In the RFC, we propose to add conversation from Torch dialect to MHLO (Meta HLO Dialect). The MHLO dialect can be found here (https://github.com/tensorflow/mlir-hlo) and the corresponding operation definition is here (https://www.tensorflow.org/mlir/hlo_ops). We would add this conversion as an independent pass and this could provide an extra path where torch mlir can also be incorporated into MHLO dialect, and thus enrich the applications of TorchMLIR.

Motivation:

The HLO (High Level Optimizer) IR offers a carefully fixed selected list of operations. MHLO supports a HLO-like compilation pipeline using MLIR and provides a uniform interface to compile and execute these optimized HLO programs with dynamic shape support.
While the MHLO dialect supports Tensorflow models by nature, if we have the conversion pass from Torch dialect to MHLO, we could convert and lower those different frontend models into a unified IR. This would reduce engineering effort and ease the subsequent backend work.

Proposal:

Add mlir-hlo(https://github.com/tensorflow/mlir-hlo) as a dependent submodule.
Add the conversion as passes within the existing TorchMLIR codebase, e.g. lib/Conversion/TorchToMhlo/*
Add the new conversion pipeline into https://github.com/llvm/torch-mlir/blob/main/lib/Dialect/TorchConversion/Transforms/Passes.cpp
Tests go into test/Conversion/TorchToMhlo/*
Make the DecomposeComplexOps to be configurable, since different backend may require different decomposition rules.

Benefits:

Bridging Torch dialect with MHLO dialect.
Enabling user to convert and lower those different frontend models (PyTorch, TensorFlow, etc.) into the unified MHLO IR. This would reduce engineering effort and ease the subsequent backend works.
Provide more accessibility of Torch to other MLIR based projects with this extra path from Torch to MHLO

Status:

We have already implemented the whole lowering pipeline and operators conversion to covert BERT and ResNet inference. See the POC in the PR: #1025 . Welcome everyone interested being part of this effort as well.

Looking forward to any feedback!

cc @silvasean @powderluv @byronyi @Vremold

Update (07/13/2022):

According to the offline meeting between Bytedance and Alibaba. We decide to break the proposal into several PRs:

Integrate MHLO into TorchMLIR Repo
Basic conversion pipeline
Operator conversions
ResNet and BERT examples
End-to-end unit tests

ZihengJiang · 2022-06-30T16:52:43Z

I also learned that @fortianyou is working on the similar thing. Discussion and collaboration are more than welcome.

silvasean · 2022-06-30T18:03:33Z

Thanks. I'm glad to see that multiple parties are interested in this. It sounds like there might be some duplicated effort, so let's try to land this soon so we can collaborate. All the code should be shared, and we will need to reconcile the lowering patterns between Bytedance and Alibaba.

For reference, Alibaba's code is here: https://github.com/alibaba/BladeDISC/blob/main/pytorch_blade/src/torch-mlir/lib/Conversion/TorchToMhlo/TorchToMhlo.cpp

nit: let's call it TorchToMhlo for consistency (instead of TorchToMHLO)

powderluv · 2022-06-30T19:45:02Z

Thank you for this effort.

tanyokwok · 2022-07-01T01:35:03Z

I also learned that @fortianyou is working on the similar thing. Discussion and collaboration are more than welcome.

That's exciting! @ZihengJiang @byronyi

navahgar · 2022-07-06T19:30:11Z

Since this proposal is suggesting MHLO as a unified IR, the ownership of MHLO dialect should be considered as well. Is it going to be MLIR-community owned?

bhack · 2022-07-06T20:15:45Z

Since this proposal is suggesting MHLO as a unified IR, the ownership of MHLO dialect should be considered as well. Is it going to be MLIR-community owned?

There was an interesting related thread at:
https://discourse.llvm.org/t/rfc-bring-mhlo-and-lhlo-dialects-into-mlir-upstream/3186

silvasean · 2022-07-06T20:18:08Z

I've invited the lead of the MHLO efforts to the meeting, who can speak about governance. In general, our bar for this can be different than upstream MLIR.

tanyokwok · 2022-07-07T03:42:02Z

It's nice that @silvasean had created the meeting invitation here: https://discourse.llvm.org/t/asia-friendly-developer-hour-mhlo-support/63625. FYI

tanyokwok · 2022-07-07T05:38:48Z

Since this proposal is suggesting MHLO as a unified IR, the ownership of MHLO dialect should be considered as well. Is it going to be MLIR-community owned?

There was an interesting related thread at: https://discourse.llvm.org/t/rfc-bring-mhlo-and-lhlo-dialects-into-mlir-upstream/3186

I had noticed that the main concern on upstreaming of mlir/hlo is its relation to XLA/HLO and TPU in the community. Indeed BladeDISC didn't run into trouble on this in practice. And we benefit from mlir/hlo a lot.

Also, I agree with @silvasean , that torch-mlir has a different bar than upstream MLIR. Community folks might be interested in building a new backend based on torch-mlir and mlir/hlo. I think this is in line with the vision of the torch-mlir community.

stellaraccident · 2022-07-07T06:53:00Z

There are various plans at Google to support these cases better, and Eugene can speak to some of that at tomorrow's meeting -- and get feedback from you on what would be useful.

When this came up in the past, the issue really hinged on development model disconnects and a lack of staffing to better support it in a broader construct -- no one was really ready to support it outside of the limited cases it was serving. There have been changes on both of those fronts, so hopefully this next conversation can be more productive than the previous ones.

(Disclosure: Sean and I work at Google but on a different team than MHLO -- we both have visibility into the plans and are trying to get the folks who own this part engaged here to work on it together. Hopefully that is what tomorrow's meeting will accomplish. This proposal kind of came unexpectedly, and we weren't quite ready to talk about everything yet but will do our best to be helpful)

ZihengJiang · 2022-07-07T20:39:30Z

Thanks for all the discussion. Looking forward to today's meeting.

FYI, here is the POC(#1025) for this RFC, in which we have implemented BERT and ResNet conversion from Torch to MHLO. We are going to refine it in next days.

cc @fortianyou @silvasean @powderluv @stellaraccident @navahgar @bhack

silvasean · 2022-07-07T20:59:08Z

Thanks @ZihengJiang for posting the PR! One thing I noticed was that Alibaba has dynamic shapes as a requirement, and some of the patterns that I see in the PR you linked are tied to static shapes. So we will want to use the Alibaba patterns when they are available. For example:

https://github.com/alibaba/BladeDISC/blob/9043010a4e33f2b809a6db9933398877e39a5660/pytorch_blade/src/torch-mlir/lib/Conversion/TorchToMhlo/MhloLegalizeUtils.cpp#L220

ZihengJiang · 2022-07-07T21:54:33Z

Hi @silvasean

Thanks for the suggestion! Dynamic shape support is a good point. We are happy to update the PR to support this or collaborate with @fortianyou to distribute those pattern tasks, if Alibaba plans to contribute their code. We can discuss those options in the meeting.

bhack · 2022-07-08T10:33:20Z

Any meeting log or summary on this topic?

silvasean · 2022-07-08T20:09:13Z

The meeting notes are in: https://discourse.llvm.org/t/asia-friendly-developer-hour-mhlo-support/63625/3

ZihengJiang · 2022-07-13T18:37:08Z

According to the offline meeting between Bytedance and Alibaba. We decide to break the proposal into several PRs:

Integrate MHLO into TorchMLIR Repo
Basic conversion pipeline
Operator conversions
ResNet and BERT examples

Looking forward to working together and landing this. cc @fortianyou @silvasean

silvasean · 2022-07-13T18:52:17Z

I would like the initial PR to add support for the e2e test suite too, even if only one test (like tanh) passes. That way, we can see a minimal working system in a single PR to make it easier to review and make sure that all the pieces are present.

See RFC: llvm#999 Co-authored-by: Bairen Yi [email protected] Co-authored-by: Jiawei Wu [email protected] Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan [email protected] Co-authored-by: Ziheng Jiang [email protected]

See RFC: #999 Co-authored-by: Bairen Yi [email protected] Co-authored-by: Jiawei Wu [email protected] Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan [email protected] Co-authored-by: Ziheng Jiang [email protected]

See RFC: llvm#999 Co-authored-by: Bairen Yi [email protected] Co-authored-by: Jiawei Wu [email protected] Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan [email protected] Co-authored-by: Ziheng Jiang [email protected]

See RFC llvm#999 Co-authored-by: Bairen Yi [email protected] Co-authored-by: Jiawei Wu [email protected] Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan [email protected] Co-authored-by: Ziheng Jiang [email protected]

See RFC #999 Co-authored-by: Bairen Yi [email protected] Co-authored-by: Jiawei Wu [email protected] Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan [email protected] Co-authored-by: Ziheng Jiang [email protected]

See RFC llvm#999 Co-authored-by: Bairen Yi [email protected] Co-authored-by: Jiawei Wu [email protected] Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan [email protected] Co-authored-by: Ziheng Jiang [email protected]

See RFC #999 Co-authored-by: Bairen Yi [email protected] Co-authored-by: Jiawei Wu [email protected] Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan [email protected] Co-authored-by: Ziheng Jiang [email protected]

* [MHLO] Init MHLO pooling-like op conversion and remove 'op' suffix in filenames Co-authored-by: Bairen Yi <[email protected]> Co-authored-by: Jiawei Wu <[email protected]> Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan <[email protected]> Co-authored-by: Ziheng Jiang <[email protected]> See RFC #999

* [MHLO] fix tensor mode aten.div op pattern See RFC #999 Co-authored-by: Bairen Yi <[email protected]> Co-authored-by: Jiawei Wu <[email protected]> Co-authored-by: Tianyou Guo <[email protected]> Co-authored-by: Xu Yan <[email protected]> Co-authored-by: Ziheng Jiang <[email protected]>

See RFC #999 Co-authored-by: Bairen Yi [email protected] Co-authored-by: Jiawei Wu [email protected] Co-authored-by: Tianyou Guo [email protected] Co-authored-by: Xu Yan [email protected] Co-authored-by: Ziheng Jiang [email protected]

ZihengJiang · 2022-08-23T23:51:11Z

With the ResNet and BERT examples (from Torch to MHLO) has been merged into the main branch, we could say that we have completed the promise we made in the RFC. Thanks to everyone for making this happen! @fortianyou @Vremold @Yancey1989 @silvasean @powderluv

silvasean · 2022-08-23T23:55:54Z

Amazing work folks!! I can't believe you did all this in < 2 months!

tanyokwok · 2022-08-24T00:38:56Z

@ZihengJiang Congrats!!

yaochengji mentioned this issue Jul 7, 2022

[RFC] Add Legalization from ONNX-MLIR to MHLO onnx/onnx-mlir#1514

Open

ZihengJiang mentioned this issue Jul 7, 2022

[MHLO] Init Torch to MHLO conversion. #1025

Closed

4 tasks

ZihengJiang mentioned this issue Jul 19, 2022

[MHLO] Init MHLO integration. #1083

Merged

tanyokwok mentioned this issue Jul 21, 2022

[MHLO] Init MHLO view like op patterns #1090

Merged

tanyokwok mentioned this issue Jul 21, 2022

[MHLO] Init MHLO slice like op patterns #1091

Merged

Vremold mentioned this issue Jul 21, 2022

[MHLO] Init MHLO basic op conversion #1092

Merged

tanyokwok mentioned this issue Aug 2, 2022

[MHLO] Init MHLO linear op patterns #1132

Merged

Vremold mentioned this issue Aug 2, 2022

[MHLO] Init MHLO reduce-like op conversion #1133

Merged

tanyokwok mentioned this issue Aug 2, 2022

[MHLO] support non-constant torch scalar in BasicOps #1134

Merged

Vremold mentioned this issue Aug 3, 2022

[MHLO] Init MHLO pooling-like op conversion #1141

Merged

Vremold mentioned this issue Aug 4, 2022

[MHLO] Add convolution op pattern #1152

Merged

tanyokwok mentioned this issue Aug 5, 2022

[MHLO] fix tensor mode aten.div op pattern #1160

Merged

This was referenced Aug 8, 2022

[MHLO] Add transposed convolution conversion pattern #1171

Merged

[MHLO] Add configurable op decomposition rules #1172

Closed

[MHLO] Add AtenCatOp conversion pattern to MHLO #1208

Merged

Yancey1989 mentioned this issue Aug 11, 2022

[MHLO]add native_dropout and related ops pattern #1211

Merged

tanyokwok mentioned this issue Aug 14, 2022

[MHLO] Init end-to-end unit tests #1223

Merged

This was referenced Aug 16, 2022

[MHLO] Add AtenRSubScalarOp conversion pattern to MHLO #1233

Merged

[MHLO] bert-tiny and resnet18 e2e example from torchscript to mhlo #1266

Merged

ZihengJiang closed this as completed Aug 23, 2022

silvasean mentioned this issue Jan 3, 2023

[RFC] Bootstrapping the TCP Dialect in Torch-MLIR #1366

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Adding Torch to MHLO conversion #999

[RFC] Adding Torch to MHLO conversion #999

ZihengJiang commented Jun 30, 2022 •

edited

Loading

ZihengJiang commented Jun 30, 2022

silvasean commented Jun 30, 2022

powderluv commented Jun 30, 2022

tanyokwok commented Jul 1, 2022

navahgar commented Jul 6, 2022

bhack commented Jul 6, 2022 •

edited

Loading

silvasean commented Jul 6, 2022

tanyokwok commented Jul 7, 2022

tanyokwok commented Jul 7, 2022 •

edited

Loading

stellaraccident commented Jul 7, 2022

ZihengJiang commented Jul 7, 2022

silvasean commented Jul 7, 2022

ZihengJiang commented Jul 7, 2022 •

edited

Loading

bhack commented Jul 8, 2022 •

edited

Loading

silvasean commented Jul 8, 2022

ZihengJiang commented Jul 13, 2022

silvasean commented Jul 13, 2022

ZihengJiang commented Aug 23, 2022

silvasean commented Aug 23, 2022

tanyokwok commented Aug 24, 2022

[RFC] Adding Torch to MHLO conversion #999

[RFC] Adding Torch to MHLO conversion #999

Comments

ZihengJiang commented Jun 30, 2022 • edited Loading

Motivation:

Proposal:

Benefits:

Status:

Update (07/13/2022):

ZihengJiang commented Jun 30, 2022

silvasean commented Jun 30, 2022

powderluv commented Jun 30, 2022

tanyokwok commented Jul 1, 2022

navahgar commented Jul 6, 2022

bhack commented Jul 6, 2022 • edited Loading

silvasean commented Jul 6, 2022

tanyokwok commented Jul 7, 2022

tanyokwok commented Jul 7, 2022 • edited Loading

stellaraccident commented Jul 7, 2022

ZihengJiang commented Jul 7, 2022

silvasean commented Jul 7, 2022

ZihengJiang commented Jul 7, 2022 • edited Loading

bhack commented Jul 8, 2022 • edited Loading

silvasean commented Jul 8, 2022

ZihengJiang commented Jul 13, 2022

silvasean commented Jul 13, 2022

ZihengJiang commented Aug 23, 2022

silvasean commented Aug 23, 2022

tanyokwok commented Aug 24, 2022

ZihengJiang commented Jun 30, 2022 •

edited

Loading

bhack commented Jul 6, 2022 •

edited

Loading

tanyokwok commented Jul 7, 2022 •

edited

Loading

ZihengJiang commented Jul 7, 2022 •

edited

Loading

bhack commented Jul 8, 2022 •

edited

Loading