Make leela2onnx Conv-nodes compatible with onnx2pytorch #1924

patrik-ha · 2023-10-10T15:58:49Z

Adds kernel_shape as a property to all generated Conv-nodes for generated ONNX-networks, to partially address #1736 . This was specifically done for the onnx2pytorch-package requiring. The ONNX-standard states that this field is optional, and if not provided it should be inferred just looking at the dimensionality of the weights. These changes make some networks generated by leela2onnx readable by onnx2pytorch.

I verified that it works with the network that comes bundled with the latest lc0-release (791556) at least. However, many of the networks fail since onnx2pytorch hasn't implemented layer-normalisation, (as was the case with many of the networks I tried (mostly from https://lczero.org/play/networks/bestnets/). I also considered making a PR over there implementing that, but it seems pretty dead.

I'm not sure if it is correct that this project should add stuff like this to accomodate such requirements in other projects. It is practical that it works with other software that people are likely to use, especially in this case since it's quite a small change (and follows an optional part of the standard). I'm a big advocate for making the barrier as low as possible for other people to tinker with lc0-networks (and for that it's nice that conversion between pytorch and ONNX works as well as it could), but probably not at the cost of more bloat. I haven't done any large contributions, so that would be up to someone else to judge, but I wouldn't lose any sleep if this wasn't merged.

borg323 · 2023-10-10T18:43:28Z

src/neural/onnx/builder.cc

@@ -135,12 +135,15 @@ std::string PopulateStdNodeFields(pblczero::NodeProto* node,
 std::string OnnxBuilder::Conv(const std::string& name,
                              const std::string& input_name,
                              const OnnxConst& kernel_weights,
-                              const OnnxConst& bias_weights, int pads) {
+                              const OnnxConst& bias_weights, 
+                              int filters,


Why not just do a 'kernel_weights.GetDimensions().back()` instead of passing the number of filters?

Agreed, that's better

borg323 · 2023-10-10T18:43:42Z

You can find an optional LayerNormalization replacement in borg323@bc18bec

patrik-ha · 2023-10-11T07:53:32Z

Thanks for that. I confirmed that all the nets I tested can now be loaded correctly by onnx2pytorch using your alternative LayerNorm-definition.

Is it worth considering making this the default implementation of LayerNorm for the ONNX converter? It introduces more complexity and makes the ONNX graph more complex to look at I guess, but since it's just the same operation defined more explicitly it's nice that it just works without any extra hassle. But that might again come down to how much one wants to accomodate projects like onnx2pytorch. (a somewhat dead project, but still the first thing that most people run into when they want to convert leela-onnx nets to pytorch)

I pushed some changes making that implementation the default for LayerNorm to this branch, but that can be ignored if it shouldn't be used.

borg323 · 2023-10-11T08:25:28Z

The last commit went too far, removing the LayerNormalization() operator unconditionally. I think the best way is to add a --onnx2pytorch switch to leele2onnx to control this. And while you are there, are there any other converter options that could be useful to make available to the user?

This reverts commit 0858bcd.

patrik-ha · 2023-10-12T07:57:01Z

Yeah I understand, I've reverted that and put it behind a switch instead. I've made the switch for the user explicitly for onnx2pytorch, but internally I think it's nicer that that option toggles whatever needs to be different for the converter to make networks that are compatible. (I.e. if the user sets onnx2pytorch, it internally enables alt_ln or whatever relevant behaviour that's needed. The main thing is that there shouldn't be an internal option just for onnx2pytorch/compatibility I mean)

Regarding other relevant things, I'm not completely sure, but I don't think so. I was thinking if it was possible to put behaviour solving #1735 behind a similar switch, but in that case it doesn't look viable anyway. It might be nice to add explicit switches to enable the alternate implementations of mish and layer-norm since they already have options ready to go internally, but I don't know if it's it's that relevant for most people. They could maybe be added as hidden to avoid cluttering --help.

borg323 · 2023-10-12T13:28:17Z

Do you think options to change the onnx opset or the ability to generate a fp16 modes (with either fp16 or fp32 input/outputs) would be useful?
Regarding the alternative implementations we already have, LayerNormalization is a special case since onnxruntime accepts it for every opset even though it was only added in opset 17, so a separate switch makes sense.
This is all unrelated to this PR, just asking here since you are using the onnx models generated by lc0.

patrik-ha · 2023-10-12T15:21:53Z

I think it's already quite easy to convert fp32 leela-models to fp16/mixed precision in general. The few times I have needed to, the converter I used worked for every net I tried at least. (I used https://onnxruntime.ai/docs/performance/model-optimizations/float16.html) It's also supposedly quite good with how it converts certain nodes if they aren't supported in fp16 for example, meaning that the leela-onnx-converter doesn't have to maintain workarounds for it in the meantime. And it's still something that might have to be maintained in the future if it was an explicit feature.

I haven't run into any cases where I've been explicitly limited by the opset of my model (i.e. "I couldn't run my model because my device/software/whatever doesn't support this opset"), but I've just been using them on up-to-date systems where I can install whatever I need, so I'm not a good reference point on that. It also seems like a burden to have to support it. But there may be some use-cases I'm not seeing, and if it's already something that has been committed to being maintained I guess it could be useful to expose that as an option.

So I guess for both of them I would say that they would be nice to have, but that I think that they might not be worth maintaining as explicit features. (solely based on my own usage though)

mooskagh · 2023-10-13T08:16:13Z

src/neural/onnx/converter.h

@@ -44,6 +44,7 @@ struct WeightsToOnnxConverterOptions {
  int batch_size = -1;
  int opset = 17;
  bool alt_mish = false;
+  bool alt_ln = false;


Maybe rename it so something slightly less cryptic? (alternative_layer_normalization or alt_layer_norm or at the very least add a comment). And I'd say the same with the command line option.

This comes from my original code that was only meant for tests...

I agree, fixed (should have considered that after using it)

…onnx-pytorch

…ro#1924) * Accomodating onnx2pytorch for conv-blocks * alternate onnx layernorm implementation with explicit fp32 casting * Switch for supporting onnx2pytorch Co-authored-by: borg323 <[email protected]> (cherry picked from commit ed4e669)

Accomodating onnx2pytorch for conv-blocks

78ca4c9

patrik-ha changed the title ~~Make some leela2onnx-generated networks play nice with onnx2pytorch~~ Make leela2onnx Conv-nodes compatible with onnx2pytorch Oct 10, 2023

borg323 reviewed Oct 10, 2023

View reviewed changes

patrik-ha and others added 3 commits October 11, 2023 08:46

Get kernel dims from weights, not as argument

94c5d7c

alternate onnx layernorm implementation with explicit fp32 casting

8762927

Make alternative LayerNorm default for ONNX

0858bcd

patrik-ha added 2 commits October 12, 2023 08:56

Revert "Make alternative LayerNorm default for ONNX"

4e01eac

This reverts commit 0858bcd.

Switch for supporting onnx2pytorch

5a0c48f

mooskagh reviewed Oct 13, 2023

View reviewed changes

patrik-ha added 3 commits October 13, 2023 10:38

Make option for alt-ln more readable

c73e7fb

Make option for alt-ln more readable

b7e250d

Merge branch 'onnx-pytorch' of https://github.com/patrik-ha/lc0 into …

f7fe6ea

…onnx-pytorch

mooskagh approved these changes Oct 13, 2023

View reviewed changes

borg323 merged commit ed4e669 into LeelaChessZero:master Nov 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make leela2onnx Conv-nodes compatible with onnx2pytorch #1924

Make leela2onnx Conv-nodes compatible with onnx2pytorch #1924

patrik-ha commented Oct 10, 2023

borg323 Oct 10, 2023

patrik-ha Oct 11, 2023

borg323 commented Oct 10, 2023

patrik-ha commented Oct 11, 2023

borg323 commented Oct 11, 2023

patrik-ha commented Oct 12, 2023

borg323 commented Oct 12, 2023

patrik-ha commented Oct 12, 2023

mooskagh Oct 13, 2023

borg323 Oct 13, 2023

patrik-ha Oct 13, 2023 •

edited

Loading

Make leela2onnx Conv-nodes compatible with onnx2pytorch #1924

Make leela2onnx Conv-nodes compatible with onnx2pytorch #1924

Conversation

patrik-ha commented Oct 10, 2023

borg323 Oct 10, 2023

Choose a reason for hiding this comment

patrik-ha Oct 11, 2023

Choose a reason for hiding this comment

borg323 commented Oct 10, 2023

patrik-ha commented Oct 11, 2023

borg323 commented Oct 11, 2023

patrik-ha commented Oct 12, 2023

borg323 commented Oct 12, 2023

patrik-ha commented Oct 12, 2023

mooskagh Oct 13, 2023

Choose a reason for hiding this comment

borg323 Oct 13, 2023

Choose a reason for hiding this comment

patrik-ha Oct 13, 2023 • edited Loading

Choose a reason for hiding this comment

patrik-ha Oct 13, 2023 •

edited

Loading