Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full support for dynamic If #1062

Open
2 of 5 tasks
kali opened this issue Apr 25, 2023 · 19 comments
Open
2 of 5 tasks

Full support for dynamic If #1062

kali opened this issue Apr 25, 2023 · 19 comments

Comments

@kali
Copy link
Collaborator

kali commented Apr 25, 2023

ONNX If only works has a loading macro expansion: the condition must be known and fixed at loading time, the right branch is substituted in the type phase. This blocks Silero VAD v4 (see #1029).

  • make analyse work when input is dynamic
  • implement an If operator in core and translate ONNX op into it
  • find out how to encode this in NNEF
  • into_optimized as in Scan
  • declutter as in Scan
@kali
Copy link
Collaborator Author

kali commented Apr 25, 2023

Link #1038, covering two first tasks

@kali kali mentioned this issue Apr 25, 2023
5 tasks
@7r3nzy
Copy link
Contributor

7r3nzy commented May 21, 2023

@kali do you have any plans for implementing task 4 and 5 anytime soon :)? If possible, could you please leave some hints on how it can be implemented, I'd like to give it a try. Thank you.

@kali
Copy link
Collaborator Author

kali commented May 22, 2023

I think it should start with 5. There is a huge amount of code in Scan to declutter it (decluttering is actually implementing the high level optimization in tract terminology). I think most of what we do in Scan can be mapped to ifte but the Scan code is not generic. Plus we have an ongoing epic that aims at "flattening" Scan, inlining the loop body inside the main model to normalize the node connections so a lot of the Scan declutter code will probably go away or be drastically simplified. So if you start working on ifte and perform similar simplifications, we will later on have to go the same path and flattening the ifte too, making a lot of these efforts redundant.

So if you want to work on ifte decluttering, I think you should keep it use-case driven, not implementing the whole thing before we get more clarity about what is happening with Scan. Does it make sense ? If so, we should probably move this to a different issue, and target optimising the silero vad use-case, at least as a starting point. Refresh my memory. Are we in a position where we can look at a model dump (in the command line) enough to identify possible bottlenecks yet ?

@7r3nzy
Copy link
Contributor

7r3nzy commented Jun 4, 2023

@kali Thanks for the response. It does make sense. I think you are referring to this PR #1090 probably?

About the bottlenecks I don't think we got to that point, since running the latest tract on silero-vad v4 on main branch (146b2ac last commit) is throwing the following error:

tract silero-vad/files/silero_vad.onnx
┏ 1 Source sr
┃   ━━━ I64
┣┻ 73 Equals Equal_23
┃   ━━━ Bool
┃┏ 3 Source c
┃┃   ━━━ 2,batch,64,F32
┃┃┏ 2 Source h
┃┃┃   ━━━ 2,batch,64,F32
┃┃┃┏ 0 Source input
┃┃┃┃   ━━━ batch,sequence,F32
┣╋╋┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻ 74 If If_25
      ━━━ batch,1,F32
      ━━━ 2,batch,64,F32
      ━━━ 2,batch,64,F32
[2023-06-03T23:08:18.550720900Z ERROR tract] Error at stage analyse

    Caused by:
        0: Failed analyse for node #74 "If_25" If
        1: Failed analyse for node #74 "If_25" If
        2: Infering facts
        3: Unifying shapes Unsqueezeout_dim_0,1 and batch,1
        4: Impossible to unify Sym(Unsqueezeout_dim_0) with Sym(batch).

@kali
Copy link
Collaborator Author

kali commented Jun 5, 2023

@7r3nzy yes, I'm referring to 1090 (and a couple of them that will follow).

Is it not just again a problem of --onnx-ignore-output-shapes ? Maybe I should actually make ignoring them the default.

@7r3nzy
Copy link
Contributor

7r3nzy commented Jun 5, 2023

@kali Thanks for pointing that out :), It moved to next step I think but I am now getting:

 tract silero-vad/files/silero_vad.onnx --onnx-ignore-output-shapes
[2023-06-05T08:26:31.441282388Z ERROR tract] Error at stage type

    Caused by:
        0: Translating node #74 "If_25" If ToTypedTranslator
        1: Translating node #29 "If_69" If ToTypedTranslator
        2: in output_facts invocation for If
        3: Condition failed: `self.then_body.output_fact(i)?.without_value() == self.else_body.output_fact(i)?.without_value()` (batch,1,sequence+192,F32 vs batch,1,1,sequence+192,F32)

@kali
Copy link
Collaborator Author

kali commented Jun 5, 2023

Ok this looks more serious, you can use "--pass analyse" to see the network before it crashes. it looks like the "then" and "else" branch have a different rank here, with an extra dimension in the else...

@7r3nzy
Copy link
Contributor

7r3nzy commented Jun 5, 2023

It produced the following output without any crash:

tract silero-vad/files/silero_vad.onnx --onnx-ignore-output-shapes --pass analyse
┏ 1 Source sr
┃   ━━━ I64
┣┻ 73 Equals Equal_23
┃   ━━━ Bool
┃┏ 3 Source c
┃┃   ━━━ 2,batch,64,F32
┃┃┏ 2 Source h
┃┃┃   ━━━ 2,batch,64,F32
┃┃┃┏ 0 Source input
┃┃┃┃   ━━━ batch,sequence,F32
┣╋╋┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻┻ 74 If If_25
      ━━━ batch,1,F32
      ━━━ 2,batch,64,F32
      ━━━ 2,batch,64,F32

@kali
Copy link
Collaborator Author

kali commented Jun 5, 2023

Ha. And we don't see the then and else branch, it's not plugged in. Can you link me the onnx file again so I can have a look ?

@7r3nzy
Copy link
Contributor

7r3nzy commented Jun 5, 2023

Yep, here it is https://github.com/snakers4/silero-vad/archive/refs/tags/v4.0.tar.gz

Path inside the archive:
files/silero_vad.onnx

@kali
Copy link
Collaborator Author

kali commented Jun 5, 2023

Please have a look at #1102 . It's... complicated, but I guess the network is. I hope you can figure out where the discrep is.

@kali
Copy link
Collaborator Author

kali commented Jun 5, 2023

The PR just fixes the dump format so that If branch are shown in the graph.

@7r3nzy
Copy link
Contributor

7r3nzy commented Jun 6, 2023

Thank you so much for this, I used netron as well as the new tract patch to analyse the network:

[loop] ┃┃┣┻ 57 If If_69

┃┃┃ [loop|loop] ┏ 2 Source input_data1
┃┃┃ [loop|loop] ┃ ━━━ batch,1,1,sequence+192,F32
┃┃┃ [loop|loop] ┣┻ 1 Squeeze13 Squeeze_71
┃┃┃ [loop|loop] ━━━ batch,1,sequence+192,F32
┃┃┃ [loop|else] ┏ 1 Source input_data1
┃┃┃ [loop|else] ┃ ━━━ batch,1,1,sequence+192,F32
┃┃┃ [loop|else] ┣ 0 Identity Identity_72
┃┃┃ [loop|else] ━━━ batch,1,1,sequence+192,F32

If_25 then If_69 

last node name=176 | else type: float32[Squeeze175_dim_0,Identity176_dim_1,Squeeze175_dim_1,Squeeze175_dim_2]
last node name=175 | then type: float32[Squeeze175_dim_0,Squeeze175_dim_1,Squeeze175_dim_2]

I think the important thing to note from the onnx if operator specs is:

The then_branch and else_branch may produce tensors with the same element type and different shapes.

I am also leaving the onnx if operator specs here for easy access:

If

If conditional
Version

This version of the operator has been available since version 19 of the default ONNX operator set.

Other versions of this operator: 1, 11, 13, 16
Attributes

else_branch : graph (required)
Graph to run if condition is false. Has N outputs: values you wish to be live-out to the enclosing scope. The number of outputs must match the number of outputs in the then_branch.
then_branch : graph (required)
Graph to run if condition is true. Has N outputs: values you wish to be live-out to the enclosing scope. The number of outputs must match the number of outputs in the else_branch.

Inputs

cond : B
Condition for the if

Outputs (1 - ∞)

outputs (variadic, heterogeneous) : V
Values that are live-out to the enclosing scope. The return values in the then_branch and else_branch must be of the same data type. The then_branch and else_branch may produce tensors with the same element type and different shapes. If corresponding outputs from the then-branch and the else-branch have static shapes S1 and S2, then the shape of the corresponding output variable of the if-node (if present) must be compatible with both S1 and S2 as it represents the union of both possible shapes.For example, if in a model file, the first output of then_branch is typed float tensor with shape [2] and the first output of else_branch is another float tensor with shape [3], If's first output should have (a) no shape set, or (b) a shape of rank 1 with neither dim_value nor dim_param set, or (c) a shape of rank 1 with a unique dim_param. In contrast, the first output cannot have the shape [2] since [2] and [3] are not compatible.

@kali
Copy link
Collaborator Author

kali commented Jun 6, 2023

Wow. Different shapes? ONNX spec is borderline crazy here. I can manage to support different shapes of the same rank with symbols, but supporting different ranks is an absolute no for tract.

Can you distill down the use-case for me for different ranks ? I can't even imagine how that could be useful. I would assume it's nearly immediately followed by a Reshape right ? Can we push this Reshape up into the branches ?

@7r3nzy
Copy link
Contributor

7r3nzy commented Jun 6, 2023

It is immediately followed by Conv:

IF_69

Can we push this Reshape up into the branches ?

The problem is, I don't think silero-vad devs have mentioned/published their sources they used to train this model, so even if we change the model I'd be stuck on training it :(

Can you distill down the use-case for me for different ranks ?

My knowledge about the network is very limited, but I'll give it a try and get back to you if I find anything :)

@kali
Copy link
Collaborator Author

kali commented Jun 6, 2023

What kind of shapes is netron reporting ? I am surprised that the convolution can deal with the two axes configuration, as its weights have to be of the same rank than the input. Unless they altering the weight shapes too ?

We need to understand what they are doing, which operator collapse the two axes configuration into one. From there we can try and push back the change to unify the branches configuration inside the If. Do not worry too much yet about the sources and training, we can alter the model onnx file with a python script, probably just inserting a reshape/squeeze/unsqeeze in the right place to make the axes consistent, before handing it to tract.

@kali
Copy link
Collaborator Author

kali commented Jun 6, 2023

It may also be a tract bug while inferring shapes in one of the branches. Let's hope the Silero team can help us understanding what we are looking at.

@7r3nzy
Copy link
Contributor

7r3nzy commented Jun 6, 2023

More details on the conv node:

conv

tract output:
[loop] ┃┃┣┻ 57 If If_69
┃┃┃ [loop|loop] ┏ 2 Source input_data1
┃┃┃ [loop|loop] ┃ ━━━ ..,?
┃┃┃ [loop|loop] ┣┻ 1 Squeeze13 Squeeze_71
┃┃┃ [loop|loop] ━━━ Squeeze175_dim_0,Squeeze175_dim_1,Squeeze175_dim_2,F32
┃┃┃ [loop|else] ┏ 1 Source input_data1
┃┃┃ [loop|else] ┃ ━━━ ..,?
┃┃┃ [loop|else] ┣ 0 Identity Identity_72
┃┃┃ [loop|else] ━━━ Squeeze175_dim_0,Identity176_dim_1,Squeeze175_dim_1,Squeeze175_dim_2,F32
[loop] ┃┃┃┏ 134 Source model.feature_extractor.forward_basis_buffer
[loop] ┃┃┃┃ ━━━ 258,1,256,F32 0, 0.00015059065, 0.0006022719, 0.0013547717, 0.0024076367, 0.0037602326, 0.005411745, 0.0073611788, 0.00960736, 0.012148935, 0.014984373, 0.018111967...
[loop] ┃┃┣┻ 58 ConvHir Conv_73

[loop] ┃┃┣┓
[loop] ┃┃┃┣┻┻┻┻ 61 StridedSlice Slice_78
[loop] ┃┃┃┣┻ 62 Pow Pow_87
[loop] ┃┃┗┓
[loop] ┃┃┃┣┻┻┻┻ 59 StridedSlice Slice_84
[loop] ┃┃┃┣┻ 60 Pow Pow_89
[loop] ┃┃┣┻ 63 Add Add_90

These are just plain outputs without any analysis, I'll get back to you with more details.

@kali
Copy link
Collaborator Author

kali commented Jun 6, 2023

Guess dropping the "sonos" name in the issue at Silero generated some confusion and/or wrong expectations :)
Not sure we're gonna get much help from there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants