PyTorch ONNX export #1333
The initial version is done already (…).
One aspect is that convolution with …
As a form of summary: the tool is mainly done, but there are some operators that aren't yet supported, one of which is the convolutional layer padding when specified as a string (…). I found that it's not actually … I've been investigating this issue and I'm currently trying to fix it. There are some people online having the same issue, and they also provide implementations of custom padding, so it shouldn't be very hard to fix. I'll try to push a version with my changes before the weekend.
Yes it is, but it seems that internally any padding given as a string triggers some parameter called … Anyway, … Also, padding is given as a tuple of at most 3 dimensions.
Yes, my question was just to confirm this. So we can just always do this then, no matter if ONNX or so. Just:

```python
if padding == "valid":
    padding = 0
```

We already have the code for adding the padding manually. See the code. It basically does:

```python
if padding == "same" and <manual_padding_required>:
    ...  # add padding to input
    padding = "valid"
```

Nothing else needs to be done, or not?
Btw, next step after the current demo works: Test our RF Conformer implementation.
Actually, I just wonder now: Instead of manually adding the padding to the input, could we maybe just set …?
I was able to insert …
Internally, the … does:

```python
if self.padding_mode != 'zeros':
    return F.conv1d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
                    weight, bias, self.stride,
                    _single(0), self.dilation, self.groups)
```

where … Or maybe we could cache it somehow, or add an additional layer attribute …
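As a sketch of the caching idea discussed here (hypothetical module, not PyTorch's or RETURNN's actual implementation): compute the pad amounts once in `__init__`, store them as an attribute, and only apply `F.pad` in `forward`:

```python
import torch
import torch.nn.functional as F

class CachedSamePadConv1d(torch.nn.Module):
    """Conv1d with "same" behavior where the pad amounts are precomputed once."""

    def __init__(self, in_channels, out_channels, kernel_size, dilation=1):
        super().__init__()
        total_pad = dilation * (kernel_size - 1)
        self._pad = (total_pad // 2, total_pad - total_pad // 2)  # cached, not recomputed per step
        self.conv = torch.nn.Conv1d(in_channels, out_channels, kernel_size,
                                    padding=0, dilation=dilation)

    def forward(self, x):  # x: [batch, in_channels, time]
        return self.conv(F.pad(x, self._pad))
```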
Change …
This is only for the case …
Ah you're right, I confused …
I'm not exactly sure what you mean. The padding numbers don't need to be calculated at all.
Oh I see, yes, I neglected the effect of the dynamic dimensions on convolution. I was thinking more of image processing, where the input is usually the same size. Never mind.
I've pushed some commits that should fix some of the issues with the exporting tool. Hopefully I explained everything nicely. With the current state of the tool (4785332), I can properly run both configs through the ONNX exporting tool. However, the graphs differ greatly. The number of operations in the RF graph is much bigger than in the PT graph. At first sight, the PT graph looks much cleaner than the RF graph. I haven't studied the differences yet; I'll look at it on Monday. In the meantime, you can find both graphs below.
That's very interesting. Thanks for this. We should study this in detail. I already pushed a few things which should reduce this a bit. Some comments from a quick look: …
I tried exporting the RF demo model again with the current changes, and the graph is much smaller now! Basically almost equal to the PyTorch graph: onnx_rf_graph_new.txt 🥳
The actual computation is not just almost identical, but exactly identical, as far as I can see. In addition, it defines the output sizes, which are just the same as the input sizes because of padding "same", so it's just an identity function here. I think when there are new calculations of dims, this will still be unnecessarily complex. I think we can improve this more, but I'm not sure how important this is.
I think I'll reopen this to keep track of the remaining issues. One remaining issue now is that we do not really check …
Edit: you beat me to it 🙂 There seems to be one last thing to figure out before the tool is fully operational. I get an error while running the PyTorch demo:
This is the full stack trace. I think this is happening because somehow the RF model is outputting the output sizes as well, while the PT model isn't. PT and RF don't exactly behave the same when calling … Since in the PT demo we don't have access to batch/time dims, the output dimensions are instantiated to actual values. However, this isn't the case in the RF demo, where we have access to the shape of the original tensor as … I introduced a workaround in 4785332, which is why I was able to post the PT graph above, but it was reverted in bb40b4c. Is there any other way to fix this?
I would maybe extend …
And if they don't match, which is what we're seeing here, what do we do? We've detected the error, but should we fix it, or do we leave it to the user? If we fix it, do we give priority to the expected outputs, or to the actual outputs? I would argue that the actual outputs matter more, but it's an "unexpected" situation in which maybe an automatic fix isn't the most complete thing to do...
Now, we are not seeing this here. If we do that as I described, there would also not be an error, as they would actually match then. But if they do not match, we would just throw an error. What we are seeing is an error from ONNX, which is not what I mean.
It's just an error if the outputs do not match. We don't need to handle this further.
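As a small sketch of what such a check could look like (hypothetical helper, not the actual RETURNN code): compare the declared `model_outputs` template against the actual output and just raise on mismatch.

```python
from typing import Optional, Sequence, Tuple

def check_output_matches_template(name: str,
                                  declared_dims: Sequence[Optional[int]],
                                  actual_shape: Tuple[int, ...]):
    """Raise an error if the actual output shape contradicts the declared template.

    declared_dims: static dim sizes, or None for a dynamic dim (anything is allowed there).
    """
    if len(declared_dims) != len(actual_shape):
        raise ValueError(f"output {name!r}: declared {len(declared_dims)} dims, "
                         f"but got shape {actual_shape}")
    for axis, (dim, size) in enumerate(zip(declared_dims, actual_shape)):
        if dim is not None and dim != size:
            raise ValueError(f"output {name!r}, axis {axis}: declared static dim {dim}, got {size}")
```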
Regarding getting rid of the `TracerWarning`s, we could do something like this:

```python
with warnings.catch_warnings():
    # PyTorch JIT tracing will produce this warning:
    # TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect.
    # We can't record the data flow of Python values,
    # so this value will be treated as a constant in the future.
    # This means that the trace might not generalize to other inputs!
    # This is fine for us, as we do this check only once, and then not at later runtime anymore.
    warnings.filterwarnings("ignore", message="")
    # <now the original code which produced the warning>
```

However, I'm not sure if this is the best way. Also, this should be encapsulated somehow in a nicer way, as the triggering code is actually not PyTorch specific. Or we just ignore it for now.
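A sketch of how this could be encapsulated in a backend-agnostic way (hypothetical helper name; the actual RETURNN code may do it differently):

```python
import contextlib
import warnings

@contextlib.contextmanager
def ignore_tracer_warnings():
    """Suppress the expected tracer warnings around one-time sanity checks.

    The calling code doesn't need to know anything PyTorch-specific;
    if PyTorch isn't available, nothing is filtered.
    """
    with warnings.catch_warnings():
        try:
            from torch.jit import TracerWarning
            warnings.simplefilter("ignore", category=TracerWarning)
        except ImportError:
            pass
        yield

# Usage:
# with ignore_tracer_warnings():
#     ...  # the one-time sanity check which triggers the warning
```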
I understand now. This setup fails with the expected warning that you mention.
As a completely different note: a constant value in the constructor of …
This is the warning I'm worried about with the current code that you fixed. I understand that my previous implementation was buggy for the reasons you mentioned, but with the creation of a tensor we're adding another two warnings. I'm worried by the fact that, since the call to …
It doesn't seem like a constant to me, but maybe we could simply get rid of the warning by casting with …
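To illustrate the point about casting (my reading of the situation, not necessarily the exact code in question):

```python
import torch

def to_int64(seq_lens: torch.Tensor) -> torch.Tensor:
    # torch.tensor(seq_lens, dtype=torch.int64) would copy-construct a new tensor,
    # which warns ("To copy construct from a tensor, use .clone().detach() ...")
    # and, under JIT tracing, bakes the value in as a constant.
    # A plain cast stays part of the traced graph and produces neither warning:
    return seq_lens.to(torch.int64)  # equivalently: seq_lens.long()
```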
So this looks like a bug in the ONNX export of PyTorch, right? Because running the PyTorch code as-is works completely fine. Can you make a bug report and link it here?
Yes, I changed the code again. I thought that …
You mean that it's a bug because the operation is converted to the same … Edit: although I believe our original usage of …
Indeed, there's no such warning 👍 The only warnings are the following ones, which we already knew about: …
I was able to create a better minimal example detailing each of the three cases: …
Each of the three gives different warnings, but as I briefly stated above, converting either way yields the same graph. Not converting doesn't yield the same graph as the other two, but I think the only difference is the lack of …
No. It's a bug simply because the PyTorch code works without ONNX, but then fails in the ONNX export. No matter the details, that can already be considered a bug (or maybe some missing functionality). You said that your example code (this) shows this.

You should not mix different aspects in the bug report, like converting int32 to int64 or so. Keep separate issues as separate issues.

In the issue, the example code is one of the most important aspects. You should show the code directly, in addition to linking the Gist.

In the issue, you formatted the error in a way that you cannot easily read it. I think somehow you removed the newlines from it? It would be helpful to add back the line breaks so that it can be read more easily.

In the issue, you now made the example code more complex than it needs to be. But such example code should always be as minimal as possible. Why do you need those …?

What is also missing in the issue is the error you get when you try to actually run it with ONNX.
I will reformat the issue as discussed. Edit: done, I think the issue is clearer now, thanks for the feedback.
Thanks. But as you wrote it, it is still a bit unclear to me. Are there two types of bugs, for option 1 and option 2? Then that should be two separate bug reports. Or are they the same bug? Or is there actually no bug with option 1? Then why is option 1 relevant for the bug report?
So, this is option 1? So with option 1, all works as intended? Then you should not even put option 1 into the code. The code should demonstrate the bug. I think this is confusing. You can mention in a remark that it works correctly when you call …

Btw, you forgot to enable Python syntax highlighting for the code. It's hard to read. I cannot easily see what part is the commented-out code. But anyway, you'd better not post any code that contains commented-out code. Btw, …
I've formatted the issue as specified. Moving on to another topic, the inference seems to be working, or at least it's producing some reasonable output from the random input I gave it. However, I get two warnings:

```
2023-06-30 13:33:33.516802915 [W:onnxruntime:, execution_frame.cc:857 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {3} for output output:size0
2023-06-30 13:33:33.517066362 [W:onnxruntime:, execution_frame.cc:857 VerifyOutputSizes] Expected shape from model of {-1} does not match actual shape of {3,50,2} for output output:size1
```

As I understand it, the tool is failing to get the output shapes. I'm running the ONNX exporting tool with the torch demo. The code I'm running for inference is the following:

```python
import onnxruntime as ort
import torch

onnx_model = "/tmp/nbeneitez/returnn/demos/demo-torch/model.005.onnx"

torch.manual_seed(42)
dummy_data = torch.randn([3, 50, 9])
dummy_data_len = torch.IntTensor([3, 50, 1])

session = ort.InferenceSession(onnx_model)
outputs_onnx = session.run(None, {"data": dummy_data.numpy(), "classes:size0": dummy_data_len.numpy()})

print(torch.FloatTensor(outputs_onnx[0]))
print(torch.IntTensor(outputs_onnx[1]))
print(torch.IntTensor(outputs_onnx[2]))
```

I adapted the inference code from this code.
The example code from the issue is still a bit too complicated. You don't need the … Regarding your code: you are using it incorrectly. …
Thanks for the heads-up. I tried inserting the batch size and the sequence lengths in …
Indeed, in the graph we obtained when the tool initially worked (original comment here), the only two inputs I can see are … A better view of the graph through Netron: …

Edit: but besides that, the code is working properly :) I just had to remove … And no warning is printed when running inference.
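For reference, a sketch of what the corrected inference call looks like on my side, assuming the exported graph exposes exactly the inputs `data` and `data:size1` (names and dtypes may differ depending on the config):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("/tmp/nbeneitez/returnn/demos/demo-torch/model.005.onnx")

batch, time, feat = 3, 50, 9
dummy_data = np.random.randn(batch, time, feat).astype(np.float32)
seq_lens = np.full([batch], time, dtype=np.int32)  # one length per sequence, shape [batch]

outputs = session.run(None, {"data": dummy_data, "data:size1": seq_lens})
print([o.shape for o in outputs])
```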
Who do you ask here? You say this is maybe unexpected behavior of the PyTorch ONNX export tool? Yes, maybe. Did you check whether this is discussed anywhere? If not, can you maybe open an issue on this, and ask whether it is a bug or expected behavior?

Maybe the answer is that this is expected behavior. Or maybe it would be fixed in a future version, but we probably also want to make it work for the current version. So then the question goes back to us: what do we actually want to do in such a case? I think we want to always have a clear interface. When the interface is that we give it (input, in_seq_lens), we want that the resulting ONNX graph can always get this type of input, no matter whether it is actually used or not.

Btw, via this argument, we might also say: the interface is that the output should be (output, out_seq_lens), and out_seq_lens should be properly defined. So then our current heuristic to just fill it with dummy values is not good, and we want the user to make it explicit?

We should maybe discuss this with @curufinwe. It matters for RASR what type of interface to expect there. Also when RASR wants to do batched recognition at some point, and when RASR wants to support different sequence lengths properly, with potential downsampling in the model, the in_seq_lens and out_seq_lens are both important.

Btw, the batch_dim in the input is actually sth which we might want to remove, as it is redundant. In general, any dims which have a scalar dyn_size_ext are not needed. We probably should change this.
I pushed tests for inference to a new branch; please let me know if something needs to be changed. I don't have more time this week to work on the ONNX exporting tool, but I quickly went through your comment and overall I agree, we should bring this up with more people. And yes, I would rather always have a consistent interface, even if some inputs end up not being used, since that can be handled automatically more easily. I will try to examine some issues or details further next week.
I don't remember anymore, what was this about? Or just about extending our tests by actually forwarding sth through the ONNX graph via …?

Why did you rename …?

Why is there no PR for this branch?
As you said, this was just about extending our tests by actually forwarding something through the ONNX graph via …

I understand that …

I've been focused on other tasks and didn't have time to invest in the ONNX tool. I'll address the change above and create the PR, but I can't do much more.
Last week, we decided: ONNX input/output consistency, three use cases for RASR: …

The user should explicitly specify each of the cases. So, what this means: actually, in the case of RASR, it was not clear yet. Probably RASR can automatically detect whether the model has some seq lens input, and probably also whether it has some seq lens output, so it can determine automatically which case it has. However, in the case of RETURNN, this explicitly needs to be specified. The further question was, how. We basically use the current existing mechanism, i.e. via …
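To make this concrete, a hedged sketch of what such an explicit specification via `extern_data` / `model_outputs` could look like in a config; the dict-style dim declaration and the dim names are assumptions on my side, not the exact required syntax:

```python
from returnn.tensor import Dim, batch_dim

time_dim = Dim(None, name="time")     # dynamic time dim; its seq lens become data:size1
feature_dim = Dim(9, name="feature")
classes_dim = Dim(2, name="classes")

extern_data = {
    # Interface: (input, in_seq_lens)
    "data": {"dims": [batch_dim, time_dim, feature_dim], "dtype": "float32"},
}
model_outputs = {
    # Interface: (output, out_seq_lens); reusing time_dim means out seq lens == in seq lens.
    # With downsampling, one would declare a separate dynamic dim here instead.
    "output": {"dims": [batch_dim, time_dim, classes_dim], "dtype": "float32"},
}
```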
@vieting found a bug in how we call … Specifically, we must pass … An open question is: what should be the epoch and step you pass to it? I think these should just be arguments.

Edit: Fixed.

Edit: Fixed again. Now we just take …
Btw, I wonder: the resulting ONNX graph, this is a frozen graph, so it contains all the parameters? (@Icemole) Edit: Yes.
This implements proper dynamic dims handling as discussed in #1333. It also extends some test cases and documentation.

- Test cases: Also perform ONNX inference on the exported ONNX models.
- Handle `available_for_inference` in `extern_data`: They are excluded.
- Handle duplicated dims in `extern_data`: Only the first is used. (First according to the order defined by the dict in the config.)
- Handle scalar dynamic dims in `extern_data` and `model_outputs`: They are excluded.
- It is an error if some dim `dyn_size_ext` in `model_outputs` is specified as not being a scalar (e.g. common seq lens of shape [B]) but its value is not defined.
- Demos are adapted.
- Torch-to-ONNX export documentation extended.

Co-authored-by: Albert Zeyer <[email protected]>
This issue is to track any aspects and issues on PyTorch (#1120) ONNX export.

- `export_to_onnx.py` (…) → `torch_export_to_onnx.py`
- … `model_outputs`.
- `model_outputs` dims when not specified in `mark_as_output` (esp. in case of PT).
- `data:size0` (…), or any scalar dyn_size_ext, should not be an input, as it is redundant. (Fixed via Torch ONNX export, proper dynamic dims handling #1362.)
- `model_outputs`: Ignore dim tag equality. Check static dims, dyn dims information.

Other things which are less important for now:

- `TracerWarning`s for some sanity checking code: I think it's not really possible in an easy way.