This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

plot_network issue with LRN #10282

Closed
ThomasDelteil opened this issue Mar 27, 2018 · 11 comments

@ThomasDelteil
Contributor

ThomasDelteil commented Mar 27, 2018

There is an issue in master that is not present in 1.1.0

Plotting the network from this tutorial:
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-10165/10/tutorials/onnx/inference_on_onnx_model.html
fails with this error:

KeyErrorTraceback (most recent call last)
<ipython-input-15-3112c78b14de> in <module>()
----> 1 mx.visualization.plot_network(sym, shape={"input_0":inputs[0].shape}, node_attrs={"shape":"oval","fixedsize":"false"})

/home/ubuntu/anaconda3/envs/mxnet_p27/lib/python2.7/site-packages/mxnet/visualization.pyc in plot_network(symbol, title, save_format, shape, node_attrs, hide_weights)
    342                                 if "num_outputs" in params:
    343                                     key += str(int(params["num_outputs"]) - 1)
--> 344                             shape = shape_dict[key][1:]
    345                             label = "x".join([str(x) for x in shape])
    346                             attr["label"] = label

KeyError: u'lrn0_output'

The issue was introduced before mxnet==1.2.0b20180221

It is related to the fact that the LRN layer no longer has a named output, whereas in 1.1.0 I can do
sym.get_internals()['lrn0_output']

The issue was introduced by this PR and this commit c3e3a83:
#9677
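The failing lookup can be modeled in plain Python. This is a simplified sketch of the key construction in mxnet.visualization.plot_network shown in the traceback above; the shape values and node names here are hypothetical, not taken from the actual tutorial.

```python
# Sketch of the shape lookup performed by plot_network (simplified;
# the real shape_dict is built from symbol shape inference).
def node_output_shape(shape_dict, node_name, params=None):
    """Look up the inferred shape for a node's output."""
    key = node_name + "_output"
    if params and "num_outputs" in params:
        key += str(int(params["num_outputs"]) - 1)
    # Raises KeyError when the operator did not register an output
    # name, as happened for LRN ('lrn0_output' missing on master).
    return shape_dict[key][1:]

# Hypothetical inferred shapes; the [1:] slice drops the batch dim.
shapes = {"conv0_output": (1, 64, 112, 112)}
print(node_output_shape(shapes, "conv0"))  # (64, 112, 112)
try:
    node_output_shape(shapes, "lrn0")
except KeyError as e:
    print("KeyError:", e)
```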

@reminisce
Contributor

It seems the FListOutputNames attribute is not registered with the string vector {"output", "tmp_norm"} for the LRN operator as it was in the legacy interface. I found the same issue with Convolution and FullyConnected a few days ago. We need to scan the operator registrations in that PR to see if any more are missing. It could introduce backward incompatibility for applications that depend on output names.

https://github.com/apache/incubator-mxnet/pull/9677/files#diff-3519c5391807f84a8d67bbf60b81bc92L197

@zheng-da
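As an illustration of why the missing registration breaks name lookups, here is a toy plain-Python model of per-operator output-name registration. This is an analogy only; the real mechanism is nnvm's FListOutputNames attribute registered in C++, and the operator names below are examples.

```python
# Toy registry mapping operator type -> registered output names
# (hypothetical model of nnvm's FListOutputNames attribute).
OUTPUT_NAMES = {
    "LRN": ["output", "tmp_norm"],  # what the legacy interface registered
    "SoftmaxOutput": ["output"],
}

def list_outputs(op, node_name):
    # Derive graph-level output names like 'lrn0_output'. If the
    # attribute is missing and no fallback exists, downstream tools
    # such as plot_network cannot find the expected key.
    names = OUTPUT_NAMES.get(op, ["output"])
    return ["%s_%s" % (node_name, n) for n in names]

print(list_outputs("LRN", "lrn0"))  # ['lrn0_output', 'lrn0_tmp_norm']
```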

@zheng-da
Contributor

zheng-da commented Mar 28, 2018

I'll fix the LRN op.

@reminisce do we really need to provide output names for Convolution and FullyConnected? It seems the old interface doesn't provide output names, except for operators that have multiple outputs.

BTW, it seems your fix has a bug.
https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/convolution.cc#L471
https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/fully_connected.cc#L270
Could you please fix it? It should use FListOutputNames.

@zheng-da
Contributor

I checked all the other operators. It seems the others are fine.

@reminisce
Contributor

@zheng-da The legacy interface has ListOutputs as a virtual function defined in the base class Operator, so all derived classes inherit it even if they don't override it. I will fix the typo.

@reminisce
Contributor

@zheng-da The funny thing is that even with the typo of writing FListOutputNames as FListInputNames in Convolution and FullyConnected, it still gives me the same correct output names as the legacy interface. It turns out the attributes FListOutputNames and FListInputNames are identical function types. But I will fix the typo anyway and add unit tests.

@zheng-da
Contributor

@reminisce you are right. nnvm::FListInputNames defines the signature of the functor, so your code is fine; it's just a little confusing.

I'm still unsure whether we have to define FListOutputNames for all operators.
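The identical-signature point can be sketched in Python with type aliases. This is only an analogy: the real attributes are C++ std::function typedefs in nnvm, and the function below is a hypothetical example.

```python
from typing import Callable, Dict, List

# Both attributes share one underlying signature: given a node's
# attrs, return a list of names. This mirrors nnvm, where
# FListOutputNames is a typedef of the same std::function type as
# FListInputNames, so registering under either attribute type-checks.
FListInputNames = Callable[[Dict[str, str]], List[str]]
FListOutputNames = FListInputNames  # identical alias

def conv_output_names(attrs: Dict[str, str]) -> List[str]:
    # Hypothetical output-name functor for a Convolution-like op.
    return ["output"]

# Registering the function under the "wrong" alias is still valid,
# which is why the typo in convolution.cc went unnoticed.
fn: FListInputNames = conv_output_names
print(fn({}))  # ['output']
```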

@zheng-da
Contributor

@ThomasDelteil my PR (#10296) has been merged. Can you try it again?

@reminisce
Contributor

@zheng-da There is no hard requirement. Doing this is for two reasons:

  1. Keep backward compatibility.
  2. It's sort of a convention to name the outputs of nn ops "output". I don't observe this convention for the other tensor ops.

As you can see, there are multiple places in the code base following this convention. In quantization, we also use it to determine which layers should be calibrated.

I will submit a PR to fix the cosmetic issue with unit tests covering convolution, fc, and lrn.
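The naming convention matters in practice because code that selects layers by output name relies on it. Here is a sketch of that kind of selection; the internal names are hypothetical, and this is not the actual MXNet quantization code.

```python
# Sketch: pick calibration candidates by the "_output" suffix
# convention used for nn operators (illustrative only).
internal_names = [
    "data",
    "conv0_weight",
    "conv0_output",
    "lrn0_output",
    "flatten0_output",
]

# Weights and inputs are skipped; only named op outputs qualify.
calib_layers = [n for n in internal_names if n.endswith("_output")]
print(calib_layers)  # ['conv0_output', 'lrn0_output', 'flatten0_output']
```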

@reminisce
Contributor

@ThomasDelteil We have open-sourced MXBoard, which enables users to visualize MXNet data (including network structures) in TensorBoard. The current version on PyPI is 0.1.0rc3. You can install either from source or via pip. The GitHub site has rich examples and tutorials on installation and on how to use it with MXNet. Could you try it and give us feedback? Thanks.

@ThomasDelteil
Contributor Author

I will try it next week for sure, looks great!

@ThomasDelteil
Contributor Author

@zheng-da the fix is working, the graph is displaying correctly.
