
Question about testing weight sharing of two resnets in test_gluon_estimator.py #18372

Closed
acphile opened this issue May 20, 2020 · 5 comments · Fixed by #18387
Comments

@acphile
Contributor

acphile commented May 20, 2020

https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_gluon_estimator.py#L423
Actually, val_net.output does not share parameters with net.output in this case.
I also don't understand what this test case is aiming for: why share only the output parameters to test weight sharing of two resnets? @liuzh91

@liuzh47
Contributor

liuzh47 commented May 21, 2020

Why share only the output parameters to test weight sharing of two resnets?

What I want is for the estimator to support net for training and val_net for validation on the dev dataset. These two networks could be:

  • Exactly the same, i.e. with the same architecture and the same network weights, as in the first test case.
  • Partially the same, i.e. with the same architecture but sharing only a subset of the network parameters, as in the second resnet case.
  • Identical in architecture but with totally different weights.
  • In some extreme cases, their architectures may also differ; then you may need to modify evaluate_batch.

My first test case already covers the first case, so I use a partially weight-shared resnet to test the second case. The third and fourth cases are trivial, so I didn't write new test cases for them. But what do you mean by val_net.output not sharing the same weights as net.output?
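To make the simpler sharing levels concrete, here is a minimal sketch of how val_net could be constructed for cases 1 and 3 (it assumes the same resnet18_v1 / Dense(10) setup the test uses and an arbitrary ctx; it is an illustration only, not code from the test):

import mxnet as mx
from mxnet import gluon

ctx = mx.cpu()
net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, ctx=ctx)
net.output = gluon.nn.Dense(10)

# Case 1: exactly the same -- reuse the training network object for validation.
val_net = net

# Case 3: identical architecture, completely independent weights.
val_net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, ctx=ctx)
val_net.output = gluon.nn.Dense(10)

# Case 2 (the one under discussion): identical architecture, but only the output
# layer's parameters shared -- see the snippets later in this thread for how to
# construct that shared layer correctly.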

@acphile
Contributor Author

acphile commented May 22, 2020

So are lines 420-423 meant to share the same parameters between net.output and val_net.output (while the other parts differ)? If so, lines 420-423 actually fail to share parameters:

net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, ctx=ctx)
net.output = gluon.nn.Dense(10)
val_net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, ctx=ctx)
val_net.output = gluon.nn.Dense(10, params=net.collect_params())
>>> print(val_net.output.weight)
Parameter resnetv10_weight (shape=(10, 0), dtype=float32)
>>> print(net.output.weight)
Parameter dense0_weight (shape=(10, 0), dtype=float32)
>>> print(net.output.weight.data==val_net.output.weight.data)
False

@liuzh47
Contributor

liuzh47 commented May 22, 2020

I think at the time of testing you had not initialized the network weights. Would you mind testing net.output.weight.data==val_net.output.weight.data after both net and val_net are initialized?
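For concreteness, here is a minimal sketch of that check (assuming MXNet 1.x; two practical details are my own additions: Parameter.data is a method and must be called to get the underlying NDArray, and a forward pass is needed first because the (10, 0) shapes are deferred):

import mxnet as mx
from mxnet import gluon

ctx = mx.cpu()
net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, ctx=ctx)
net.output = gluon.nn.Dense(10)
val_net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, ctx=ctx)
val_net.output = gluon.nn.Dense(10, params=net.collect_params())

net.initialize(mx.init.Xavier(), ctx=ctx)
val_net.initialize(mx.init.Xavier(), ctx=ctx)

# Run a dummy batch so the deferred (10, 0) shapes are inferred and the
# parameter arrays are actually allocated.
x = mx.nd.random.uniform(shape=(1, 3, 224, 224), ctx=ctx)
net(x)
val_net(x)

# Compare the arrays, not the bound .data methods.
print((net.output.weight.data() == val_net.output.weight.data()).min().asscalar())
# With the params=net.collect_params() construction above, this will print 0.0:
# the two output layers were initialized independently, i.e. nothing was shared.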

@leezu
Contributor

leezu commented May 22, 2020

@liuzh91 these appear to be two separate parameters, as their prefixes differ: resnetv10_weight vs dense0_weight
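A likely explanation for that prefix, based on my reading of Gluon 1.x's name-based parameter sharing (treat it as an assumption rather than something stated by the participants): a block created with params=shared looks its parameters up under the shared dict's prefix. net.collect_params() has prefix resnetv10_, so the new Dense searches for resnetv10_weight, finds no parameter of that name in net, and silently creates a fresh one instead of reusing net.output's dense0_weight:

# Continuing acphile's snippet above (Gluon 1.x name-based sharing assumed).
shared = net.collect_params()               # ParameterDict with prefix 'resnetv10_'
print('resnetv10_weight' in shared.keys())  # False -> nothing to share, new parameter created
print('dense0_weight' in shared.keys())     # True  -> net.output's actual weight parameter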

@liuzh47
Contributor

liuzh47 commented May 22, 2020

@liuzh91 these appear to be two separate parameters, as their prefixes differ: resnetv10_weight vs dense0_weight

It should be val_net.output = gluon.nn.Dense(10, params=net.output.collect_params()).
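A minimal sketch of the corrected construction (same assumptions as in the snippets above). Because the shared dict now carries the output layer's own prefix, the name lookup succeeds and both output layers hold the very same Parameter objects, which can be verified even before initialization:

import mxnet as mx
from mxnet import gluon

ctx = mx.cpu()
net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, ctx=ctx)
net.output = gluon.nn.Dense(10)

# Share only the output layer's parameters, not the whole network's dict.
val_net = gluon.model_zoo.vision.resnet18_v1(pretrained=False, ctx=ctx)
val_net.output = gluon.nn.Dense(10, params=net.output.collect_params())

# The Parameter objects themselves are shared, so this holds without any
# initialization or forward pass.
assert val_net.output.weight is net.output.weight
assert val_net.output.bias is net.output.bias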
