This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
[Gluon] [Fix] Fix HybridBlock when hybridize is not called #16465
Merged
This line, introduced in #16280, is not compatible with the previous context handling. Previously, `x.context` was always used as the default context: https://github.com/apache/incubator-mxnet/pull/16280/files#diff-29da832c2145752f3906a2f71c7b63baL982 Does it matter?
It is chosen like this because x can be None now.
Then it should be set to the first non-`None` argument, not the last?
Actually, all contexts are supposed to be the same. For example, we should not allow mixing cpu and gpu contexts. However, we currently allow it because we need to mix `cpu`, `cpu_pinned`, and `cpu_shared`.
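The rule described in this comment can be sketched as a small check. This is only an illustration under stated assumptions: `contexts_compatible` and `CPU_FAMILY` are hypothetical names, device types are modelled as plain strings, and device ids are ignored. It is not MXNet code.

```python
# Hypothetical helper, not part of MXNet: device types are plain strings here.
CPU_FAMILY = {"cpu", "cpu_pinned", "cpu_shared"}


def contexts_compatible(device_types):
    """Return True if the given device types may be used together.

    Assumed rule: either every context belongs to the CPU family (where
    mixing is tolerated), or all contexts share one device type.
    """
    kinds = set(device_types)
    if not kinds:
        return True  # nothing to check
    return kinds <= CPU_FAMILY or len(kinds) == 1


print(contexts_compatible(["cpu", "cpu_pinned", "cpu_shared"]))  # True
print(contexts_compatible(["cpu", "gpu"]))                       # False
```

A stricter variant could also compare device ids; that is left out here to keep the sketch short.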
Thus we should use the first non-`None` argument so as not to break backwards compatibility? `cpu, cpu_pinned, cpu_shared` are different contexts after all.
@leezu I think using the first or last non-None argument does not matter much here. Our goal is to make sure that we end up picking a meaningful context for the parameters. In fact, the previous implementation did not check whether the contexts of the arguments are valid.
As the previous implementation hasn't enforced all contexts being equal, we shouldn't start picking a different array to determine the context. As you stated above, it's valid to use a mix of `cpu, cpu_pinned, cpu_shared` contexts. For example, after your change, `cpu_pinned` or `cpu_shared` may be picked as the default context instead of `cpu` if the user passed a `cpu_pinned` or `cpu_shared` array as the last argument. The extra overhead could cause a performance regression (all parameters will be made available under the default context). No need to risk this given there is no advantage?
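To make the difference concrete, here is a minimal sketch of the two selection rules under discussion. `FakeArray` and both helper functions are hypothetical stand-ins, not MXNet code; only a `context` attribute is modelled, and contexts are plain strings.

```python
from collections import namedtuple

# Hypothetical stand-in for an NDArray: only the `context` attribute matters.
FakeArray = namedtuple("FakeArray", ["context"])


def first_non_none_context(args):
    """Context of the first non-None argument.

    This matches `x.context` whenever the first argument is not None.
    """
    for arg in args:
        if arg is not None:
            return arg.context
    return None


def last_non_none_context(args):
    """Context of the last non-None argument."""
    return first_non_none_context(reversed(args))


args = [FakeArray("cpu"), None, FakeArray("cpu_pinned")]
print(first_non_none_context(args))  # cpu
print(last_non_none_context(args))   # cpu_pinned
```

With these inputs the two rules disagree, which is the case the comment is concerned about.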
@leezu It's also possible that `cpu_pinned` was previously picked as the default and that, after the change, the correct `cpu` context is picked as the default. My point is that we probably need to give special treatment to `cpu, cpu_pinned, cpu_shared`. What's your opinion?
@leezu I agree that the backward-compatibility issue is valid. Let me first make it backward-compatible. However, this does not fix the issue with the cpu, cpu_pinned, cpu_shared combination.
We should get rid of choosing one array and using its context as the default context. For parameters, users should get the array via `self.weight.data(ctx)`. For the time being, I suggest not breaking the behaviour, to avoid unintended consequences.
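The idea of asking for a parameter on an explicit context, rather than inferring a default from one input array, can be illustrated with a toy model. `ToyParameter` is a hypothetical class written for this sketch; it only mimics the shape of a `data(ctx)` call and is not MXNet's `Parameter`.

```python
class ToyParameter:
    """Toy model of a parameter that keeps one copy per context.

    Hypothetical class for illustration only; it is not MXNet's Parameter.
    """

    def __init__(self, name):
        self.name = name
        self._copies = {}  # maps context name -> stored value

    def initialize(self, value, contexts):
        # Store one copy of the value for every requested context.
        for ctx in contexts:
            self._copies[ctx] = value

    def data(self, ctx):
        # The caller states the context explicitly; nothing is inferred
        # from the position of an input argument.
        if ctx not in self._copies:
            raise RuntimeError(f"{self.name} was not initialized on {ctx}")
        return self._copies[ctx]


weight = ToyParameter("weight")
weight.initialize(2.0, ["cpu"])
print(weight.data("cpu"))  # 2.0
```

In this model, a request for a context the parameter was never initialized on fails loudly instead of silently falling back to some default.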