-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
decouple activation function's type from model compression's process in SE_A, now tanh & gelu is both available. #1020
Conversation
…ssion's process in SE_A, now tanh & gelu is both available.
Codecov Report
@@ Coverage Diff @@
## devel #1020 +/- ##
===========================================
- Coverage 75.01% 64.28% -10.73%
===========================================
Files 86 5 -81
Lines 6924 14 -6910
===========================================
- Hits 5194 9 -5185
+ Misses 1730 5 -1725 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please notice that tests failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code indentation is inconsistent in multiple places, please check it.
Please revert changes from this commit to |
source/op/unaggregated_grad.cc
Outdated
// UnaggregatedDyDxOp<GPUDevice, T>); | ||
// UnaggregatedDyDxOp<GPUDevice, T>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plz revert all such format changes! thank you
source/op/unaggregated_grad.cc
Outdated
y.shape(), | ||
&dy_dx)); | ||
y.shape(), | ||
&dy_dx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be reverted.
source/op/unaggregated_grad.cc
Outdated
y.shape(), | ||
&dy2_dx)); | ||
y.shape(), | ||
&dy2_dx)); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be reverted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz revert to exactly the same as before
source/op/unaggregated_grad.cc
Outdated
z.shape(), | ||
&dz_dx)); | ||
z.shape(), | ||
&dz_dx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be reverted.
source/op/unaggregated_grad.cc
Outdated
z.shape(), | ||
&dz2_dx)); | ||
z.shape(), | ||
&dz2_dx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be reverted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code mixed tab and space, using them in the same line. I just have normalized them.
# functype | ||
if activation_fn == ACTIVATION_FN_DICT["tanh"]: | ||
self.functype = 1 | ||
elif activation_fn == ACTIVATION_FN_DICT["gelu"]: | ||
self.functype = 2 | ||
else: | ||
raise RunTimeError("Unknown actication function type!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should support all activation functions in ACTIVATION_FN_DICT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will be done in future work.
source/op/unaggregated_grad.cc
Outdated
y.shape(), | ||
&dy_dx)); | ||
y.shape(), | ||
&dy_dx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz revert to exactly the same as before
source/op/unaggregated_grad.cc
Outdated
y.shape(), | ||
&dy2_dx)); | ||
y.shape(), | ||
&dy2_dx)); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz revert to exactly the same as before
source/op/unaggregated_grad.cc
Outdated
z.shape(), | ||
&dz_dx)); | ||
z.shape(), | ||
&dz_dx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz revert to exactly the same as before
source/op/unaggregated_grad.cc
Outdated
z.shape(), | ||
&dz2_dx)); | ||
z.shape(), | ||
&dz2_dx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz revert to exactly the same as before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Tab 4space Tab Tab Tab Tab Tab 4space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work @liangadam
…in SE_A, now tanh & gelu is both available. (deepmodeling#1020) * commit-message: decouple activation function's type from model compression's process in SE_A, now tanh & gelu is both available. * commit-message: modified code and passed unittest * commit-message: Format Document * commit-message :Format revert * commit-message: format change * commit-message: Format change Co-authored-by: HLA <[email protected]>
…pmodeling#1020) python3.10 support: collection.Mapping -> collection.abc.Mapping repair this: deepmodeling/dpgen#1019
I modified 3 pieces of code and wrote a brief unittest 'source/tests/test_tabulate.py' .
Now, when the 'activation_function' is designated as 'tanh' or 'gelu' in descriptor SE_A, model compression can produce a reasonable model.
Moreover, cause the type of activation function is fully decoupled from the process of model compression, now we can add any other function type we need in a very easy way.