【PaddlePaddle Hackathon 4】: Add float16 data type support for the maxout operator #50976
```diff
@@ -15,5 +15,10 @@
 #include "paddle/phi/core/kernel_registry.h"
 #include "paddle/phi/kernels/impl/maxout_grad_kernel_impl.h"

-PD_REGISTER_KERNEL(
-    maxout_grad, GPU, ALL_LAYOUT, phi::MaxOutGradKernel, float, double) {}
+PD_REGISTER_KERNEL(maxout_grad,
+                   GPU,
+                   ALL_LAYOUT,
+                   phi::MaxOutGradKernel,
+                   float,
+                   phi::dtype::float16,
+                   double) {}
```
Reviewer: The backward kernel may also need to be adjusted to compute in FP32 precision, to reduce the loss of accuracy.

Author: 1. Do you mean to remove it directly?

Reviewer: The current change only registers the fp16 type for the operator; it looks like the kernel implementation itself has not been modified. As for question 2, the official docs cover it in detail: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/dev_guides/amp_precision/amp_op_dev_guide_cn.html

Author: Understood. The fp16 unit test has been aligned with the fp32 one; the test method and error tolerance requirements are now the same.
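The reviewer's first point is the standard AMP guidance: even when tensors are stored in FP16, arithmetic inside the kernel should be carried out in FP32. A standalone numpy sketch of why (illustrative only, not the kernel code):

```python
import numpy as np

vals = np.random.uniform(-1, 1, 10000).astype(np.float16)

# Naive accumulation entirely in FP16: every partial sum is rounded
# back to half precision, so the error compounds.
acc16 = np.float16(0.0)
for v in vals:
    acc16 = np.float16(acc16 + v)

# Promote to FP32 for the computation, as the reviewer suggests.
acc32 = vals.astype(np.float32).sum()

ref = vals.astype(np.float64).sum()  # high-precision reference
print(abs(float(acc16) - ref), abs(float(acc32) - ref))  # FP16 error is much larger
```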
```diff
@@ -136,5 +136,48 @@ def test_errors(self):
         self.assertRaises(ValueError, F.maxout, x_float32, 2, 2)


+@unittest.skipIf(
+    not core.is_compiled_with_cuda(), "core is not compiled with CUDA"
+)
+class TestMaxOutOpFP16(OpTest):
+    def setUp(self):
+        self.op_type = "maxout"
+        self.python_api = paddle.nn.Maxout
+        input_np = np.random.uniform(-1, 1, [2, 6, 5, 4]).astype(np.float16)
+        self.groups = 2
+        self.axis = 1
+        self.set_attrs()  # apply per-case overrides; without this call the
+                          # Case1/Case2 subclasses below would have no effect
+        output_np = maxout_forward_naive(input_np, self.groups, self.axis)
+        self.attrs = {'groups': self.groups, 'axis': self.axis}
+        self.inputs = {'X': input_np}
+        self.outputs = {'Out': output_np}
+
+    def test_check_output(self):
+        if core.is_compiled_with_cuda():
+            place = core.CUDAPlace(0)
+            if core.is_float16_supported(place):
+                self.check_output_with_place(place, atol=1e-3)
```
Reviewer: Only one of the two GPU checks should be needed: either the place/FP16-support logic inside test_check_output or the skipIf decorator on TestMaxOutOpFP16. The decorator already skips the test automatically in non-GPU environments, so the guarded branch below it should be unreachable there. Also, the forward pass presumably involves no arithmetic, just data movement; can the test pass without setting a tolerance here?

Author: Yes, it passes.
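A minimal sketch of the suggested simplification, inside TestMaxOutOpFP16; dropping the explicit atol rests on the assumption that the forward pass only moves data:

```python
    # With @unittest.skipIf on the class, the is_compiled_with_cuda() guard
    # is redundant; only the per-device FP16 capability check remains.
    def test_check_output(self):
        place = core.CUDAPlace(0)
        if core.is_float16_supported(place):
            self.check_output_with_place(place)  # default tolerances
```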
```diff
+    def test_check_grad(self):
+        place = core.CUDAPlace(0)
+        if core.is_float16_supported(place):
+            self.check_grad_with_place(
+                place, ['X'], 'Out', max_relative_error=0.5
+            )
```
Reviewer: If the decorator is used, the place check here should not be needed either. Does max_relative_error really have to be set this large? It is worth analyzing the backward kernel implementation to see whether the error can be reduced.

Author: Done
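A sketch of the tightened gradient check, again inside TestMaxOutOpFP16. Returning to the default max_relative_error is an assumption based on maxout's backward pass only routing gradients to the argmax positions rather than accumulating them:

```python
    def test_check_grad(self):
        place = core.CUDAPlace(0)
        if core.is_float16_supported(place):
            # gradient routing introduces no extra rounding error, so the
            # default tolerance should be attainable
            self.check_grad_with_place(place, ['X'], 'Out')
```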
```diff
+    def set_attrs(self):
+        pass
```
Reviewer: The FP16 test here could inherit from TestMaxOutOp instead, with some small changes to TestMaxOutOp such as making dtype, shape, and attrs configurable; that would simplify the code. See the low-precision unit test guidelines for details: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/dev_guides/amp_precision/amp_test_dev_guide_cn.html

Author: Done
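A rough sketch of that refactor, under the assumption that the existing TestMaxOutOp gains overridable hooks (the hook names below are illustrative, not the file's actual helpers):

```python
class TestMaxOutOp(OpTest):
    def setUp(self):
        self.op_type = "maxout"
        self.python_api = paddle.nn.functional.maxout
        self.dtype = 'float64'
        self.shape = [3, 6, 2, 4]
        self.groups = 2
        self.axis = 1
        self.set_attrs()  # per-case hook: dtype / shape / groups / axis
        x = np.random.uniform(-1, 1, self.shape).astype(self.dtype)
        out = maxout_forward_naive(x, self.groups, self.axis)
        self.inputs = {'X': x}
        self.outputs = {'Out': out}
        self.attrs = {'groups': self.groups, 'axis': self.axis}

    def set_attrs(self):
        pass

    def test_check_output(self):
        self.check_output()

    def test_check_grad(self):
        self.check_grad(['X'], 'Out')


@unittest.skipIf(
    not core.is_compiled_with_cuda(), "core is not compiled with CUDA"
)
class TestMaxOutOpFP16(TestMaxOutOp):
    def set_attrs(self):
        self.dtype = 'float16'
```

With this shape, each FP16 case only states what differs from the FP64 baseline, which is what the linked guideline recommends.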
```diff
+
+
+class TestMaxoutFP16Case1(TestMaxOutOpFP16):
+    def set_attrs(self):
+        self.axis = -1
+
+
+class TestMaxoutFP16Case2(TestMaxOutOpFP16):
+    def set_attrs(self):
+        self.axis = 3
+
+
 if __name__ == '__main__':
     unittest.main()
```
```diff
@@ -784,7 +784,7 @@ def maxout(x, groups, axis=1, name=None):

     Parameters:
         x (Tensor): The input is 4-D Tensor with shape [N, C, H, W] or [N, H, W, C], the data type
-            of input is float32 or float64.
+            of input is float16, float32 or float64.
         groups (int): The groups number of maxout. `groups` specifies the
             index of channel dimension where maxout will be performed. This must be
             a factor of number of features.
```

Reviewer: This API implementation has two branches, one for dynamic graph and one for static graph. Does the static graph branch run correctly?

Author: A test for the static graph branch has been added.
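A sketch of the kind of static-graph check the author describes adding; the shapes and the assumption of a CUDA device are illustrative:

```python
import numpy as np
import paddle
import paddle.nn.functional as F

paddle.enable_static()
main_prog = paddle.static.Program()
startup_prog = paddle.static.Program()
with paddle.static.program_guard(main_prog, startup_prog):
    x = paddle.static.data(name='x', shape=[2, 6, 5, 4], dtype='float16')
    out = F.maxout(x, groups=2, axis=1)

exe = paddle.static.Executor(paddle.CUDAPlace(0))
exe.run(startup_prog)
x_np = np.random.uniform(-1, 1, [2, 6, 5, 4]).astype('float16')
(res,) = exe.run(main_prog, feed={'x': x_np}, fetch_list=[out])
print(res.shape)  # (2, 3, 5, 4): the 6 channels collapse to 6 / groups = 3
```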
Reviewer: At the moment we only need to support fp16 on the GPU; the CPU implementation does not need to change.

Author: Done