-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix 'BlasAXPBY uninplemented' error when run deepfefm with custom npu #47546
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
ffb3f8b
to
d6ae371
Compare
paddle/fluid/eager/CMakeLists.txt
Outdated
custom_operator_node | ||
${DEVICE_EVENT_LIBS} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里 DEVICE_EVENT_LIBS 的修改是不是误带进来的?没有看到哪里对 DEVICE_EVENT_LIBS 做了定义?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
经过测试,确实是误加进来的,已去除。
tensor); | ||
#ifdef PADDLE_WITH_CUSTOM_DEVICE | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这段代码直接参考 https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/eager/grad_tensor_holder.cc#L145 修改就行,is_custom_device 是 tensor.h 里面的函数,及时不开 with custom device 的编译选项这个接口也存在且能返回正确结果,所以这里需要把代码用 #ifdef PADDLE_WITH_CUSTOM_DEVICE
包起来。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accumulation_node | ||
SRCS accumulation_node.cc | ||
DEPS gradient_accumulator grad_node_info) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
确认下这里为什么需要将 accumulation_node target 用 if(NOT (NOT WITH_PYTHON AND ON_INFER))
包起来?是说当编译 C++ inference lib 包的时候就不需要编译 accumulation_node 了吗?
根据 accumulation_node.cc 里面的修改,应该对 accumulation_node 添加 ${eager_deps}
的依赖就行?而且这里还删除了对 phi_api 的依赖是为什么呢?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.这里加条件是因为accumulation node依赖的dygraph_function在编inference时不会被编译,且accumulation node在inference中不会用到,所以选择在inference中不编译accumulation node。
2.之前删除phi_api是因为编译时会报错,怀疑是phi_api带来的循环依赖造成的,所以删除了phi_api。今天经过测试,不删除phi_api也不会再编译出错,准备加回来。
t_values, &tensor_values); | ||
#ifdef PADDLE_WITH_CUSTOM_DEVICE | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上,参考 grad_tensor_holder.cc 修改
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里是因为目前“BlasAXPBY uninplemented”的问题仅在使用custom device时出现,为了不对其他情况造成影响,所以使用了条件编译。
hook_utils | ||
SRCS hook_utils.cc | ||
DEPS phi tensor_utils autograd_meta grad_node_info utils accumulation_node) | ||
if(NOT (NOT WITH_PYTHON AND ON_INFER)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上这里,CMAKE修改的原因是?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
因为hook_utils依赖accumulation_node,而accumulation_node在inference时不编译,所以这里也加上了条件。
if(WITH_CUSTOM_DEVICE) | ||
set(test_deps ${eager_deps} ${generated_deps}) | ||
else() | ||
set(test_deps ${eager_deps}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里需要添加 generated_deps 的原因是?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
之前不加无法编译成功,今天经过测试发现不加也能编译成功,已去掉。
cc_test_old(test_egr_ds_tensor_wrapper SRCS tensor_wrapper_test.cc DEPS | ||
${eager_deps}) | ||
${test_deps}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里需要添加 generated_deps 的原因是?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8593b2f
to
a7e3748
Compare
1611bb0
to
12b7281
Compare
12b7281
to
7cdfe9a
Compare
PR types
Bug fixes
PR changes
Others
Describe
fix 'BlasAXPBY uninplemented' error when run deepfefm model with custom npu.
为了解决BlasAXPBY uninplemented的问题,在paddle/fluid/eager/accumulation/accumulation_node.cc中包含了paddle/fluid/eager/api/generated/eager_generated/forwards/dygraph_functions.h,但dygraph_functions在inference中没有编译,且accumulation node在inference中没有被用到,所以选择在inference中不编译accumulation node。同时修改了paddle/fulid/eager和paddle/fluid/eager/api/utils目录下的CmakeLists,因为其中包含或依赖了accumualtion node。
而paddle/fluid/eager/tests/data_structure_tests目录下的几个tests也需要依赖dygraph_functions,所以需要加上依赖。
paddle/fluid/eager/tests/task_tests目录下的几个tests也需要依赖dygraph_functions,但仅在windowsCI环境下体现,所以也顺道加上了依赖。这些改动参考了这个PR #41198
另外增加了单测以覆盖accumulation node中新增的add_ad_func