Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CustomDevice] GetCCLComm add custom device support #47168

Merged
merged 4 commits into from
Oct 31, 2022

Conversation

ronny1996
Copy link
Contributor

PR types

Others

PR changes

Others

Describe

GetCCLComm add custom device support

@ronny1996 ronny1996 force-pushed the custom_device_ccl branch 15 times, most recently from 21a2f80 to 77e9d11 Compare October 25, 2022 12:56
#include "paddle/fluid/distributed/collective/ProcessGroupCustom.h"
#endif

namespace phi {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

在distributed下加phi的namespace这个操作有点奇怪,既然单独出来了,不能用distributed的namespace吗

Copy link
Contributor Author

@ronny1996 ronny1996 Oct 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个接口在 phi/kernel/sync_batch_norm_kernel.h以phi命名空间导出的,写到distributed对用户而言有点奇怪

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

如果反过来,让phi/kernel/sync_batch_norm_kernel.h里调用collective下的这个方法呢

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collective的方法对外都是隐藏的,现在 GetCCLComm定义在 phi/kernel/sync_batch_norm_kernel.h,实现在 paddle/fluid/distributed/collective/CommUtils.cc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

那实现可以先统一放在phi下吗

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

更新了

@ronny1996 ronny1996 requested review from LiYuRio and removed request for LiYuRio October 26, 2022 11:12
@ronny1996 ronny1996 requested review from LiYuRio and removed request for LiYuRio October 27, 2022 08:32
LiYuRio
LiYuRio previously approved these changes Oct 28, 2022
Copy link
Contributor

@LiYuRio LiYuRio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -0,0 +1,65 @@
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

麻烦遵守code-style的文件命名规范
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个文件也不适合放到phi/kernels目录下,建议移到backends目录下

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

更新了

@ronny1996 ronny1996 merged commit 34d13d6 into PaddlePaddle:develop Oct 31, 2022
@ronny1996 ronny1996 deleted the custom_device_ccl branch October 31, 2022 06:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants