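[PaddlePaddle Hackathon 2] 113. In-depth evaluation of the full PaddlePaddle workflow for converting single-machine training to distributed training, with an evaluation report #40552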

Closed
TCChenlong opened this issue Mar 15, 2022 · 2 comments

TCChenlong commented Mar 15, 2022

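(This issue is a task issue for the second PaddlePaddle Hackathon. For details, see [PaddlePaddle Hackathon 2] Task Overview.)

[Task description]

[Deliverables]

  • A design document, submitted as a PR to the rfcs directory of PaddlePaddle/community.
  • An evaluation table for converting PaddlePaddle single-card training to distributed training. Work through the conversion steps below, mark each one as completed, and report the problems and blockers met at each step. Also describe your overall experience of the conversion task, including 3 to 5 of the most memorable pitfalls and scenarios.
  • Submit the report and the table together to the docs/eval directory of Paddle/docs. Also submit the AI Studio task link and keep the related code in sync.

Converting single-machine, single-card training to distributed data-parallel training involves the following steps: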

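| No. | Core step | Status (success/failure) | Problems encountered | Solution (note if unresolved) |
| --- | --- | --- | --- | --- |
| 1 | Import the dependencies needed for distributed training | | | |
| 2 | Initialize the distributed environment | | | |
| 3 | Set up the optimizer for distributed training | | | |
| 4 | Split the dataset | | | |
| 5 | Build the training code | | | |
| 6 | Single-machine, multi-card distributed training | | | |
| 7 | Multi-machine, multi-card distributed training | | | |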
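
For orientation, the seven steps above can be sketched roughly as follows. This is a minimal sketch and not the reference solution for the task: it assumes Paddle 2.x with the `fleet` collective API, and the model, the synthetic data, the hyperparameters, and the helper names `build_launch_command` and `train` are all placeholders invented for illustration. The `--gpus` and `--ips` launcher flags are assumed from Paddle 2.x usage and may differ in other releases.

```python
def build_launch_command(script, gpus, ips=None):
    """Build the argv for paddle.distributed.launch (steps 6 and 7).

    Hypothetical helper; the flag names are assumed from Paddle 2.x.
    """
    cmd = ["python", "-m", "paddle.distributed.launch"]
    if ips:
        # Step 7 (multi-machine): list every node's IP and run the same
        # command on each node.
        cmd.append("--ips=" + ",".join(ips))
    # Step 6 (single-machine, multi-card): select the cards to use.
    cmd.append("--gpus=" + ",".join(str(g) for g in gpus))
    cmd.append(script)
    return cmd


def train(epochs=2, batch_size=32):
    # Step 1: import the distributed dependencies. They are imported inside
    # the function so the launch helper above works without Paddle installed.
    import paddle
    from paddle.distributed import fleet

    # Step 2: initialize the collective (data-parallel) environment.
    fleet.init(is_collective=True)

    # Placeholder model and synthetic data, purely for illustration.
    model = paddle.nn.Linear(10, 1)
    features = paddle.randn([256, 10])
    labels = paddle.randn([256, 1])
    dataset = paddle.io.TensorDataset([features, labels])

    # Step 3: wrap the optimizer (and the model) for distributed training.
    optimizer = paddle.optimizer.SGD(
        learning_rate=0.01, parameters=model.parameters()
    )
    optimizer = fleet.distributed_optimizer(optimizer)
    model = fleet.distributed_model(model)

    # Step 4: split the dataset so each process reads its own shard.
    sampler = paddle.io.DistributedBatchSampler(
        dataset, batch_size=batch_size, shuffle=True, drop_last=True
    )
    loader = paddle.io.DataLoader(dataset, batch_sampler=sampler)

    # Step 5: the training loop itself is the same as on a single card.
    for epoch in range(epochs):
        sampler.set_epoch(epoch)  # reshuffle consistently across processes
        for x, y in loader:
            loss = paddle.nn.functional.mse_loss(model(x), y)
            loss.backward()
            optimizer.step()
            optimizer.clear_grad()
```

If this were saved as `train.py` with a final call to `train()`, then `build_launch_command("train.py", [0, 1])` would give the argument list for `python -m paddle.distributed.launch --gpus=0,1 train.py`, and passing `ips=[...]` would add the `--ips` flag for the multi-machine case.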

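[Technical requirements]

  • Proficiency with Paddle and familiarity with other deep learning frameworks.

[Q&A]

  • If you have any questions about this task during development, feel free to leave a comment under this issue.
  • For common problems that come up during development, Q&A sessions will be held regularly during the event. Please watch for notices on the official website and in the QQ group, and join in promptly.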
@paddle-bot-old

Hi! We've received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your question as soon as possible. Please check again that you have provided a clear problem description, reproduction code, environment and version details, and error messages. You can also look for an answer in the official API documentation, the FAQ, past GitHub Issues, and the AI community. Have a nice day!

@TCChenlong TCChenlong assigned TCChenlong and unassigned LiuChiachi Mar 15, 2022
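@TCChenlong TCChenlong changed the title from "[PaddlePaddle Hackathon 2] 113. In-depth evaluation of PaddlePaddle distributed training features, with an evaluation report" to "[PaddlePaddle Hackathon 2] 113. In-depth evaluation of the full PaddlePaddle workflow for converting single-machine training to distributed training, with an evaluation report" Mar 15, 2022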
@paddle-bot paddle-bot bot closed this as completed Apr 4, 2023
paddle-bot bot commented Apr 4, 2023

Since you haven't replied for more than a year, we have closed this issue/pr.
If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.
