Question about AvgLoss_D turning into nan partway through training and generated results going black #31

Open
JJcat0924 opened this issue Mar 4, 2024 · 1 comment

@JJcat0924

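Hello, while trying to reproduce this project I ran into the following problem. When I run the 03f script to train DG-Fonts, in two consecutive runs, about one third of the way through the 200 epochs (at epochs 67 and 76 respectively), AvgLoss_D suddenly went from gradually decreasing (1.22) to nan, and everything generated after that point is a black image (as shown in the image). I looked through main.py and did not find anything that limits how the loss decreases.
To fit my environment when training in a virtual machine, I set the CUDA_VISIBLE_DEVICES parameter to 0 and the nproc_per_node parameter to 1 in the script file. Could the collapse during generation be related to these configuration changes?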
[Image: 068_false]
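To pin down the exact step at which the loss first becomes non-finite, a guard like the one below could be called on each discriminator loss value. This is only a minimal sketch: `check_loss`, `NonFiniteLossError`, and the call site shown in the comment are hypothetical names, not part of the DG-Fonts code, and it assumes the loss is available as a Python float (for example via `.item()` on a tensor).

```python
import math


class NonFiniteLossError(RuntimeError):
    """Raised when a loss value is nan or +/-inf (hypothetical helper)."""


def check_loss(name: str, value: float, epoch: int, step: int) -> float:
    """Return the loss unchanged, or raise as soon as it is nan or infinite."""
    if not math.isfinite(value):
        raise NonFiniteLossError(
            f"{name} became {value} at epoch {epoch}, step {step}"
        )
    return value


# Assumed call site inside a training loop (names are illustrative only):
#     check_loss("Loss_D", d_loss.item(), epoch, step)
```

Stopping at the first nan, rather than noticing it later in the epoch average, makes it possible to inspect the batch that triggered it.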

@wangchi95
Owner

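@JJcat0924 These two parameters are DDP-related settings and can be increased; I have tried 8 GPUs and training was normal.
You could first check whether the training set is the problem. Mainly check whether it contains any pure-white images (which happens when the corresponding ttf is missing that character).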
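The pure-white check could be scripted along these lines. This is a rough sketch rather than anything shipped with the repository: it assumes the training images are PNG or JPEG files under one directory, that Pillow is installed, and that "pure white" means the darkest pixel is at or above a threshold of 250. The function names and the threshold are illustrative choices.

```python
import os


def is_nearly_white(darkest_pixel: int, white_threshold: int = 250) -> bool:
    """True when even the darkest grayscale pixel is at or above the
    threshold, meaning the image contains no visible glyph strokes."""
    return darkest_pixel >= white_threshold


def find_blank_images(root, white_threshold=250,
                      exts=(".png", ".jpg", ".jpeg")):
    """Walk `root` and return paths of images that look pure white."""
    # Imported lazily so the helper above is usable without Pillow.
    from PIL import Image

    blank = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            with Image.open(path) as img:
                # For a single-band "L" image, getextrema() returns (min, max).
                darkest, _brightest = img.convert("L").getextrema()
            if is_nearly_white(darkest, white_threshold):
                blank.append(path)
    return blank
```

Calling `find_blank_images("path/to/train_images")` (the path is a placeholder) returns the files to inspect or drop before retraining.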
