[Feature] about iterabledataset or webdataset #1543

Open
KyanChen opened this issue May 15, 2024 · 1 comment

Comments

@KyanChen

What is the feature?

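It would be great to support IterableDataset or WebDataset-style data loading. As training datasets keep growing, WebDataset-based data reading is becoming increasingly important. The current map-style dataset approach easily becomes I/O-bound, so the GPUs cannot be fully utilized: most of the time is spent indexing and reading training data.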
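
To illustrate the access pattern being requested, here is a minimal sketch of reading a tar shard strictly front to back, which is the idea behind WebDataset-style loading. It uses only the Python standard library, not the `webdataset` package or this project's API. The helper names `write_shard` and `iter_shard`, the `__key__` field, and the rule of grouping files by the name before the last dot are assumptions made for this example.

```python
import io
import tarfile


def write_shard(samples):
    """Pack (key, {ext: bytes}) samples into an in-memory tar shard (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, fields in samples:
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def iter_shard(fileobj):
    """Yield one dict per sample while reading the tar sequentially (hypothetical helper)."""
    current_key, sample = None, {}
    # Mode "r|" is tarfile's streaming mode: members are read in order, with no seeking.
    with tarfile.open(fileobj=fileobj, mode="r|") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumption: files that share the name before the last dot form one sample.
            key, _, ext = member.name.rpartition(".")
            if key != current_key and sample:
                yield sample
                sample = {}
            current_key = key
            sample["__key__"] = key
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield sample


shard = write_shard([
    ("000000", {"jpg": b"img0", "cls": b"3"}),
    ("000001", {"jpg": b"img1", "cls": b"7"}),
])
for sample in iter_shard(shard):
    print(sample["__key__"], sorted(k for k in sample if k != "__key__"))
```

In a PyTorch setting, a generator like this would typically be called from the `__iter__` method of a `torch.utils.data.IterableDataset` subclass, so that each worker streams whole shards instead of looking up samples by index.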

Any other context?

No response

@zhouzaida
Collaborator

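Hi, thanks for the feedback. IterableDataset and WebDataset are quite useful features, but we have no plans to support them for now. If you are interested and have the time, a PR adding support is welcome.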
