[WIP] Examples for demonstrating the usage and incremental value of TorchData Nodes #1352

Open
divyanshk opened this issue Nov 4, 2024 · 3 comments · May be fixed by #1371

Comments

@divyanshk
Contributor

divyanshk commented Nov 4, 2024

🚀 The feature

Starting this issue to track minimal examples we can create to demonstrate effective usage and value of TorchData nodes. I can create separate issues for each of these as required.

Motivation, pitch

  1. Vanilla torch.utils.data DataLoader usage ported over to torchdata nodes
  2. GPU accelerated transforms
  3. Flexible parallelism (mixing multiprocessing with multithreading)
  4. Examples porting over popular OSS datasets
    • connecting to popular cloud storage
  5. Example creating new nodes (might get covered through examples above)
  6. Basic multimodal model trained E2E using torchdata nodes
  7. Chaining multiple transforms (might get covered through examples above)
  8. Dataset mixing (with different sampling strategies)
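
As a rough illustration of what item 8 could demonstrate, here is a minimal sketch of weighted dataset mixing in plain Python. It deliberately does not use the torchdata nodes API; `mix_datasets`, its parameters, and the stopping rule (drop a source once it is exhausted, finish when all are exhausted) are assumptions made for this example only, and an actual example would likely pick between several such strategies.

```python
import random
from typing import Dict, Iterable, Iterator, TypeVar

T = TypeVar("T")


def mix_datasets(
    sources: Dict[str, Iterable[T]],
    weights: Dict[str, float],
    seed: int = 0,
) -> Iterator[T]:
    """Yield items by picking a source in proportion to its weight.

    Hypothetical helper, not part of torchdata. A source is dropped once
    it is exhausted; iteration ends when every source is exhausted.
    Weights are assumed to be strictly positive.
    """
    rng = random.Random(seed)  # seeded so the mixing order is reproducible
    iterators = {name: iter(src) for name, src in sources.items()}
    while iterators:
        names = list(iterators)
        # Sample one of the remaining sources according to the weights.
        name = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        try:
            yield next(iterators[name])
        except StopIteration:
            # This source has no more items; stop sampling from it.
            del iterators[name]


if __name__ == "__main__":
    mixed = mix_datasets(
        {"small": range(3), "large": range(100, 105)},
        {"small": 0.7, "large": 0.3},
    )
    print(list(mixed))
```

Every item from every source appears exactly once and the order within each source is preserved; only the interleaving depends on the weights and the seed. Other strategies, such as stopping at the first exhausted source or restarting exhausted sources, would change only the `except StopIteration` branch.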

Alternatives

No response

Additional context

No response

@andrewkho
Contributor

Discussions:
Let's do 1, 2, 4 (but with Hugging Face), 6 (through torchtune), and 8.

@andrewkho
Contributor

Also add tasks for documentation, READMEs, docstrings, parameter lists, and a design doc.

@andrewkho
Contributor

Contribution guide (lower priority)

@ramanishsingh ramanishsingh linked a pull request Nov 22, 2024 that will close this issue