it takes too long for DynamicBucketingSampler to load state dict #1327
Comments
Unfortunately, yes. Restoring the sampler's state is quite tricky to do quickly, and I don't recommend using this technique with large data. Instead, it's easier to discard the sampler state and change the random seed so that the training data is randomized differently after resuming.
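A minimal sketch of the "discard the state, change the seed" approach described above. The helper `seed_for_resume` and its arguments are hypothetical names introduced here for illustration. It only derives a new, reproducible seed from the original seed and the step you resume from, and it assumes the result is then passed to the sampler's `seed` argument instead of calling `load_state_dict`.

```python
import random


def seed_for_resume(base_seed: int, checkpoint_step: int) -> int:
    """Derive a reproducible seed that differs for each resume point.

    Hypothetical helper: the same (base_seed, checkpoint_step) pair always
    gives the same seed, so a resumed run can itself be reproduced.
    """
    # Seeding random.Random with a string is deterministic across runs.
    rng = random.Random(f"{base_seed}-{checkpoint_step}")
    return rng.randrange(2**31)


new_seed = seed_for_resume(base_seed=42, checkpoint_step=100_000)

# Assumed usage (not executed here): build a fresh sampler with the new seed
# and skip sampler.load_state_dict(...) entirely, e.g.
#   sampler = DynamicBucketingSampler(cuts, max_duration=..., shuffle=True, seed=new_seed)
```

The trade-off is that the resumed run does not continue from the exact position in the epoch. It sees the data in a new random order, which is usually acceptable for very large datasets.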
Thank you for your reply. I have another question: during training on large-scale data, I use
No, CPU RAM usage should be bounded by the buffer_size setting in the sampler.
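To illustrate why a buffer size bounds memory, here is a generic sketch of a streaming shuffle buffer. It is not Lhotse's actual implementation, and `buffered_shuffle` is a hypothetical name. It only shows the idea that a sampler reading lazily from a large manifest can hold at most `buffer_size` items in memory at any time.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def buffered_shuffle(items: Iterable[T], buffer_size: int, seed: int = 0) -> Iterator[T]:
    """Approximately shuffle a stream while holding at most buffer_size items."""
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            # Swap a random element to the end and emit it, so the buffer
            # never grows beyond buffer_size items.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # The input is exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


# The input can be arbitrarily long; only buffer_size items are resident.
shuffled = list(buffered_shuffle(range(100), buffer_size=10, seed=1))
```

If memory still grows with a design like this, the growth comes from somewhere other than the buffer, for example from the storage backend used to read features.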
Why does CPU memory keep increasing during training until it is full? Is the h5 file the problem? How can I free up memory?
Are you using HDF5 files? We have a workaround in the ASR dataset class, but IIRC it only slows down the memory leak. You can try the Lhotse Shar format or LilcomChunkyWriter instead; both are free from these issues. For large data, Lhotse Shar is recommended because it is much more I/O-efficient.
When I resumed training on 30,000 hours of data from a checkpoint, loading the state dict for DynamicBucketingSampler took a long time (more than 2 hours). Is this normal?
Here is my code: