Refactor asr_datamodule. #15
Conversation
-                Fbank(FbankConfig(num_mel_bins=80))
+                Fbank(FbankConfig(num_mel_bins=80), num_workers=4)
                 if self.args.on_the_fly_feats
                 else PrecomputedFeatures()
             ),
I think this closing parenthesis has to be moved 3 lines up, so the code looks like:
input_strategy=OnTheFlyFeatures(
Fbank(FbankConfig(num_mel_bins=80), num_workers=4)
) if self.args.on_the_fly_feats
else PrecomputedFeatures(),
return_cuts=...
Currently, when args.on_the_fly_feats = False, it tries to use OnTheFlyFeatures(PrecomputedFeatures()), which is an error.
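The grouping issue can be demonstrated in isolation. The classes below are minimal stand-ins for the lhotse types, just to show how the parentheses change what the conditional expression selects; they are not the real implementations:

```python
# Minimal stand-ins for the lhotse classes (NOT the real implementations),
# used only to demonstrate the operator grouping.
class OnTheFlyFeatures:
    def __init__(self, extractor):
        self.extractor = extractor

class Fbank:
    pass

class PrecomputedFeatures:
    pass

on_the_fly_feats = False

# Buggy grouping: the conditional expression only selects the *argument*,
# so with on_the_fly_feats = False this builds
# OnTheFlyFeatures(PrecomputedFeatures()).
buggy = OnTheFlyFeatures(
    Fbank() if on_the_fly_feats else PrecomputedFeatures()
)
assert isinstance(buggy, OnTheFlyFeatures)  # wrong: should be PrecomputedFeatures

# Fixed grouping: the conditional expression selects the whole strategy.
fixed = (
    OnTheFlyFeatures(Fbank()) if on_the_fly_feats else PrecomputedFeatures()
)
assert isinstance(fixed, PrecomputedFeatures)
```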
Thanks!
In that case, I think we should also change snowfall to fix that, as this block of code comes from snowfall.
See
https://github.com/k2-fsa/snowfall/blob/1f79957e9716c3f980c151df5b1d77bc4bb7ce78/egs/gigaspeech/asr/simple_v1/asr_datamodule.py#L337-L344
test = K2SpeechRecognitionDataset(
input_strategy=(
OnTheFlyFeatures(Fbank(FbankConfig(num_mel_bins=80)), num_workers=8)
if self.args.on_the_fly_feats
else PrecomputedFeatures()
),
return_cuts=self.args.return_cuts,
)
Yes, you're right.
#     persistent_workers=False,
# )

train_dl = LhotseDataLoader(
I would say be careful with LhotseDataLoader -- it is experimental, and I'm hoping to avoid needing it in the future. It overcomes some I/O issues with GigaSpeech, but for LibriSpeech you shouldn't see any performance difference versus a regular DataLoader.
The downside of LhotseDataLoader is that it doesn't have the elaborate shutdown mechanisms of PyTorch's DataLoader, so it might leave your script running after training has finished (i.e., everything runs OK, but the script doesn't exit by itself).
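A minimal sketch of the suggested alternative for LibriSpeech: the standard torch.utils.data.DataLoader, which shuts its workers down cleanly. The DummyBatches dataset below is a placeholder standing in for K2SpeechRecognitionDataset (whose __getitem__ already returns a whole batch), not the actual code from this PR:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class DummyBatches(Dataset):
    """Placeholder for K2SpeechRecognitionDataset: each item is a full batch."""
    def __len__(self):
        return 4
    def __getitem__(self, i):
        return {"inputs": torch.zeros(2, 10), "batch_idx": i}

# batch_size=None because each dataset item is already a whole batch,
# so the DataLoader must not add another batching dimension.
train_dl = DataLoader(
    DummyBatches(),
    batch_size=None,
    num_workers=0,
)
batches = list(train_dl)
assert len(batches) == 4
```

Unlike LhotseDataLoader, this loader participates in PyTorch's normal worker-shutdown protocol, so the script exits on its own when training finishes.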
             input_strategy=OnTheFlyFeatures(
-                Fbank(FbankConfig(num_mel_bins=80))
+                Fbank(FbankConfig(num_mel_bins=80), num_workers=4)
For LibriSpeech, remove the num_workers argument from OnTheFlyFeatures -- it will attempt to spawn extra processes that are not needed for LibriSpeech (they help with GigaSpeech, which has long OPUS recordings).
@@ -0,0 +1,362 @@
import argparse |
Not sure if it makes sense -- but maybe it's sufficient to have a single copy of this script one directory level up, and if any recipe requires non-standard processing, it would make its own copy at the "current" directory level?
How about putting a symlink to this file in the other model directories?
I was thinking that each model directory should be as self-contained as possible.
If someone wants to modify this file, they can replace the symlink with a copy of it.
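The symlink-then-copy workflow can be sketched as follows. The directory names here ("shared", "tdnn_lstm_ctc") are illustrative, not the actual repo layout:

```python
import os
import tempfile

# Illustrative layout: one canonical datamodule, symlinked into a model dir.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "shared"))
os.makedirs(os.path.join(root, "tdnn_lstm_ctc"))

# The single shared copy of the file...
target = os.path.join(root, "shared", "asr_datamodule.py")
with open(target, "w") as f:
    f.write("# shared datamodule\n")

# ...and a relative symlink to it from a model directory.
link = os.path.join(root, "tdnn_lstm_ctc", "asr_datamodule.py")
os.symlink(os.path.relpath(target, os.path.dirname(link)), link)
assert os.path.islink(link)

# A recipe that needs custom processing replaces the symlink with a real copy.
os.remove(link)
with open(link, "w") as f:
    f.write("# customized copy\n")
assert not os.path.islink(link)
```

A relative symlink (via os.path.relpath) keeps the link valid when the whole repo is moved or cloned to a different path.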
yeah that makes sense to me
    try:
        num_batches = len(dl)
    except TypeError:
        num_batches = None
num_batches = '?'
which will display more nicely,
and batch_str below won't need an extra if.
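The suggested simplification can be sketched like this (dl, num_batches, and batch_str are the names from the discussion; the helper function and the exact logging format are illustrative):

```python
def batch_progress(dl, batch_idx):
    # Iterable dataloaders (e.g. LhotseDataLoader) may not define __len__,
    # so len(dl) raises TypeError. Falling back to '?' instead of None
    # means the format string below needs no extra if-check.
    try:
        num_batches = len(dl)
    except TypeError:
        num_batches = "?"
    batch_str = f"batch {batch_idx}/{num_batches}"
    return batch_str

class NoLen:
    """Stands in for a dataloader without __len__."""
    def __iter__(self):
        return iter(())

assert batch_progress([0, 1, 2], 1) == "batch 1/3"
assert batch_progress(NoLen(), 1) == "batch 1/?"
```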
I've fixed all the comments. @pzelasko Thanks, and please accept the invitation to this repo. Ready to merge.
It throws the following error:
@pzelasko Could you please have a look at it? Thanks
(I am using the latest lhotse, with the commit d24e6faa6f26a5034cebf1d97dc1bd933f285a03)
The refactoring is based on the asr_datamodule from the GigaSpeech recipe in snowfall.