Commit
* Merge r1.13.0 main (#5570) * update branch Signed-off-by: ericharper <[email protected]> * Rename Speech Dataset Processor to Speech Data Processor (#5378) Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Megatron Export Update (#5343) * export update for Megatron + change ORT optimization Signed-off-by: David Mosallanezhad <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated export_utils to use autocast instead of manually casting >:/ Signed-off-by: David Mosallanezhad <[email protected]> * removed dtype from LayerNorm Signed-off-by: David Mosallanezhad <[email protected]> * added comment Signed-off-by: David Mosallanezhad <[email protected]> * reverting changes on FloatCast Signed-off-by: David Mosallanezhad <[email protected]> * Cherry-picked changes from megatron-norm Signed-off-by: Boris Fomitchev <[email protected]> * updated asr_model import to cast_utils Signed-off-by: David Mosallanezhad <[email protected]> * updated del onnx_model place Signed-off-by: David Mosallanezhad <[email protected]> * changed ort optimization to basic -> temp fix Signed-off-by: David Mosallanezhad <[email protected]> Signed-off-by: David Mosallanezhad <[email protected]> Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: David Mosallanezhad <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Boris Fomitchev <[email protected]> * Disable sync_batch_comm in validation_step for GPT (#5397) * disable sync_batch_comm in validation_step Signed-off-by: ericharper <[email protected]> * Read sync_batch_comm from config or default to False Signed-off-by: Markel Sanz Ausin <[email protected]> * Update megatron_gpt_config to default sync_batch_comm to False to avoid CUDA error Signed-off-by: Markel Sanz Ausin <[email protected]> * Empty Signed-off-by: MaximumEntropy 
<[email protected]> * Comment out test Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: Markel Sanz Ausin <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Markel Sanz Ausin <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> * Radtts 1.13 (#5451) * [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue (#5358) * [TTS] add CI test for RADTTS training recipe. Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> * Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (#5339) (#5478) * Initial refactor Signed-off-by: MaximumEntropy <[email protected]> * Resolve config before passing to load_from_checkpoint Signed-off-by: MaximumEntropy <[email protected]> * Fixes for model parallel and nemo restore Signed-off-by: MaximumEntropy <[email protected]> * Fixes for eval Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert config changes Signed-off-by: MaximumEntropy <[email protected]> * Refactor Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix typo Signed-off-by: MaximumEntropy <[email protected]> * Remove comments Signed-off-by: MaximumEntropy <[email protected]> * Minor Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix validation reconfiguration Signed-off-by: 
MaximumEntropy <[email protected]> * Remove old comment Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixes for test_ds Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * export_utils bugfix (#5480) * updated export_utils Signed-off-by: David Mosallanezhad <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Mosallanezhad <[email protected]> Co-authored-by: David Mosallanezhad <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Export fixes for Riva (#5496) * Export fixes for Riva Signed-off-by: Boris Fomitchev <[email protected]> * Cleaning up training_utils Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Boris Fomitchev <[email protected]> * added set_start_method + function param bugfix (#5539) * added set_start_method + function param bugfix Signed-off-by: David Mosallanezhad <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * upper bound torchmetrics Signed-off-by: ericharper <[email protected]> Signed-off-by: David Mosallanezhad <[email protected]> Signed-off-by: ericharper <[email protected]> Co-authored-by: David Mosallanezhad <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: ericharper <[email protected]> * remove notebook (#5548) Signed-off-by: ericharper 
<[email protected]> Signed-off-by: ericharper <[email protected]> * update readme Signed-off-by: ericharper <[email protected]> * update branch Signed-off-by: ericharper <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert Signed-off-by: ericharper <[email protected]> * revert Signed-off-by: ericharper <[email protected]> * revert Signed-off-by: ericharper <[email protected]> * revert Signed-off-by: ericharper <[email protected]> * revert Signed-off-by: ericharper <[email protected]> * revert Signed-off-by: ericharper <[email protected]> * revert Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: David Mosallanezhad <[email protected]> Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Markel Sanz Ausin <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: David <[email protected]> Co-authored-by: David Mosallanezhad <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Markel Sanz Ausin <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Optimized loop and bugfix in SDE (#5573) - Fixed bug with loading custom data attributes from JSON in Speech Data Explorer Signed-off-by: George Zelenfroynd <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Update 
torchmetrics (#5566) * add task arg Signed-off-by: nithinraok <[email protected]> * update state Signed-off-by: nithinraok <[email protected]> Signed-off-by: nithinraok <[email protected]> Co-authored-by: Taejin Park <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * remove useless files. (#5580) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * add initial NFA code Signed-off-by: Elena Rastorgueva <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Elena Rastorgueva <[email protected]> * Make use of the specified device during viterbi decoding Signed-off-by: Elena Rastorgueva <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Elena Rastorgueva <[email protected]> * Fix CodeQL notes Signed-off-by: Elena Rastorgueva <[email protected]> * Fix CodeQL warning Signed-off-by: Elena Rastorgueva <[email protected]> * Add an option to defer data setup from ``__init__`` to ``setup`` (#5569) * Add an option to defer dataloader setup from __init__ to setup Signed-off-by: Ante Jukić <[email protected]> * Updated doc Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Make utt_id specified by number of parts of audio_filepath user wishes to use Signed-off-by: Elena Rastorgueva <[email protected]> * remove audio_sr TODO - reduce risk of silent bugs Signed-off-by: Elena Rastorgueva <[email protected]> * Add check that model is CTC Signed-off-by: Elena Rastorgueva <[email protected]> * Remove unused import Signed-off-by: Elena Rastorgueva <[email protected]> * Text generation improvement (UI client, data parallel support) (#5437) * Squashed commit of the following: commit a5e124f34be31bd6eafe5e5fdf5bedcd0d50915c Author: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Thu Oct 13 15:07:42 2022 +0000 [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci commit 35b424044fe80c3081e7756ab21244f701716f7e Author: Yi Dong <[email protected]> Date: Thu Oct 13 08:04:49 2022 -0700 get rid of base Signed-off-by: Yi Dong <[email protected]> commit 2955210e2311791543538cfbb5ad26b79414c954 Merge: d52edef8c eaf6757ca Author: Yi Dong <[email protected]> Date: Thu Oct 13 13:17:02 2022 +0000 Merge branch 'universal_prompt' of github.com:NVIDIA/NeMo into universal_prompt commit d52edef8cd7b36593838fb270047e80f8ccb652e Author: Yi Dong <[email protected]> Date: Thu Oct 13 13:16:24 2022 +0000 align with main Signed-off-by: Yi Dong <[email protected]> commit eaf6757ca5be8e099492f57c81d984429b0ad49c Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Thu Oct 13 13:12:11 2022 +0000 [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci commit c4b86d97626ea0721bf8fb4c0a45dec5becc94c9 Author: Yi Dong <[email protected]> Date: Thu Oct 13 13:10:58 2022 +0000 same as main Signed-off-by: Yi Dong <[email protected]> commit e335de51bcc0d681c58b568c3d8c238bc5687c3b Merge: c231086e0 4463a9fe9 Author: Yi Dong <[email protected]> Date: Thu Oct 13 13:08:09 2022 +0000 Merge branch 'main' into universal_prompt commit c231086e057f1efaa915f691d84664cb3d5aad85 Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Wed Oct 12 19:59:12 2022 +0000 [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci commit 6a821a4b49a23dd3408a706a2a3dd393149b0bb1 Author: Yi Dong <[email protected]> Date: Wed Oct 12 19:56:17 2022 +0000 default to pad Signed-off-by: Yi Dong <[email protected]> commit 9d908e39fef1beed9ba2da4d1a6806161eb7ef25 Author: Yi Dong <[email protected]> Date: Wed Oct 12 19:55:44 2022 +0000 add the option 
to pad the tokens Signed-off-by: Yi Dong <[email protected]> commit 876dc395b43fdeeaa2bcbbe13c76523633764c33 Merge: fbb0f4035 fe3c77ee9 Author: Yi Dong <[email protected]> Date: Wed Oct 12 19:20:47 2022 +0000 Merge branch 'fix_global_init' into universal_prompt commit fe3c77ee93ab6cf3ea152db68cb6beefcac2a392 Author: Yi Dong <[email protected]> Date: Wed Oct 12 18:59:49 2022 +0000 fix import again Signed-off-by: Yi Dong <[email protected]> commit fbb0f4035c6cd6bfefed50a20605503de8c1dccb Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Wed Oct 12 16:00:24 2022 +0000 [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci commit 372ca8c0d7988f2339b15888dc72aa21f4fb6937 Author: Yi Dong <[email protected]> Date: Wed Oct 12 15:58:32 2022 +0000 enable server Signed-off-by: Yi Dong <[email protected]> commit cbe05d9fbc978f812cfbb671f45f147f300713c4 Author: Yi Dong <[email protected]> Date: Wed Oct 12 13:07:28 2022 +0000 fix comment error Signed-off-by: Yi Dong <[email protected]> commit 1948048922e726ec6131e44b1a745389f18d4ef2 Merge: 232c2cce3 984f5c09a Author: Yi Dong <[email protected]> Date: Wed Oct 12 13:05:30 2022 +0000 Merge branch 'fix_global_init' into universal_prompt commit 232c2cce34d7a8b902da406706f3dd9b39475091 Merge: 34c8a68df 658243fb6 Author: Yi Dong <[email protected]> Date: Wed Oct 12 12:50:00 2022 +0000 Merge branch 'fix_global_init' into universal_prompt commit 984f5c09a6dbf1d1fb5aa30ed9b0df188e66a50f Merge: 658243fb6 3fda5de46 Author: Yi Dong <[email protected]> Date: Wed Oct 12 08:42:11 2022 -0400 Merge branch 'main' into fix_global_init commit 658243fb6580191b5d60edd30cde16dcc23cbb85 Author: Yi Dong <[email protected]> Date: Wed Oct 12 12:40:57 2022 +0000 fix import error Signed-off-by: Yi Dong <[email protected]> commit 8e0fe1cad05ec288ec122b3cd0e139a96872e08c Author: Yi Dong <[email protected]> Date: Tue Oct 11 22:44:12 2022 +0000 update the fused kernel 
Signed-off-by: Yi Dong <[email protected]> commit 536cf6bef9447b75843fad630729c47a2fba35f3 Author: Yi Dong <[email protected]> Date: Tue Oct 11 14:44:52 2022 -0700 add the missing file Signed-off-by: Yi Dong <[email protected]> commit 1b437ec41dc5e354453ce0a089bca0171cbcb6c2 Author: Yi Dong <[email protected]> Date: Tue Oct 11 14:43:14 2022 -0700 fix fused softmax Signed-off-by: Yi Dong <[email protected]> commit 7813f60e05f9783af61f8c14ec1cb0c6c4f1f263 Author: Yi Dong <[email protected]> Date: Tue Oct 11 14:16:48 2022 -0700 move global step to base Signed-off-by: Yi Dong <[email protected]> commit 34c8a68df084b18d377e84415d9f07b2cd6673dd Author: Yi Dong <[email protected]> Date: Thu Oct 6 13:50:11 2022 +0000 fix pipeline for eval Signed-off-by: Yi Dong <[email protected]> commit eee5d38218f26660c3ffebe9f615c850c80a1f0d Author: Yi Dong <[email protected]> Date: Thu Oct 6 13:48:22 2022 +0000 fix for pipleline parallel Signed-off-by: Yi Dong <[email protected]> commit 323bca73e7ef6099ee79c0a2fffac7b709ed6c5d Merge: 125e49947 e3b4c4d1f Author: Yi Dong <[email protected]> Date: Wed Oct 5 19:29:13 2022 +0000 Merge branch 'universal_prompt' of github.com:NVIDIA/NeMo into universal_prompt commit 125e4994760448ff75dd9328395813eda1c87547 Author: Yi Dong <[email protected]> Date: Wed Oct 5 19:29:04 2022 +0000 add share option Signed-off-by: Yi Dong <[email protected]> commit e3b4c4d1f7346c9fa596f3cca6d4df0a9e05c368 Author: Yi Dong <[email protected]> Date: Wed Oct 5 11:43:48 2022 -0700 make sure consolidation works Signed-off-by: Yi Dong <[email protected]> commit a5c833964ecf05dc460ca1da69275c4019742150 Merge: 2a07ab52d abcb74be2 Author: Yi Dong <[email protected]> Date: Wed Oct 5 18:40:29 2022 +0000 Merge branch 'universal_prompt' of github.com:NVIDIA/NeMo into universal_prompt commit 2a07ab52d95f15ba666823028c69e23825666c05 Author: Yi Dong <[email protected]> Date: Wed Oct 5 18:40:23 2022 +0000 added requirement Signed-off-by: Yi Dong <[email protected]> commit 
3abecd9dd1611993a87c537636abe7f7e6a9b04c Author: Yi Dong <[email protected]> Date: Wed Oct 5 18:39:42 2022 +0000 added a simple web server Signed-off-by: Yi Dong <[email protected]> commit abcb74be2caf1cdec40eb9ba2be4dde4d45a3b4b Author: Yi Dong <[email protected]> Date: Wed Oct 5 06:54:12 2022 -0700 fix empty val loss Signed-off-by: Yi Dong <[email protected]> commit b8eb92ac4a0d665570af75e34c9ba3c2e2420c26 Author: Yi Dong <[email protected]> Date: Tue Oct 4 19:25:30 2022 -0700 text gen working Signed-off-by: Yi Dong <[email protected]> commit d59f3e3f3a6fd19736d1c5706fed65a3dd4049ba Author: Yi Dong <[email protected]> Date: Tue Oct 4 16:08:40 2022 -0700 first change Signed-off-by: Yi Dong <[email protected]> commit 59d077585e6962a669b824af58f64e8a0bea6547 Author: Yi Dong <[email protected]> Date: Tue Oct 4 15:00:40 2022 -0700 revert Signed-off-by: Yi Dong <[email protected]> commit 12a0f3902d99e9179403644bd951c045df716ca7 Author: Yi Dong <[email protected]> Date: Tue Oct 4 21:26:23 2022 +0000 init imp Signed-off-by: Yi Dong <[email protected]> commit 62a15dfd943cc48be495ac61b9f2f00995775c5f Merge: 82c90d2cd e0cc6b767 Author: Yi Dong <[email protected]> Date: Tue Oct 4 11:58:26 2022 -0700 Merge branch 'main' into universal_prompt commit 82c90d2cd0fd156f16a4b899f8c741d598f33990 Author: Yi Dong <[email protected]> Date: Tue Oct 4 11:17:13 2022 -0700 add sync Signed-off-by: Yi Dong <[email protected]> commit 9819b703eef877d90cd1257bf3610c69de9b4d7e Author: Yi Dong <[email protected]> Date: Sun Oct 2 17:52:34 2022 -0700 fix save model Signed-off-by: root <[email protected]> commit e4937e2fc5fb7d70754c97668416e4a69c3079fe Author: Yi Dong <[email protected]> Date: Sat Oct 1 18:56:09 2022 +0000 working Signed-off-by: Yi Dong <[email protected]> commit b73b06d1c7cf5417a6d87cb33d8ed83a57e38b7b Author: Yi Dong <[email protected]> Date: Sat Oct 1 17:34:03 2022 +0000 calcuate the mask Signed-off-by: Yi Dong <[email protected]> commit 9db3bc13eb65a94a475b837603351da68e3745bc 
Author: Yi Dong <[email protected]> Date: Fri Sep 30 23:26:32 2022 +0000 fix bug in datasets Signed-off-by: Yi Dong <[email protected]> commit f289900375d4412f53f8110be00fec6587627550 Author: Yi Dong <[email protected]> Date: Fri Sep 30 22:29:40 2022 +0000 update the code Signed-off-by: Yi Dong <[email protected]> commit 8e28a1f208aabaab72dbe769e72756baada04d99 Author: Yi Dong <[email protected]> Date: Fri Sep 30 21:52:52 2022 +0000 added new ds Signed-off-by: Yi Dong <[email protected]> commit 8d41315bab7ce90e200a8a7d1023c34f8e046897 Author: Yi Dong <[email protected]> Date: Fri Sep 30 18:57:09 2022 +0000 added new files Signed-off-by: Yi Dong <[email protected]> commit 984e0e94e15e16323c1ba1ca2efeabd84f69463f Merge: cbe8b7ab1 fa6cd8588 Author: Yi Dong <[email protected]> Date: Thu Sep 29 21:43:29 2022 +0000 Merge branch 'llm-prompt-learning-improvements' into universal_prompt commit fa6cd858839277939446afe7275976078d54c512 Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Thu Sep 29 16:47:30 2022 +0000 [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci commit 78ba46e5d6fde1be53c08e1e30a54cce59824be0 Merge: 7d6d46742 8d670bc77 Author: Virginia Adams <[email protected]> Date: Thu Sep 29 09:43:27 2022 -0700 Merge branch 'main' into llm-prompt-learning-improvements commit 7d6d46742170a66758287a207d67e1b1bfd15613 Author: Virginia Adams <[email protected]> Date: Thu Sep 29 16:42:43 2022 +0000 Removed inference step and added sentence peice check to predict step Signed-off-by: Virginia Adams <[email protected]> commit 20fd265acd6f7f9912cf52155fe66ccfa6b201a2 Author: Virginia Adams <[email protected]> Date: Thu Sep 29 15:26:32 2022 +0000 fixed first stage check for pipeline parallel T5 pt Signed-off-by: Virginia Adams <[email protected]> commit 3637be2b258c8d9028856f9971edb7da4a8121f0 Merge: a3ea722fd 986a76612 Author: Virginia Adams <[email protected]> Date: Wed Sep 28 10:23:30 2022 
-0700 Merge branch 'main' into llm-prompt-learning-improvements commit a3ea722fdc12fbcc5989b76ef5643a574b763bc4 Merge: 770967a52 971485ce7 Author: Virginia Adams <[email protected]> Date: Mon Sep 26 13:35:52 2022 -0700 Merge branch 'main' into llm-prompt-learning-improvements commit 770967a5251a474b6dcc2d44bf9a2076adbcb604 Merge: d23bf6c30 e3ac280a8 Author: Virginia Adams <[email protected]> Date: Mon Sep 26 10:17:03 2022 -0700 Merge branch 'main' into llm-prompt-learning-improvements commit d23bf6c30acc0e3f6af9b4e24547669866a34d62 Merge: de6a31651 333d2b749 Author: Virginia Adams <[email protected]> Date: Mon Sep 26 10:05:16 2022 -0700 Merge branch 'llm-prompt-learning-improvements' of https://github.com/NVIDIA/NeMo into llm-prompt-learning-improvements commit de6a31651e63d88a42b971794d93f18ff5a3cdff Author: Virginia Adams <[email protected]> Date: Mon Sep 26 17:00:53 2022 +0000 Updated PP check to be on first stage pipeline only Signed-off-by: Virginia Adams <[email protected]> commit 333d2b7498e6742ce66436f733c980a74616900c Merge: 592c0986a a39fc925a Author: Virginia Adams <[email protected]> Date: Fri Sep 23 16:11:21 2022 -0700 Merge branch 'main' into llm-prompt-learning-improvements commit 592c0986a476a91b57b8605d7b70830d7acfa021 Author: Virginia Adams <[email protected]> Date: Fri Sep 23 23:08:41 2022 +0000 Fixed unused import and CI test bug Signed-off-by: Virginia Adams <[email protected]> commit ea9cd82d85638bc60ae4ad7ef105db931c8e3455 Merge: ce4b72c8c b566c2d0e Author: Virginia Adams <[email protected]> Date: Fri Sep 23 18:57:25 2022 +0000 Merge branch 'llm-prompt-learning-improvements' of https://github.com/NVIDIA/NeMo into llm-prompt-learning-improvements commit ce4b72c8c52f32be336e323dd78a38089edc3e7c Author: Virginia Adams <[email protected]> Date: Fri Sep 23 18:57:16 2022 +0000 Switch to import from base class Signed-off-by: Virginia Adams <[email protected]> commit b566c2d0e35a068f758fd1310bc620a47be4590b Merge: 6621f2854 e872061ac Author: Virginia 
Adams <[email protected]> Date: Fri Sep 23 10:09:03 2022 -0700 Merge branch 'main' into llm-prompt-learning-improvements commit 6621f28543828a48484a5637f6c9f3ccb23a5b02 Author: Virginia Adams <[email protected]> Date: Wed Sep 14 20:47:35 2022 +0000 python format fix Signed-off-by: Virginia Adams <[email protected]> commit 8deafc8987b6af5f7b99a250310f57a40198c37f Author: Virginia Adams <[email protected]> Date: Wed Sep 14 20:28:02 2022 +0000 Save .nemo on new best val score Signed-off-by: Virginia Adams <[email protected]> commit 761bd36969cb465d6a129e9eee6ce1f883d3cf41 Author: Virginia Adams <[email protected]> Date: Wed Sep 14 18:03:19 2022 +0000 Added automatic checkpoint to nemo file method Signed-off-by: Virginia Adams <[email protected]> commit 3be4ed57b6cd3ddfe4876d78650dfe8fe794598b Author: Virginia Adams <[email protected]> Date: Wed Sep 14 02:11:56 2022 +0000 Make GPT use base prompt learning model class: Signed-off-by: Virginia Adams <[email protected]> Signed-off-by: Yi Dong <[email protected]> * fix LGTM Signed-off-by: Yi Dong <[email protected]> * fix validation Signed-off-by: Yi Dong <[email protected]> * change for the lm eval Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * make text generation work in data parallel environment Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * implement the service with rest service Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * surpress log Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix Signed-off-by: MaximumEntropy <[email protected]> * Fix Signed-off-by: MaximumEntropy <[email protected]> * Fixes Signed-off-by: 
MaximumEntropy <[email protected]> * Update config Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Restore function needed for NMT Signed-off-by: MaximumEntropy <[email protected]> * handles no answer only Signed-off-by: Yi Dong <[email protected]> * Fix config Signed-off-by: MaximumEntropy <[email protected]> * added knn to web Signed-off-by: Yi Dong <[email protected]> * fix lgtm.com comments Signed-off-by: Yi Dong <[email protected]> * output the retrieved context Signed-off-by: Yi Dong <[email protected]> * allow no neighbor query Signed-off-by: Yi Dong <[email protected]> * remove the imports Signed-off-by: Yi Dong <[email protected]> * warn only once Signed-off-by: Yi Dong <[email protected]> * Change output file format from JSON to JSONL Signed-off-by: MaximumEntropy <[email protected]> * new t0 dataset Signed-off-by: Yi Dong <[email protected]> * Add T0 data preproc scripts Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Merge and multiprocessing Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix for is_correct Signed-off-by: MaximumEntropy <[email protected]> * fix epoch > 2 Signed-off-by: Yi Dong <[email protected]> * handles multiple dataloader Signed-off-by: Yi Dong <[email protected]> * remove template Signed-off-by: Yi Dong <[email protected]> * Refactor T0 dataset Signed-off-by: MaximumEntropy <[email protected]> * Add script to merge train folder into individual training files to minimize number of blends Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, 
see https://pre-commit.ci * added on the fly service Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add combo instance Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added combo service Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * send weights back to server Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix index store Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Minor changes Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add reset button Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add add eos Signed-off-by: Yi Dong <[email protected]> * use a seperate bert service Signed-off-by: Yi Dong <[email protected]> * no loss of accuracy Signed-off-by: Yi Dong <[email protected]> * pin the gradio version Signed-off-by: Yi Dong <[email protected]> * Remove bin compat Signed-off-by: MaximumEntropy <[email protected]> * Fix header lines Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * evaluate based on text generation Signed-off-by: Yi Dong <[email protected]> * exact match result aggregation Signed-off-by: Yi Dong <[email protected]> * working SP and SA Signed-off-by: Yi Dong <[email protected]> * sync Signed-off-by: Yi Dong <[email protected]> * fix checkpoint 
Signed-off-by: Yi Dong <[email protected]> * fix eval Signed-off-by: Yi Dong <[email protected]> * backup states Signed-off-by: Yi Dong <[email protected]> * backup states reset Signed-off-by: Yi Dong <[email protected]> * fix the bug Signed-off-by: Yi Dong <[email protected]> * fix evaluation for sentence piece Signed-off-by: Yi Dong <[email protected]> * fix a bug Signed-off-by: Yi Dong <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * potential fix in the future Signed-off-by: Yi Dong <[email protected]> * remove the universal codes Signed-off-by: Yi Dong <[email protected]> * remove universal strategy Signed-off-by: Yi Dong <[email protected]> * address reviewer comment Signed-off-by: Yi Dong <[email protected]> Signed-off-by: Yi Dong <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: MaximumEntropy <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Add align function docstrings and make most args optional Signed-off-by: Elena Rastorgueva <[email protected]> * Remove redundant returns of viterbi and log probs matrices Signed-off-by: Elena Rastorgueva <[email protected]> * Rename h# to <initial_silence> Signed-off-by: Elena Rastorgueva <[email protected]> * Update manifest format description in README Signed-off-by: Elena Rastorgueva <[email protected]> * always remove any spaces from utt_id Signed-off-by: Elena Rastorgueva <[email protected]> * Patch the hanging of threads on very large stderr (#5589) (#5590) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * O2 style amp for gpt3 ptuning (#5246) * 
enable amp o2 plugin Signed-off-by: Jimmy Zhang <[email protected]> * only create master param if param requires gradient Signed-off-by: Jimmy Zhang <[email protected]> * remove pytorch autocast Signed-off-by: Jimmy Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jimmy Zhang <[email protected]> * Update optimizer_with_main_params.py Signed-off-by: JimmyZhang12 <[email protected]> * create master grad only if param group requires grad Signed-off-by: Jimmy Zhang <[email protected]> * fix grad scaler for pp > 1 Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: Jimmy Zhang <[email protected]> Signed-off-by: JimmyZhang12 <[email protected]> Co-authored-by: Jimmy Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Better patch hydra (#5591) (#5592) * Readd buffereing and thread drain to Hydra Launcher Signed-off-by: smajumdar <[email protected]> * Readd buffereing and thread drain to Hydra Launcher Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: smajumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Elena Rastorgueva <[email protected]> * Yet another fix with hydra multirun (#5594) (#5595) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Signed-off-by: Elena Rastorgueva <[email 
protected]> * Add RETRO model documentation (#5578) * added retro doc Signed-off-by: Yi Dong <[email protected]> * finish data part Signed-off-by: Yi Dong <[email protected]> * added the data format Signed-off-by: Yi Dong <[email protected]> * added training script Signed-off-by: Yi Dong <[email protected]> * added training and evaluation steps Signed-off-by: Yi Dong <[email protected]> * edit the text Signed-off-by: Yi Dong <[email protected]> * added the images Signed-off-by: Yi Dong <[email protected]> * fix beginning Signed-off-by: Yi Dong <[email protected]> * fix the grammar Signed-off-by: Yi Dong <[email protected]> * trim it down Signed-off-by: Yi Dong <[email protected]> * add wandb option Signed-off-by: Yi Dong <[email protected]> * add reference Signed-off-by: Yi Dong <[email protected]> * fix path Signed-off-by: Yi Dong <[email protected]> * added the parameters table Signed-off-by: Yi Dong <[email protected]> * fix section Signed-off-by: Yi Dong <[email protected]> Signed-off-by: Yi Dong <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Fix: setup_multiple validation/test data (#5585) Fix: setup_multiple validation/test data (#5585) Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Move to optimizer based EMA implementation (#5169) * Move to optimizer Signed-off-by: SeanNaren <[email protected]> * Fix replacing weights Signed-off-by: SeanNaren <[email protected]> * Allow swapping of weights be optional Signed-off-by: SeanNaren <[email protected]> * Save 2 models Signed-off-by: SeanNaren <[email protected]> * Use different hook Signed-off-by: SeanNaren <[email protected]> * Expose cpu device Signed-off-by: SeanNaren <[email protected]> * Add clause to see if this fixes issue with O2 optimizer Signed-off-by: SeanNaren <[email protected]> * Try to get O2 working Signed-off-by: SeanNaren <[email protected]> * WIP Signed-off-by: 
SeanNaren <[email protected]> * Fixes Signed-off-by: SeanNaren <[email protected]> * Fixes to tests Signed-off-by: SeanNaren <[email protected]> * Add guard Signed-off-by: SeanNaren <[email protected]> * Remove import Signed-off-by: SeanNaren <[email protected]> * Add guard Signed-off-by: SeanNaren <[email protected]> * Add comment Signed-off-by: SeanNaren <[email protected]> * Remove overwrite Signed-off-by: SeanNaren <[email protected]> * Add BatchNorm, currently tests fail Signed-off-by: SeanNaren <[email protected]> * Fix tests/functionality for batch norm Signed-off-by: SeanNaren <[email protected]> * Get rid of NLP changes Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * AIStore for ASR datasets (#5462) AIStore for ASR datasets Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Add support for MHA adapters to ASR (#5396) * Convert AbstractAdapterModule to AbstractAdapterMixin Signed-off-by: smajumdar <[email protected]> * Temporary fixes to new signature of mixin Signed-off-by: smajumdar <[email protected]> * Add adapter util for constants, add all mha adapters. 
Signed-off-by: smajumdar <[email protected]> * Update name of function Signed-off-by: smajumdar <[email protected]> * Roll back changes to convASR Signed-off-by: smajumdar <[email protected]> * Convert AbstractAdapterModule to AbstractAdapterMixin Signed-off-by: smajumdar <[email protected]> * First draft of Conformer support for MHA attention Signed-off-by: smajumdar <[email protected]> * Add some preliminary tests Signed-off-by: smajumdar <[email protected]> * Add support for projection of the hidden dimension for attention Signed-off-by: smajumdar <[email protected]> * Add support for squeezeformer Signed-off-by: smajumdar <[email protected]> * Update train adapter config Signed-off-by: smajumdar <[email protected]> * Add tests for squeezeformer and unit tests for new modules Signed-off-by: smajumdar <[email protected]> * Update config for hp search,set limits on modules for conformer and squeezeformer, update adapter mixin, add cache to import_from_class_path Signed-off-by: smajumdar <[email protected]> * Update location of adapters Signed-off-by: smajumdar <[email protected]> * Add pre_norm for proper attention learning, Fix the issue with nan/inf in pos_bias_u and pos_bias_v Signed-off-by: smajumdar <[email protected]> * Update expmanager to clean up checkpoints Signed-off-by: smajumdar <[email protected]> * Fix style Signed-off-by: smajumdar <[email protected]> * Add docstrings and update tests Signed-off-by: smajumdar <[email protected]> * Add docstrings and update tests Signed-off-by: smajumdar <[email protected]> * Add docstrings and update tests Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update training scripts Signed-off-by: smajumdar <[email protected]> * Update config and docs Signed-off-by: smajumdar <[email protected]> * Expose nemo delete function Signed-off-by: smajumdar <[email protected]> * Correct adapter partial state saving 
Signed-off-by: smajumdar <[email protected]> * Correct a bug with state management of adapter tokens Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pull down EMA test Signed-off-by: smajumdar <[email protected]> * Correct name of adapter module utility class Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: smajumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Elena Rastorgueva <[email protected]> * Remove unused TTS eval functions w/ pesq and pystoi dependencies (#5605) (#5606) Signed-off-by: Jocelyn Huang <[email protected]> Signed-off-by: Jocelyn Huang <[email protected]> Signed-off-by: Jocelyn Huang <[email protected]> Co-authored-by: Jocelyn <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Create separator parameter Signed-off-by: Elena Rastorgueva <[email protected]> * Call align function with hydra config Signed-off-by: Elena Rastorgueva <[email protected]> * update usage example Signed-off-by: Elena Rastorgueva <[email protected]> * Update Dockerfile (#5614) (#5616) Pinned to use `numba==0.53.1` to avoid crashing in training with `num_workers > 0`. This is just a temporary workaround, still need to fix it in the future. 
Signed-off-by: He Huang (Steve) <[email protected]> Signed-off-by: He Huang (Steve) <[email protected]> Signed-off-by: He Huang (Steve) <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Make separate pretrained_name and model_path parameters Signed-off-by: Elena Rastorgueva <[email protected]> * make "optional" tags bold in markdown Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Move non-main functions to utils dir Signed-off-by: Elena Rastorgueva <[email protected]> * Temp workaround: Disable test with cache_audio=True since it is failing in CI (#5607) (#5615) Signed-off-by: Ante Jukić <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * [TTS] fix ranges of char set for accented letters. (#5607) * [TTS] fix ranges of char set for accented letters. * remove digits pattern and added unit tests for math operators. 
Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Change success message to reduce confusion (#5621) Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Update documentation and tutorials for Adapters (#5610) * Improve docs for adapter and tests Signed-off-by: smajumdar <[email protected]> * Improve docs for adapter and tests Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Rename test file Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Elena Rastorgueva <[email protected]> * [TTS] add type hints and change varialbe names for tokenizers and g2p (#5602) * [TTS] add type hints and change variable names for tokenizers and g2p Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * 1. Added missing import for gather_objects. (#5627) Signed-off-by: Micha Livne <[email protected]> Signed-off-by: Micha Livne <[email protected]> Co-authored-by: Micha Livne <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * [TTS][ZH] add fastpitch and hifigan model NGC urls and update NeMo docs. 
(#5596) (#5625) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Fixed RadTTS unit test (#5572) Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * remove tests (#5633) Signed-off-by: ericharper <[email protected]> Signed-off-by: ericharper <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * [TTS][DOC] add notes about automatic conversion to target sampling rates. (#5624) (#5634) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Conformer local attention (#5525) * local attn and merge Signed-off-by: sam1373 <[email protected]> * optional Signed-off-by: sam1373 <[email protected]> * override Signed-off-by: sam1373 <[email protected]> * incorporate comments Signed-off-by: sam1373 <[email protected]> * update Signed-off-by: sam1373 <[email protected]> * fix Signed-off-by: sam1373 <[email protected]> * comment Signed-off-by: sam1373 <[email protected]> * changes, test Signed-off-by: sam1373 <[email protected]> * changes Signed-off-by: sam1373 <[email protected]> * check att context Signed-off-by: sam1373 <[email protected]> * readme link Signed-off-by: sam1373 <[email protected]> * utils Signed-off-by: sam1373 <[email protected]> * update Signed-off-by: sam1373 <[email protected]> Signed-off-by: sam1373 <[email protected]> Signed-off-by: Samuel Kriman <[email protected]> Co-authored-by: Vahid Noroozi <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Add core classes and functions for online clustering diarizer part 1 (#5526) * Add core classes and functions for online clustering diarizer Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add audio to labels code Signed-off-by: Taejin Park <[email protected]> * 
resolve type errors Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added unit=tests for very short audio Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Filled all missing docstrings Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * resolved conflict and added missing docstrings Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixed unit-test errors Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix the wrongly added file - megatron_gpt_model.py Signed-off-by: Taejin Park <[email protected]> * Fix wrongly included file - megatron_gpt_model.py Signed-off-by: Taejin Park <[email protected]> * resolve code quality issue Signed-off-by: Taejin Park <[email protected]> * Fixed unit-test errors and bugs Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * changed total_sec for offline_clustering toy_data in unit-tests Signed-off-by: Taejin Park <[email protected]> * fixed merging index offset bug Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * only including part 1 files Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused function Signed-off-by: Taejin Park <[email protected]> * fixed unused imports Signed-off-by: Taejin Park <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * divided nmesc_clustering.py into two and reflected first-pass comments Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding offline/online_clustering.py Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix code QL autocomment Signed-off-by: Taejin Park <[email protected]> * Removed unused imports Signed-off-by: Taejin Park <[email protected]> * Update nemo/collections/asr/parts/utils/online_clustering.py Co-authored-by: Sean Naren <[email protected]> Signed-off-by: Taejin Park <[email protected]> * Reflected comments Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * resolved code scanning issue Signed-off-by: Taejin Park <[email protected]> * Update nemo/collections/asr/parts/utils/offline_clustering.py Co-authored-by: Sean Naren <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: Taejin Park <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Sean Naren <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * [STT] Add Esperanto (Eo) ASR Conformer-CTC and Conformer-Transducer models (#5639) (#5641) * add stt_eo_conformer_ctc_large model * stt_eo_conformer_transducer_large Co-authored-by: Andrei Andrusenko <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Removed unused import Signed-off-by: Elena Rastorgueva <[email protected]> * Specify that filepaths need to be absolute Signed-off-by: Elena Rastorgueva <[email protected]> * replaces any spaces in utt_id with dashes 
Signed-off-by: Elena Rastorgueva <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Elena Rastorgueva <[email protected]> * Make hydra script callable by another script Signed-off-by: Elena Rastorgueva <[email protected]> * do not specify default model or model_downsample_factor Signed-off-by: Elena Rastorgueva <[email protected]> * [Dockerfile] Remove AIS archive from docker image (#5629) Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Measure audio_sr from audio instead of needing to specify Signed-off-by: Elena Rastorgueva <[email protected]> * [TTS][ZH] Disambiguate polyphones with augmented dict and Jieba segmenter for Chinese FastPitch (#5541) * Chinese TTS replaces default pypinyin dict * Add jieba word segmenter as an option Signed-off-by: Yuekai Zhang <[email protected]> Co-authored-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Make separate parameters for device of transcription and viterbi steps Signed-off-by: Elena Rastorgueva <[email protected]> * Add mention of gecko Signed-off-by: Elena Rastorgueva <[email protected]> * [workflow] add exclude labels option to ignore cherry-picks in release changelog. (#5645) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * [TTS][ZH] bugfix for the tutorial and add NGC CLI installation guide. (#5643) (#5647) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * [Add] ASR+VAD Inference Pipeline (#5575) Added offline ASR+VAD inference pipeline that matches with what's in RIVA, along with some feature-based ASR and classification datasets. 
Signed-off-by: stevehuang52 <[email protected]> Co-authored-by: fayejf <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * rename separator to ctm_grouping_separator and refactor Signed-off-by: Elena Rastorgueva <[email protected]> * Bert interleaved (#5556) * Adding SP and SAR support Bert * Adding Sequence parallel support to Bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Adding Sequence parallel support to Bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Adding SP and SAR support Bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Adding SP and SAR support Bert * Adding SP and SAR support Bert * Adding Sequence parallel support to Bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Adding Sequence parallel support to Bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Adding Sequence parallel support to Bert * Update bert_model.py Signed-off-by: Shanmugam Ramasamy <[email protected]> * Adding tests * Adding interleaved pipeline parallelism * Adding interleaved pipeline parallelism * Adding interleaved pipeline parallelism * Adding interleaved pipeline parallelism * Adding interleaved pipeline parallelism * Adding interleaved pipeline parallelism * Adding interleaved pipeline parallelism * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Addressing Eric's comments * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Addressing Eric's comments * Fix bug fix sequence parallel and Interleaved * Fix bug fix sequence parallel and Interleaved Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Add duration padding support for RADTTS inference (#5650) * Added duration padding support for RADTTS inference * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: Kevin Shih <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Elena Rastorgueva <[email protected]> * Add remove_blank_tokens_from_ctm parameter Signed-off-by: Elena Rastorgueva <[email protected]> * Dont save initial_silence line in CTM Signed-off-by: Elena Rastorgueva <[email protected]> * Add DLLogger support to exp_manager (#5658) * Add DLLogger support to exp_manager Signed-off-by: Alexandre Milesi <[email protected]> * Move dllogger to separate file and check import Signed-off-by: Alexandre Milesi <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused import Signed-off-by: Alexandre Milesi <[email protected]> Signed-off-by: Alexandre Milesi <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * add minimum_timestamp_duration parameter Signed-off-by: Elena Rastorgueva <[email protected]> * add suggestion about removing blanks to README Signed-off-by: Elena Rastorgueva <[email protected]> * reorder args Signed-off-by: Elena Rastorgueva <[email protected]> * clarify description of ctm_grouping_separator in README Signed-off-by: Elena Rastorgueva <[email protected]> * update docstring Signed-off-by: Elena Rastorgueva <[email protected]> * [TTS][ZH] bugfix for ngc cli installation. 
(#5652) (#5664) Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Port stateless timer to exp manager (#5584) * Port stateless timer to exp manager Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixes Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixes and remove from all megatron code Signed-off-by: MaximumEntropy <[email protected]> * Fixes Signed-off-by: MaximumEntropy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change message Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Elena Rastorgueva <[email protected]> * Fix EMA restart by allowing device to be set by the class init (#5668) Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Remove SDP (moved to separate repo) - merge to main (#5630) * Remove sdp files from tools folder Signed-off-by: Elena Rastorgueva <[email protected]> * Add page to docs with new SDP location Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Add interface for making amax reduction optional for FP8 (#5447) * add TE interface for making amax reduction optional Signed-off-by: Kirthi Shankar Sivamani <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Kirthi Shankar Sivamani <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Eric 
Harper <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * [TTS] add tts dict cust notebook (#5662) * add tts dict cust notebook Signed-off-by: ekmb <[email protected]> * review Signed-off-by: ekmb <[email protected]> * fixed audio links Signed-off-by: ekmb <[email protected]> * remove old notebook Signed-off-by: ekmb <[email protected]> * fix typo Signed-off-by: ekmb <[email protected]> Signed-off-by: ekmb <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * [ASR] Audio processing base, multi-channel enhancement models (#5356) * Audio processing base model, enc-mask-dec enhancement, tests and modules Signed-off-by: Ante Jukić <[email protected]> * Addressed review comments Signed-off-by: Ante Jukić <[email protected]> * Fixed CodeQL warnings Signed-off-by: Ante Jukić <[email protected]> * Addressed PR comments Signed-off-by: Ante Jukić <[email protected]> * Addressed PR comments: - renamed AudioProcessingModel to AudioToAudioModel - various small modifications - updated unit tests Signed-off-by: Ante Jukić <[email protected]> * Addressed comments - Moved spectrogram to audio_preprocessing - Renamed MultichannelFeatures - Updated config and unit tests Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Expose ClusteringDiarizer device (#5681) * Expose device for users to set Signed-off-by: SeanNaren <[email protected]> * Expose device for users to set Signed-off-by: SeanNaren <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Add Beam Search support to ASR transcribe() (#5443) * Add support for beam decoding via high level API. 
Signed-off-by: smajumdar <[email protected]> * Add ctc decoding section Signed-off-by: smajumdar <[email protected]> * Update ctc transcribe API to return results from beam search Signed-off-by: smajumdar <[email protected]> * Add argument to preserve arpa file Signed-off-by: smajumdar <[email protected]> * Update script to use hydra config, add some support for future compute timesteps, add doc for ctc decoding Signed-off-by: smajumdar <[email protected]> * Update eval script and doc to use new API Signed-off-by: smajumdar <[email protected]> * Add tests for ctc greedy decoding Signed-off-by: smajumdar <[email protected]> * Address reviewer comments and add docstrings Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix changes and address comments Signed-off-by: smajumdar <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Elena Rastorgueva <[email protected]> * Propagate attention_dropout flag for GPT-3 (#5669) * Propagate attention_dropout flag for GPT-3 Signed-off-by: Mikołaj Błaż <[email protected]> * Add default to megatron_gpt_config Signed-off-by: Mikołaj Błaż <[email protected]> Signed-off-by: Mikołaj Błaż <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Eric Harper <complex451@gmail…