Add MMI training with word pieces as modelling unit. #6
Conversation
IIRC the alignments from alimdl did not help before when we checked them in snowfall; do you expect different results with the current setup?
We are thinking they might help training get started in cases where that is a problem, e.g., for MMI with BPE.
I just started the MMI training with pre-computed alignments. The TensorBoard logs are:
Without attention decoder
It throws the following warnings at some point (after several hundred batches):
At some other point, it stops printing the above warnings and the MMI loss starts to decrease:
You can see that the pre-computed alignments are helpful in making the training converge.
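For context, one plausible way pre-computed framewise alignments can help MMI training get started is as an auxiliary frame-level loss during the first few epochs. The sketch below is illustrative only; the names total_loss, ali, ali_scale and use_ali_until_epoch are assumptions, not the code in this PR.

import torch
import torch.nn.functional as F


def total_loss(nnet_output: torch.Tensor,
               ali: torch.Tensor,
               mmi_loss: torch.Tensor,
               cur_epoch: int,
               ali_scale: float = 0.5,
               use_ali_until_epoch: int = 2) -> torch.Tensor:
    """Combine the MMI loss with an auxiliary alignment loss.

    Args:
      nnet_output: (N, T, C) log-probabilities from the encoder.
      ali: (N, T) framewise token IDs from the pre-computed alignment;
        -1 marks frames without a label.
      mmi_loss: scalar LF-MMI loss for the batch.
      cur_epoch: current training epoch.
    """
    if cur_epoch >= use_ali_until_epoch:
        # Later epochs rely on the MMI objective alone.
        return mmi_loss
    # Frame-level negative log-likelihood against the alignment.
    ce = F.nll_loss(
        nnet_output.reshape(-1, nnet_output.size(-1)),
        ali.reshape(-1),
        ignore_index=-1,
    )
    return mmi_loss + ali_scale * ce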
The best WER I get for this pull request is:
Training without attention decoder (decoding using whole-lattice rescoring, i.e., HLG 1-best decoding + 4-gram LM rescoring):
Training with attention decoder (decoding using the attention decoder for rescoring):
LF-MMI + attention decoder seems not as good as CTC + attention decoder. Let's merge it first since it contains code for integrating framewise alignment information into training, which can
@@ -0,0 +1,356 @@
# Copyright 2021 Piotr Żelasko
Are there any changes in this file? Was it supposed to be a symlink like the other asr_datamodule.py in conformer_ctc?
It is the same as the one in conformer_ctc and tdnn_lstm. I should have placed a symlink here.
@@ -142,69 +205,66 @@ def tokens(self) -> List[int]:
        return ans


-class BpeLexicon(Lexicon):
+class UniqLexicon(Lexicon):
Why is it named UniqLexicon? Not sure how to interpret it.
Uniq here means each word in the lexicon has only one pronunciation, i.e., a unique pronunciation.
In BPE-based lexicons, each word can be decomposed in a deterministic way.
In phone-based lexicons, if a word has more than one pronunciation, there are scripts to keep only the first one.
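As an illustration of that last point, a minimal sketch of such a "keep only the first pronunciation" step could look like the function below; the function name and the assumed "word phone1 phone2 ..." line format are assumptions, not the actual script in the recipe.

def keep_first_pronunciation(lexicon_in: str, lexicon_out: str) -> None:
    """Copy a lexicon file, keeping only the first pronunciation per word."""
    seen = set()
    with open(lexicon_in) as fin, open(lexicon_out, "w") as fout:
        for line in fin:
            fields = line.split()
            if not fields:
                continue
            word = fields[0]
            if word in seen:
                continue  # drop alternative pronunciations of this word
            seen.add(word)
            fout.write(line)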
        func = _compute_mmi_loss_pruned
    else:
        func = _compute_mmi_loss_exact_non_optimized
        # func = _compute_mmi_loss_exact_optimized
Is this intended to be commented out?
Yes, the non_optimized version is easier to understand and consumes less memory.
Will post the results once they are available.
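For readers who have not seen the MMI code, the sketch below paraphrases what the exact and pruned branches roughly compute with k2; the graph arguments, beam values and den_scale are assumptions, not the exact code in this PR.

import k2
import torch


def mmi_loss(num_graphs: k2.Fsa,
             den_graphs: k2.Fsa,
             dense_fsa_vec: k2.DenseFsaVec,
             use_pruned: bool = False,
             den_scale: float = 1.0) -> torch.Tensor:
    # Numerator lattices: intersect the per-utterance numerator graphs
    # with the network output.
    num_lats = k2.intersect_dense(num_graphs, dense_fsa_vec, output_beam=10.0)

    if use_pruned:
        # Pruned intersection for the denominator: cheaper, approximate.
        den_lats = k2.intersect_dense_pruned(
            den_graphs,
            dense_fsa_vec,
            search_beam=20.0,
            output_beam=8.0,
            min_active_states=30,
            max_active_states=10000,
        )
    else:
        # Exact (non-optimized) intersection: simpler to follow, and
        # reported in this thread to use less memory than the
        # "optimized" exact variant.
        den_lats = k2.intersect_dense(den_graphs, dense_fsa_vec,
                                      output_beam=10.0)

    num_tot = num_lats.get_tot_scores(log_semiring=True,
                                      use_double_scores=True)
    den_tot = den_lats.get_tot_scores(log_semiring=True,
                                      use_double_scores=True)
    # LF-MMI objective: maximize numerator minus (scaled) denominator.
    return -(num_tot - den_scale * den_tot).sum()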