joeys2t documentation
may- committed Jan 21, 2024
1 parent 85c4daf commit 3a6f59a
Showing 3 changed files with 165 additions and 15 deletions.
65 changes: 50 additions & 15 deletions docs/source/api.rst
@@ -52,6 +52,14 @@ joeynmt.config module
    :show-inheritance:


joeynmt.data_augmentation module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. automodule:: joeynmt.data_augmentation
    :members:
    :undoc-members:
    :show-inheritance:


joeynmt.data module
^^^^^^^^^^^^^^^^^^^
.. automodule:: joeynmt.data
@@ -96,6 +104,24 @@ joeynmt.encoders module
    :show-inheritance:


joeynmt.helpers_for_audio module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: joeynmt.helpers_for_audio
    :members:
    :undoc-members:
    :show-inheritance:


joeynmt.helpers_for_ddp module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: joeynmt.helpers_for_ddp
    :members:
    :undoc-members:
    :show-inheritance:


joeynmt.helpers module
^^^^^^^^^^^^^^^^^^^^^^

@@ -105,6 +131,15 @@ joeynmt.helpers module
    :show-inheritance:


joeynmt.hub_interface module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: joeynmt.hub_interface
    :members:
    :undoc-members:
    :show-inheritance:


joeynmt.initialization module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -114,6 +149,15 @@ joeynmt.initialization module
    :show-inheritance:


joeynmt.loss module
^^^^^^^^^^^^^^^^^^^

.. automodule:: joeynmt.loss
    :members:
    :undoc-members:
    :show-inheritance:


joeynmt.metrics module
^^^^^^^^^^^^^^^^^^^^^^

@@ -176,28 +220,19 @@ joeynmt.training module
    :show-inheritance:


joeynmt.vocabulary module
^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: joeynmt.vocabulary
    :members:
    :undoc-members:
    :show-inheritance:


joeynmt.loss module
^^^^^^^^^^^^^^^^^^^
joeynmt.transformer_layers module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: joeynmt.loss
.. automodule:: joeynmt.transformer_layers
    :members:
    :undoc-members:
    :show-inheritance:


joeynmt.transformer_layers module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
joeynmt.vocabulary module
^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: joeynmt.transformer_layers
.. automodule:: joeynmt.vocabulary
    :members:
    :undoc-members:
    :show-inheritance:
110 changes: 110 additions & 0 deletions docs/source/benchmarks.rst
@@ -8,6 +8,116 @@ Benchmarks
We provide several pretrained models with their benchmark results.


JoeyS2T
-------


* For the ASR task, we report word error rate (WER; lower is better).
* For the MT and ST tasks, we report BLEU (higher is better).
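
As a rough illustration of the WER metric named above, the following is a minimal sketch that computes the word-level edit distance between one reference and one hypothesis and divides it by the reference length. The function name ``word_error_rate`` is hypothetical and whitespace tokenization is an assumption; this is not necessarily how JoeyS2T computes the reported scores.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent for a single reference/hypothesis pair.

    Illustrative sketch only: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A corpus-level score is usually obtained by summing the edit distances over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance values.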


LibriSpeech 100h
^^^^^^^^^^^^^^^^

+------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| System | Architecture | dev-clean | dev-other | test-clean | test-other | #params | download |
+========================================================================+==============+===========+===========+============+============+=========+===========================================+
| `Kahn et al. <https://arxiv.org/abs/1909.09116>`_ | BiLSTM | 14.00 | 37.02 | 14.85 | 39.95 | \- | |
+------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| `Laptev et al. <https://arxiv.org/abs/2005.07157>`_ | Transformer | 10.3 | 24.0 | 11.2 | 24.9 | \- | |
+------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| `ESPnet <https://huggingface.co/pyf98/librispeech_100h_transformer>`__ | Transformer | 8.1 | 20.2 | 8.4 | 20.5 | \- | |
+------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| `ESPnet <https://huggingface.co/pyf98/librispeech_100h_conformer>`__ | Conformer | 6.3 | 17.4 | 6.5 | 17.3 | \- | |
+------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| JoeyS2T | Transformer | 10.18 | 23.39 | 11.58 | 24.31 | 93M | :joeynmt2:`librispeech100h.tar.gz` (948M) |
+------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+


LibriSpeech 960h
^^^^^^^^^^^^^^^^

+-----------------------------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| System | Architecture | dev-clean | dev-other | test-clean | test-other | #params | download |
+===============================================================================================+==============+===========+===========+============+============+=========+===========================================+
| `Gulati et al. <https://arxiv.org/abs/2005.08100>`_ | Conformer | 1.9 | 4.4 | 2.1 | 4.9 | \- | \- |
+-----------------------------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| `ESPnet <https://github.com/espnet/espnet/tree/v.202207/egs2/librispeech/asr1#without-lm>`__ | Conformer | 2.3 | 6.1 | 2.6 | 6.0 | \- | \- |
+-----------------------------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| `SpeechBrain <https://huggingface.co/speechbrain/asr-transformer-transformerlm-librispeech>`_ | Conformer | 2.13 | 5.51 | 2.31 | 5.61 | 165M | \- |
+-----------------------------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| `fairseq S2T <https://huggingface.co/facebook/s2t-small-librispeech-asr>`_ | Transformer | 3.23 | 8.01 | 3.52 | 7.83 | 71M | \- |
+-----------------------------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| `fairseq wav2vec2 <https://huggingface.co/facebook/wav2vec2-base-960h>`_ | Conformer | 3.17 | 8.87 | 3.39 | 8.57 | 94M | \- |
+-----------------------------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+
| JoeyS2T | Transformer | 10.18 | 23.39 | 11.58 | 24.31 | 102M | :joeynmt2:`librispeech960h.tar.gz` (1.1G) |
+-----------------------------------------------------------------------------------------------+--------------+-----------+-----------+------------+------------+---------+-------------------------------------------+


MuST-C ASR pretraining (WER)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| System | train | eval | dev | tst-COMMON | tst-HE | #params | download |
+========================================================================================+=======+=======+=======+============+========+=========+=====================================+
| `Gangi et al. <https://cris.fbk.eu/retrieve/handle/11582/319654/29817/3045.pdf>`_ | v1 | v1 | \- | 27.0 | \- | \- | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| `ESPnet <https://github.com/espnet/espnet/tree/v.202207/egs/must_c/asr1/RESULTS.md>`__ | v1 | v1 | \- | 12.70 | \- | \- | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| :fairseq:`fairseq S2T <speech_to_text/docs/mustc_example.md>` | v1 | v1 | 13.07 | 12.72 | 10.93 | 29.5M | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| :fairseq:`fairseq S2T <speech_to_text/docs/mustc_example.md>` | v1 | v2 | 9.11 | 11.88 | 10.43 | 29.5M | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| JoeyS2T | v2 | v1 | 18.09 | 18.66 | 14.97 | 96M | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| JoeyS2T | v2 | v2 | 9.77 | 12.51 | 10.73 | 96M | :joeynmt2:`mustc_asr.tar.gz` (940M) |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+


MuST-C MT pretraining (BLEU)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| System | train | eval | dev | tst-COMMON | tst-HE | #params | download |
+========================================================================================+=======+=======+=======+============+========+=========+=====================================+
| `Gangi et al. <https://cris.fbk.eu/retrieve/handle/11582/319654/29817/3045.pdf>`_ | v1 | v1 | \- | 25.3 | \- | \- | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| `Zhang et al. <https://aclanthology.org/2020.findings-emnlp.230/>`_ | v1 | v1 | \- | 29.69 | \- | \- | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| `ESPnet <https://github.com/espnet/espnet/tree/v.202207/egs/must_c/asr1/RESULTS.md>`__ | v1 | v1 | \- | 27.63 | \- | \- | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| JoeyS2T | v2 | v1 | 21.85 | 23.15 | 20.37 | 66.5M | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| JoeyS2T | v2 | v2 | 26.99 | 27.61 | 25.26 | 66.5M | :joeynmt2:`mustc_mt.tar.gz` (729M) |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+


MuST-C end-to-end ST (BLEU)
^^^^^^^^^^^^^^^^^^^^^^^^^^^

+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| System | train | eval | dev | tst-COMMON | tst-HE | #params | download |
+========================================================================================+=======+=======+=======+============+========+=========+=====================================+
| `Gangi et al. <https://cris.fbk.eu/retrieve/handle/11582/319654/29817/3045.pdf>`_ | v1 | v1 | \- | 17.3 | \- | \- | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| `Zhang et al. <https://aclanthology.org/2020.findings-emnlp.230/>`_ | v1 | v1 | \- | 20.67 | \- | \- | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| `ESPnet <https://github.com/espnet/espnet/tree/v.202207/egs/must_c/st1/RESULTS.md>`__ | v1 | v1 | \- | 22.91 | \- | \- | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| :fairseq:`fairseq S2T <speech_to_text/docs/mustc_example.md>` | v1 | v2 | 22.05 | 22.70 | 21.70 | 31M | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| JoeyS2T | v2 | v1 | 21.06 | 20.92 | 21.78 | 96M | |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+
| JoeyS2T | v2 | v2 | 24.26 | 23.86 | 23.86 | 96M | :joeynmt2:`mustc_st.tar.gz` (952M) |
+----------------------------------------------------------------------------------------+-------+-------+-------+------------+--------+---------+-------------------------------------+

sacrebleu signature: ``nrefs:1|case:mixed|eff:no|tok:13a|smooth:exp|version:2.1.0``
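
The signature is a ``|``-separated list of ``key:value`` settings. The sketch below splits it into a dictionary so that the individual settings (one reference, case-sensitive scoring, ``13a`` tokenization, ``exp`` smoothing) can be compared against another run; ``parse_signature`` is a hypothetical helper written for illustration, not part of sacrebleu.

```python
from typing import Dict


def parse_signature(signature: str) -> Dict[str, str]:
    """Split a sacrebleu-style signature string into its key/value settings.

    Hypothetical helper for illustration; it only handles the plain
    ``key:value|key:value`` layout shown in this document.
    """
    return dict(field.split(":", 1) for field in signature.split("|"))


settings = parse_signature(
    "nrefs:1|case:mixed|eff:no|tok:13a|smooth:exp|version:2.1.0"
)

if __name__ == "__main__":
    for key, value in settings.items():
        print(f"{key:>8} = {value}")
```

BLEU scores are only comparable when these settings agree, which is why the signature is reported next to the tables.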

.. note::

    For MuST-C, we trained our model on the English-German subset of version 2 and evaluated it on the ``tst-COMMON`` and ``tst-HE`` splits of both version 1 and version 2. See :notebooks:`benchmarks.ipynb` to replicate these results.


JoeyNMT v2.x
------------

5 changes: 5 additions & 0 deletions docs/source/conf.py
@@ -199,6 +199,9 @@
github_repo_main_url = f"{github_url}/{github_repo_slug}/blob/main"
github_repo_issues_url = f"{github_url}/{github_repo_slug}/issues"

download_url = "https://www.cl.uni-heidelberg.de/statnlpgroup/joeynmt2"
fairseq_url = "https://github.com/facebookresearch/fairseq/blob/v0.12.2/examples"

extlinks = {
"joeynmt": (f"{github_repo_main_url}/joeynmt/%s", "%s"),
"scripts": (f"{github_repo_main_url}/scripts/%s", "%s"),
@@ -207,6 +210,8 @@
"issue": (f"{github_repo_issues_url}/%s", "#%s"),
"pr": (f"{github_repo_url}/pull/%s", "PR #%s"),
"commit": (f"{github_repo_url}/commit/%s", "%s"),
"joeynmt2": (f"{download_url}/%s", "%s"),
"fairseq": (f"{fairseq_url}/%s", "%s"),
}
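
To make the effect of the two new ``extlinks`` entries concrete, here is a minimal sketch of how a role such as ``:joeynmt2:`librispeech100h.tar.gz``` resolves to a URL and a caption. The ``expand`` function is a hypothetical stand-in written for illustration; it mimics the ``%s`` substitution only and is not Sphinx's own implementation. It also ignores explicit titles such as ``fairseq S2T <...>``, where the given title is shown instead of the caption pattern.

```python
from typing import Tuple

# Values copied from the conf.py change above.
download_url = "https://www.cl.uni-heidelberg.de/statnlpgroup/joeynmt2"
fairseq_url = "https://github.com/facebookresearch/fairseq/blob/v0.12.2/examples"

extlinks = {
    "joeynmt2": (f"{download_url}/%s", "%s"),
    "fairseq": (f"{fairseq_url}/%s", "%s"),
}


def expand(role: str, target: str) -> Tuple[str, str]:
    """Return (url, caption) for a role target. Hypothetical helper."""
    url_pattern, caption_pattern = extlinks[role]
    # Both patterns contain a single %s placeholder for the role target.
    return url_pattern % target, caption_pattern % target


if __name__ == "__main__":
    print(expand("joeynmt2", "librispeech100h.tar.gz"))
    print(expand("fairseq", "speech_to_text/docs/mustc_example.md"))
```

Keeping the base URLs in one place means that a moved download server or a newer fairseq tag only needs a change in ``conf.py``, not in every table row.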

