A PyTorch implementation of automatic speech recognition, inspired by the papers Listen, Attend and Spell and Attention Is All You Need.
- Trained on LibriSpeech
- Encoder-decoder architecture with attention
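The attention linking decoder to encoder is plain scaled dot-product attention. A minimal sketch (function name and shapes are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def dot_product_attention(queries, keys, values):
    """Scaled dot-product attention: queries attend over keys/values.

    queries: (batch, q_len, d); keys, values: (batch, k_len, d)
    """
    d = queries.size(-1)
    scores = queries @ keys.transpose(-2, -1) / d ** 0.5  # (batch, q_len, k_len)
    weights = F.softmax(scores, dim=-1)                   # rows sum to 1
    return weights @ values                               # (batch, q_len, d)

q = torch.randn(2, 5, 64)     # e.g. decoder states
kv = torch.randn(2, 100, 64)  # e.g. encoder outputs
out = dot_product_attention(q, kv, kv)
print(out.shape)  # torch.Size([2, 5, 64])
```

In the decoder below, the queries come from decoder states and the keys/values from encoder outputs.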
- Encoders:
  - 2D conv network over the log-mel spectrogram,
  - followed by several GRU layers
  - or followed by several self-attention layers
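A sketch of the conv-plus-GRU encoder variant; layer sizes and strides here are made-up placeholders, not the repo's actual configuration:

```python
import torch
import torch.nn as nn

class ConvGRUEncoder(nn.Module):
    """Illustrative encoder: 2D convs over a log-mel spectrogram, then GRUs."""

    def __init__(self, n_mels=80, hidden=256, gru_layers=3):
        super().__init__()
        # Treat the spectrogram as a 1-channel image: (batch, 1, time, n_mels).
        # Each stride-2 conv halves both the time and mel axes.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        conv_out = 32 * (n_mels // 4)  # channels * downsampled mel bins
        self.gru = nn.GRU(conv_out, hidden, num_layers=gru_layers,
                          batch_first=True, bidirectional=True)

    def forward(self, log_mel):                    # (batch, time, n_mels)
        x = self.conv(log_mel.unsqueeze(1))        # (batch, 32, time/4, n_mels/4)
        b, c, t, f = x.shape
        x = x.permute(0, 2, 1, 3).reshape(b, t, c * f)  # (batch, time', feat)
        out, _ = self.gru(x)                       # (batch, time', 2*hidden)
        return out

enc = ConvGRUEncoder()
feats = torch.randn(4, 200, 80)  # a batch of log-mel spectrograms
print(enc(feats).shape)  # torch.Size([4, 50, 512])
```

The convolutions downsample the time axis 4x before the recurrent layers, which shortens the sequence the decoder must attend over.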
- Decoders:
  - GRU layers with dot-product attention over the encoder outputs
  - Self-attention layers with dot-product attention over the encoder outputs
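A sketch of the GRU decoder variant, teacher-forced one step at a time with dot-product attention over the encoder outputs; the single-layer setup and all dimensions are placeholders, not the repo's config:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnGRUDecoder(nn.Module):
    """Illustrative decoder: GRU with dot-product attention over the encoder."""

    def __init__(self, vocab_size=32, emb=128, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        # Input at each step: previous token embedding + attention context.
        self.gru = nn.GRU(emb + hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, enc_out):  # tokens: (batch, T), enc_out: (batch, S, hidden)
        emb = self.embed(tokens)         # (batch, T, emb)
        b, T, _ = emb.shape
        h = None
        context = enc_out.new_zeros(b, 1, enc_out.size(-1))
        logits = []
        for t in range(T):               # teacher-forced decoding loop
            step_in = torch.cat([emb[:, t:t + 1], context], dim=-1)
            dec_out, h = self.gru(step_in, h)              # (batch, 1, hidden)
            scores = dec_out @ enc_out.transpose(1, 2)     # dot-product attention
            context = F.softmax(scores, dim=-1) @ enc_out  # (batch, 1, hidden)
            logits.append(self.out(dec_out + context))
        return torch.cat(logits, dim=1)  # (batch, T, vocab_size)

dec = AttnGRUDecoder()
enc_out = torch.randn(4, 50, 512)           # encoder outputs
tokens = torch.randint(0, 32, (4, 10))      # shifted target transcripts
print(dec(tokens, enc_out).shape)  # torch.Size([4, 10, 32])
```

The self-attention decoder variant replaces the recurrence with masked self-attention layers but keeps the same dot-product cross-attention into the encoder outputs.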