deep-griffinlim-iteration

PyTorch implementation of the Deep Griffin-Lim Iteration paper.

Usage

All configurations are in hparams.py.
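As a rough illustration of a dataclass-based configuration (the reason Python 3.7 or later is required), a hyperparameter file could look like the sketch below. Only the names depth and out_all_block come from this README, and placing out_all_block in the config is an assumption. The class name HParams, the fields n_fft and hop_length, and all default values are hypothetical and may not match the real hparams.py.

```python
from dataclasses import dataclass, asdict


@dataclass
class HParams:
    # Names mentioned in this README (default values are assumptions).
    depth: int = 1               # number of separate DNNs, one per iteration
    out_all_block: bool = True   # return the output of every repeat iteration

    # Hypothetical fields, shown only to illustrate the pattern.
    n_fft: int = 512
    hop_length: int = 256


# Override a single value and keep the other defaults.
hp = HParams(depth=2)

# A dataclass converts to a plain dict, which is handy for logging.
hp_dict = asdict(hp)
```

Keeping every setting in one dataclass gives typed defaults in a single place and makes an experiment easy to reproduce from the logged dict.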

create.py saves the STFT of the speech data.

To train the DNN, run python main.py --train.

To test the DNN, run python main.py --test.

Model Change

Unlike the paper, the DNN model contains BatchNorm layers and Conv layers with larger (7x7) kernel size.

If the hyperparameter depth is greater than 1, the model performs the deep-griffinlim iteration depth times and uses a separate DNN for each iteration.

If the repeat argument of the forward method is greater than 1, the model repeats the depth iterations repeat times by reusing the DNN models.

(To use the same single DNN for all iterations, set depth=1 and repeat>1.)
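The interaction of depth and repeat can be sketched as a schedule of which DNN is applied at each iteration. This is a minimal illustration of the description above, not the model code. The function name degli_schedule is hypothetical.

```python
def degli_schedule(depth, repeat):
    """Return the index of the DNN applied at each iteration, in order.

    depth  : number of separate DNNs (one per iteration within a pass)
    repeat : how many times the whole pass of `depth` iterations is run,
             reusing the same DNNs each time
    """
    if depth < 1 or repeat < 1:
        raise ValueError("depth and repeat must both be at least 1")
    return [i_dnn for _ in range(repeat) for i_dnn in range(depth)]


# Two separate DNNs, whole pass run three times: 6 iterations in total.
two_dnns = degli_schedule(depth=2, repeat=3)    # [0, 1, 0, 1, 0, 1]

# One shared DNN for every iteration.
shared_dnn = degli_schedule(depth=1, repeat=3)  # [0, 0, 0]
```

The total number of deep-griffinlim iterations is depth * repeat, while the number of trainable DNNs is only depth.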

Setting out_all_block to True makes the forward method return the outputs of all repeat iterations. In that case, the loss function is defined over the output of every iteration rather than only the final one.

Inverse Short-time Fourier Transform (iSTFT)

The iSTFT implementation using PyTorch is in model/istft.py.

The function signature convention is the same as torch.stft. The implementation is based on librosa.istft.

The test code under if __name__ == '__main__' checks that the result of this implementation matches that of librosa.istft.

The file istft.py doesn't have any dependency on the other files in this repository, and only depends on PyTorch.
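For readers unfamiliar with the synthesis step of an iSTFT, the sketch below shows windowed overlap-add with squared-window normalization, which is roughly the scheme librosa.istft follows. It is not the code in model/istft.py. It is written in plain Python, skips the inverse FFT by working directly on time-domain frames, and applies no padding. The function names hann, frame and overlap_add are hypothetical.

```python
import math


def hann(n_fft):
    # Periodic Hann window.
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]


def frame(signal, window, hop):
    # Analysis side: cut the signal into windowed frames (no FFT, no padding).
    n_fft = len(window)
    n_frames = 1 + (len(signal) - n_fft) // hop
    return [[signal[t * hop + n] * window[n] for n in range(n_fft)]
            for t in range(n_frames)]


def overlap_add(frames, window, hop):
    # Synthesis side: window each frame again, add it at its time offset,
    # then divide by the accumulated squared window at every sample.
    n_fft = len(window)
    length = n_fft + hop * (len(frames) - 1)
    out = [0.0] * length
    norm = [0.0] * length
    for t, fr in enumerate(frames):
        for n in range(n_fft):
            out[t * hop + n] += fr[n] * window[n]
            norm[t * hop + n] += window[n] ** 2
    # Leave samples at zero where the window sum is numerically zero.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]


signal = [math.sin(0.05 * n) for n in range(256)]
window = hann(64)
frames = frame(signal, window, hop=16)
restored = overlap_add(frames, window, hop=16)

# Largest reconstruction error away from the signal edges.
max_err = max(abs(a - b) for a, b in zip(signal[16:240], restored[16:240]))
```

Dividing by the summed squared window undoes the two window multiplications (analysis and synthesis), so unmodified frames reconstruct the original samples wherever the windows overlap.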

Requirements

  • python >= 3.7 (because of dataclass)
  • MATLAB engine for Python (because of the PESQ, STOI calculation)
  • PyTorch >= 1.2 (because of the tensorboard support)
  • tensorboard
  • numpy
  • scipy
  • tqdm
  • librosa