Add release v1.0.0
tarepan committed Sep 4, 2023
1 parent fbacc6f commit 272eb8f
Showing 3 changed files with 11 additions and 10 deletions.
17 changes: 9 additions & 8 deletions README.md
@@ -7,9 +7,9 @@
Predict subjective speech score with only 2 lines of code, with various MOS prediction systems.

```python
-predictor = torch.hub.load("tarepan/SpeechMOS:main", "utmos22_strong", trust_repo=True)
-score = predictor(wave, sample_rate)
-# xx, good quality speech!
+predictor = torch.hub.load("tarepan/SpeechMOS:v1.0.0", "utmos22_strong", trust_repo=True)
+score = predictor(wave, sr)
+# tensor([3.7730]), good quality speech!
```

## Demo
@@ -20,9 +20,9 @@ import torch
import librosa

wave, sr = librosa.load("<your_audio>.wav", sr=None, mono=True)
-predictor = torch.hub.load("tarepan/SpeechMOS:main", "utmos22_strong", trust_repo=True)
+predictor = torch.hub.load("tarepan/SpeechMOS:v1.0.0", "utmos22_strong", trust_repo=True)
score = predictor(torch.from_numpy(wave).unsqueeze(0), sr)
-#
+# tensor([3.7730])
```

## How to Use
@@ -31,21 +31,22 @@ SpeechMOS use `torch.hub` built-in model loader, so no needs of library import

First, instantiate a MOS predictor with model specifier string:
```python
-predictor = torch.hub.load("tarepan/SpeechMOS:main", "<model_specifier>", trust_repo=True)
+import torch
+predictor = torch.hub.load("tarepan/SpeechMOS:v1.0.0", "<model_specifier>", trust_repo=True)
```

Then, pass tensor of speeches :: `(Batch, Time)`:
```python
waves_tensor = torch.rand((2, 16000)) # Two speeches, each 1 sec (sr=16,000)
score = predictor(waves_tensor, sr=16000)
-#
+# tensor([2.0321, 2.0943])
```

Returned scores :: `(Batch,)` are each speech's predicted MOS.
If you hope MOS average over speeches (e.g. for TTS model evaluation), just average them:
```python
average_score = score.mean().item()
-#
+# 2.0632
```

## Predictors
2 changes: 1 addition & 1 deletion hubconf.py
@@ -8,7 +8,7 @@


URLS = {
-"utmos22_strong": "https://github.com/tarepan/SpeechMOS/releases/download/v0.0.0/utmos22_strong_step7459.pt",
+"utmos22_strong": "https://github.com/tarepan/SpeechMOS/releases/download/v1.0.0/utmos22_strong_step7459_v1.pt",
}
# [Origin]
# "utmos22_strong" is derived from official sarulab-speech/UTMOS22 'UTMOS strong learner' checkpoint, under MIT license (Copyright 2022 Saruwatari&Koyama laboratory, The University of Tokyo, https://github.com/sarulab-speech/UTMOS22/blob/master/LICENSE).
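
Note on the `URLS` change: both the old and new checkpoint links follow GitHub's usual release-asset layout, `https://github.com/<owner>/<repo>/releases/download/<tag>/<file>`. The sketch below rebuilds the new link from its parts to make the tag dependency explicit. `release_asset_url` is a hypothetical helper written for illustration; it is not part of `hubconf.py`, which stores the full URL as a literal string.

```python
# Hypothetical helper (not in hubconf.py): compose a GitHub release-asset URL
# from the repository, the release tag, and the asset file name.
def release_asset_url(repo: str, tag: str, filename: str) -> str:
    return f"https://github.com/{repo}/releases/download/{tag}/{filename}"


# Values copied from the new "utmos22_strong" entry in this commit.
url = release_asset_url(
    "tarepan/SpeechMOS",
    "v1.0.0",
    "utmos22_strong_step7459_v1.pt",
)
print(url)
```

Because the tag is part of the path, the checkpoint link only resolves once a `v1.0.0` release carrying that asset exists.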
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "speechmos"
-version = "0.0.1"
+version = "1.0.0"
description = "Easy-to-Use Speech MOS predictors 🎧"
authors = ["tarepan"]
readme = "README.md"
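
Note on the README hunks: every `torch.hub.load` call moves from `tarepan/SpeechMOS:main` to `tarepan/SpeechMOS:v1.0.0`. `torch.hub.load` accepts a `repo_owner/repo_name[:ref]` string, where the optional ref is a branch or tag, so the new form points at the release instead of a moving branch. A minimal sketch of checking that a specifier is pinned this way follows. `parse_hub_spec`, `is_pinned_to_release`, and the `vMAJOR.MINOR.PATCH` tag pattern are assumptions made for illustration; they are not part of SpeechMOS or PyTorch, and a matching name does not by itself prove the ref is a tag.

```python
import re
from typing import NamedTuple, Optional


class HubSpec(NamedTuple):
    owner: str
    repo: str
    ref: Optional[str]  # branch or tag name, None when omitted


def parse_hub_spec(spec: str) -> HubSpec:
    """Split 'owner/repo[:ref]' into its parts (hypothetical helper)."""
    repo_part, _, ref = spec.partition(":")
    owner, sep, repo = repo_part.partition("/")
    if not sep or not owner or not repo:
        raise ValueError(f"expected 'owner/repo[:ref]', got {spec!r}")
    return HubSpec(owner, repo, ref or None)


# Assumed convention: release tags look like v1.0.0.
VERSION_TAG = re.compile(r"v\d+\.\d+\.\d+")


def is_pinned_to_release(spec: str) -> bool:
    """True when the ref looks like a version tag rather than a branch."""
    ref = parse_hub_spec(spec).ref
    return ref is not None and VERSION_TAG.fullmatch(ref) is not None


old_spec = "tarepan/SpeechMOS:main"    # specifier before this commit
new_spec = "tarepan/SpeechMOS:v1.0.0"  # specifier after this commit

print(is_pinned_to_release(old_spec))  # False
print(is_pinned_to_release(new_spec))  # True
```

A check like this could run over README snippets in CI so that examples do not drift back to a branch ref after a release.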