Commit

wip
ks6088ts committed Oct 8, 2024
1 parent dbd2247 commit ea681ca
Showing 4 changed files with 113 additions and 7 deletions.
3 changes: 3 additions & 0 deletions apps/16_whisper_transcription/README.md
@@ -0,0 +1,3 @@
# References

- [openai/whisper](https://github.com/openai/whisper)
21 changes: 21 additions & 0 deletions apps/16_whisper_transcription/main.py
@@ -0,0 +1,21 @@
import whisper

model = whisper.load_model("turbo")

# load audio and pad/trim it to fit 30 seconds
audio = whisper.load_audio("apps/16_whisper_transcription/sample_audio.wav")
audio = whisper.pad_or_trim(audio)

# make log-Mel spectrogram with the model's mel-bin count ("turbo" expects 128, not the default 80) and move to the same device as the model
mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)

# detect the spoken language
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")

# decode the audio
options = whisper.DecodingOptions()
result = whisper.decode(model, mel, options)

# print the recognized text
print(result.text)
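
Note on the script above: `whisper.pad_or_trim` fixes the input at 30 seconds, and the language is picked with `max(probs, key=probs.get)`. The sketch below illustrates both steps in plain Python, assuming Whisper's 16 kHz sample rate. The helper names `pad_or_trim_list` and `most_likely_language`, the clip lengths, and the probability values are all hypothetical stand-ins for illustration. This is not the openai-whisper implementation.

```python
# Illustrative sketch only: plain-Python stand-ins, not the openai-whisper code.
SAMPLE_RATE = 16_000                      # Whisper works on 16 kHz audio
CHUNK_SECONDS = 30                        # window length used by main.py
N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS   # samples in one 30-second window


def pad_or_trim_list(samples, length=N_SAMPLES):
    """Return exactly `length` samples: cut the tail or append zeros."""
    if len(samples) >= length:
        return list(samples[:length])
    return list(samples) + [0.0] * (length - len(samples))


def most_likely_language(probs):
    """Pick the language code with the highest probability."""
    return max(probs, key=probs.get)


if __name__ == "__main__":
    short_clip = [0.1] * (SAMPLE_RATE * 5)    # hypothetical 5-second clip
    long_clip = [0.1] * (SAMPLE_RATE * 45)    # hypothetical 45-second clip
    # Both clips end up the same length; the long one loses its last 15 seconds.
    print(len(pad_or_trim_list(short_clip)), len(pad_or_trim_list(long_clip)))
    # Made-up probabilities, shaped like the dict returned by detect_language.
    print(most_likely_language({"en": 0.91, "ja": 0.07, "de": 0.02}))
```

One consequence worth keeping in mind: anything past the first 30 seconds of `sample_audio.wav` is dropped before decoding. For longer recordings, `model.transcribe(...)` in openai-whisper is likely the better fit, since it processes the file in successive windows.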
95 changes: 88 additions & 7 deletions poetry.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions pyproject.toml
@@ -38,6 +38,7 @@ lxml = "^5.3.0"
nest-asyncio = "^1.6.0"
typer = "^0.12.5"
azure-cognitiveservices-speech = "^1.40.0"
openai-whisper = "^20240930"

[tool.poetry.group.dev.dependencies]
pre-commit = "^4.0.0"