For a story that starts with a classifier, see data/samples/audio_story_classifier_start.json
For a story that starts with an audio file, see data/samples/audio_story_audio_start.json
Story that starts with a classifier:
{
"nodes": {
"0": {"type": "classifier", "labels": {"indoors": 1, "outdoors": 2}, "thresholds": {"indoors": 0.5,"outdoors":0.5}},
"1": {"type": "audio", "next": 3, "audio_file": "audio/indoors.mp3"},
"2": {"type": "audio", "next": 3, "audio_file": "audio/outdoors.mp3"},
"3": {"type": "audio", "next": null, "audio_file": "audio/changes.mp3"}
}
}
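To make the node format concrete, here is a minimal sketch of how a player might walk the story above. It is an illustration, not the project's actual implementation. The function names `walk_story` and `fake_classify` are hypothetical. Starting at node "0", treating a threshold as a minimum score, and stopping when no label passes are all assumptions.

```python
# Hypothetical sketch: walk a story graph and collect the audio files played.
STORY = {
    "nodes": {
        "0": {"type": "classifier",
              "labels": {"indoors": 1, "outdoors": 2},
              "thresholds": {"indoors": 0.5, "outdoors": 0.5}},
        "1": {"type": "audio", "next": 3, "audio_file": "audio/indoors.mp3"},
        "2": {"type": "audio", "next": 3, "audio_file": "audio/outdoors.mp3"},
        "3": {"type": "audio", "next": None, "audio_file": "audio/changes.mp3"},
    }
}


def walk_story(story, classify, start="0", max_steps=100):
    """Return the audio files visited, in order.

    `classify` takes a list of label names and returns {label: score}.
    """
    nodes = story["nodes"]
    played = []
    node_id = start  # assumption: stories begin at node "0"
    for _ in range(max_steps):  # cap the walk in case the graph has a cycle
        if node_id is None:  # a null "next" ends the story
            break
        node = nodes[str(node_id)]  # node keys are strings, targets are ints
        if node["type"] == "audio":
            played.append(node["audio_file"])
            node_id = node["next"]
        elif node["type"] == "classifier":
            scores = classify(list(node["labels"]))
            # Assumption: a label qualifies when its score meets its threshold.
            passing = {label: score for label, score in scores.items()
                       if score >= node["thresholds"][label]}
            if not passing:
                break  # assumption: stop when no label is confident enough
            best = max(passing, key=passing.get)
            node_id = node["labels"][best]
        else:
            raise ValueError(f"unknown node type: {node['type']}")
    return played


def fake_classify(labels):
    """Stand-in for the CLIP model: returns fixed scores for the labels."""
    fixed = {"indoors": 0.8, "outdoors": 0.1}
    return {label: fixed.get(label, 0.0) for label in labels}


print(walk_story(STORY, fake_classify))
# → ['audio/indoors.mp3', 'audio/changes.mp3']
```

In the real application the scores would come from the CLIP model behind the Flask server rather than from a fixed dictionary.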
Story that starts with an audio file:
{
"nodes": {
"0": {"type": "audio", "next": 1, "audio_file": "audio/changes.mp3"},
"1": {"type": "classifier", "labels": {"indoors": 2, "outdoors": 3}, "thresholds": {"indoors": 0.5,"outdoors":0.5}},
"2": {"type": "audio", "next": 4, "audio_file": "audio/indoors.mp3"},
"3": {"type": "audio", "next": 4, "audio_file": "audio/outdoors.mp3"},
"4": {"type": "audio", "next": null, "audio_file": "audio/changes.mp3"}
}
}
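Both samples rely on every "next" value and every "labels" target naming a node that exists. The check below is a small sketch of how a hand-written story could be validated before use. The helper name `find_dangling_targets` is hypothetical and not part of the server code.

```python
import json

# The audio-first sample from above, as a JSON string.
STORY_JSON = """
{
  "nodes": {
    "0": {"type": "audio", "next": 1, "audio_file": "audio/changes.mp3"},
    "1": {"type": "classifier", "labels": {"indoors": 2, "outdoors": 3}, "thresholds": {"indoors": 0.5, "outdoors": 0.5}},
    "2": {"type": "audio", "next": 4, "audio_file": "audio/indoors.mp3"},
    "3": {"type": "audio", "next": 4, "audio_file": "audio/outdoors.mp3"},
    "4": {"type": "audio", "next": null, "audio_file": "audio/changes.mp3"}
  }
}
"""


def find_dangling_targets(story):
    """Return (node_id, target) pairs whose target is not a node in the story."""
    nodes = story["nodes"]
    dangling = []
    for node_id, node in nodes.items():
        if node["type"] == "classifier":
            targets = list(node["labels"].values())
        else:
            targets = [node["next"]]
        for target in targets:
            # null (None) marks the end of the story; node keys are strings.
            if target is not None and str(target) not in nodes:
                dangling.append((node_id, target))
    return dangling


story = json.loads(STORY_JSON)  # raises json.JSONDecodeError on malformed JSON
print(find_dangling_targets(story))
# → []
```

Because `json.loads` is strict, it also catches syntax slips such as a missing comma between nodes.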
A Flask server provides access to the CLIP model.
Create a conda environment and install the dependencies:
conda create --name "audio-stories" python=3.8
conda activate audio-stories
pip install -r server/requirements.txt
Here is the command to run the Flask server:
./server/run.sh
This script automatically attempts to set Google TTS credentials, then starts the server.
Run ./server/run.sh --help to see the optional arguments.
There are unit tests for the audio saving/loading functionality and integration tests for the API. Run them like this:
python3 -m pytest server/test
server/index.py: code to run the server, very bare-bones
server/create_app.py: sets up the app, includes API endpoints
server/model_utils.py: code for using the CLIP model
server/save_audio_files.py: code for storing and loading audio stories on the server
server/speech_generator.py: code for doing text-to-speech
server/test: tests