OpenWillis v2.2
Release date: Wednesday August 14th, 2024
Version 2.2 introduces new capabilities to improve speaker labeling during speech transcription. It also introduces new features for preprocessing videos with multiple faces to better support downstream measurement of facial and emotional expressivity.
If you have feedback or questions, please share them in the Discussions tab.
Contributors
GeorgeEfstathiadis
vjbytes102
Kcmcveigh
maworthington
WillisDiarize v1.0
A new function was added for correcting speaker labeling errors after speech transcription. It takes the JSON file of a transcript as input, passes it through an ensemble LLM, and outputs the corrected JSON file.
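The shape of the work can be sketched with a toy stand-in. The real function routes the transcript through an ensemble LLM; the snippet below only illustrates the contract described above (JSON transcript in, transcript with corrected speaker labels out), and the field names and correction logic are assumptions, not the library's actual schema.

```python
import json

def correct_speaker_labels(transcript, corrections):
    """Toy stand-in for the diarization-correction step: apply
    {segment_index: speaker} fixes to a transcript dict and return a
    corrected copy, leaving the original untouched."""
    fixed = json.loads(json.dumps(transcript))  # cheap deep copy via JSON
    for idx, speaker in corrections.items():
        fixed["segments"][idx]["speaker"] = speaker
    return fixed

transcript = {
    "segments": [
        {"start": 0.0, "end": 2.1, "speaker": "clinician",
         "text": "How are you feeling?"},
        {"start": 2.3, "end": 4.0, "speaker": "clinician",
         "text": "A bit tired today."},
    ]
}
# The second segment was misattributed to the clinician; relabel it.
fixed = correct_speaker_labels(transcript, {1: "participant"})
```

The input file is left unmodified, mirroring the function's behavior of writing a separate corrected JSON output.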
WillisDiarize AWS v1.0
WillisDiarize AWS performs the same task as the previous function, but it is best suited for users operating within their own EC2 instance. This function assumes the user has already deployed the WillisDiarize model as a SageMaker endpoint (see Getting Started page for instructions).
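For orientation, a self-hosted SageMaker endpoint is typically invoked via boto3's SageMaker Runtime client. The endpoint name below is hypothetical, and the request-building helper is an illustrative sketch rather than the function's internal implementation; the live call is shown commented out because it requires AWS credentials and a deployed endpoint.

```python
import json

# Hypothetical endpoint name; assumes the WillisDiarize model was deployed
# as a SageMaker endpoint per the Getting Started instructions.
ENDPOINT_NAME = "willisdiarize-v1"

def build_request(transcript_json):
    """Serialize a transcript dict into a JSON request body for the endpoint."""
    return json.dumps(transcript_json).encode("utf-8")

# The actual invocation (requires AWS credentials and a live endpoint):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     EndpointName=ENDPOINT_NAME,
#     ContentType="application/json",
#     Body=build_request(transcript),
# )
# corrected = json.loads(response["Body"].read())
```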
Speech transcription with AWS v1.2 / Speech transcription with Whisper v1.2
Updated to add the option of applying the WillisDiarize functions to correct speaker labeling errors before the JSON output file is created.
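The updated flow can be sketched as: transcribe, optionally pass the result through a correction step, then emit the JSON. The function and parameter names below are illustrative, not the library's exact API.

```python
def transcribe_with_correction(audio_path, transcribe, correct=None):
    """Run transcription, then optionally apply a diarization-correction
    callable (the new v1.2 option) before the output is returned/written."""
    transcript = transcribe(audio_path)
    if correct is not None:  # optional WillisDiarize pass
        transcript = correct(transcript)
    return transcript

# usage with toy callables standing in for the real steps
out = transcribe_with_correction(
    "visit.wav",
    transcribe=lambda path: {"segments": [], "source": path},
    correct=lambda t: {**t, "corrected": True},
)
```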
Speech characteristics v3.1
Added an option parameter so that sets of speech coherence variables are computed only when desired, avoiding unnecessary computational burden.
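The gating pattern looks roughly like the sketch below. The parameter values and the cheap stand-in computation are assumptions for illustration; the real coherence variables are far more involved.

```python
def speech_variables(words, option="simple"):
    """Always compute lightweight counts; run the expensive coherence
    set only when the option parameter requests it."""
    results = {"word_count": len(words)}
    if option == "coherence":
        # Stand-in for the heavy coherence computation: here, just the
        # fraction of adjacent word pairs that are repeats.
        pairs = list(zip(words, words[1:]))
        repeats = sum(1 for a, b in pairs if a == b)
        results["adjacent_repeat_ratio"] = repeats / len(pairs) if pairs else 0.0
    return results
```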
Vocal acoustics v2.1
Updated to include the option to calculate framewise summary statistics only for voiced segments longer than 100ms.
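The filtering rule itself is simple: drop any voiced segment at or under the 100 ms threshold before computing framewise summary statistics. The segment representation below is illustrative, not the library's internal format.

```python
MIN_VOICED_MS = 100  # threshold from the release note

def voiced_segments_for_stats(segments, min_ms=MIN_VOICED_MS):
    """Keep only voiced segments longer than min_ms milliseconds.
    segments: list of (start_ms, end_ms, is_voiced) tuples (illustrative)."""
    return [
        (start, end)
        for start, end, voiced in segments
        if voiced and (end - start) > min_ms
    ]

segments = [(0, 80, True), (120, 400, True), (450, 600, False)]
# Only the 280 ms voiced segment survives: the first is too short,
# the third is unvoiced.
```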
Video preprocessing for faces v1.0
This function adds preprocessing capabilities for video files containing the faces of more than one individual. For contexts such as video calls and recordings of clinic visits, it detects unique faces; the output can be used to apply the facial_expressivity and emotional_expressivity functions to a single face in the video.
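The core idea, assigning each detection a persistent face identity so downstream measures run on one face, can be sketched with a toy tracker that clusters face centroids across frames by distance. The real function relies on proper face detection and recognition; everything below is an illustrative assumption.

```python
def assign_face_ids(detections, max_dist=50.0):
    """Label each (frame, x, y) face centroid with a persistent face id,
    matching to the nearest previously seen face within max_dist pixels."""
    identities = []  # last-seen centroid per face id
    labeled = []
    for frame, x, y in detections:
        best, best_d = None, max_dist
        for fid, (px, py) in enumerate(identities):
            d = ((x - px) ** 2 + (y - py) ** 2) ** 0.5
            if d < best_d:
                best, best_d = fid, d
        if best is None:
            best = len(identities)       # unseen face: new identity
            identities.append((x, y))
        else:
            identities[best] = (x, y)    # known face: update position
        labeled.append((frame, best))
    return labeled

# Two faces tracked across two frames.
detections = [(0, 10, 10), (0, 200, 200), (1, 12, 11), (1, 198, 203)]
labeled = assign_face_ids(detections)
```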
Video cropping v1.0
This function, designed to be used in conjunction with preprocess_face_video, allows the user to adjust parameters related to cropping and trimming videos to extract frames for each unique face.
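One such cropping knob is padding around a detected face box. The helper below is a minimal sketch of that idea; the parameter names are illustrative of the kind of options exposed, not the function's exact API.

```python
def crop_box(bbox, frame_w, frame_h, pad=0.2):
    """Expand a face bounding box (x1, y1, x2, y2) by `pad` (as a fraction
    of its width/height) on each side, clamped to the frame bounds."""
    x1, y1, x2, y2 = bbox
    dx, dy = (x2 - x1) * pad, (y2 - y1) * pad
    return (
        max(0, int(x1 - dx)),
        max(0, int(y1 - dy)),
        min(frame_w, int(x2 + dx)),
        min(frame_h, int(y2 + dy)),
    )
```

Clamping matters for faces near the frame edge, where naive padding would produce negative or out-of-range coordinates.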
General updates
Updated Pyannote from 3.0.0 to 3.1.1 to match WhisperX dependencies.