HeyGenClone

Welcome to HeyGenClone, an open-source analogue of the HeyGen system.

I am a developer from Moscow 🇷🇺 who devotes his free time to studying new technologies. The project is in an active development phase, but I hope it will help you achieve your goals!

Currently, only English 🇬🇧 is supported as the source language!

Installation 🥸

1. Clone this repo.
2. Install conda.
3. Create the environment and install the requirements:
   ```
   cd path_to_project
   sh install/env.sh
   ```
4. In the config.json file, set the HF_TOKEN argument to your HuggingFace token. Visit the speaker-diarization and segmentation model pages and accept the user conditions.
5. Download the weights from drive and unzip the downloaded file into the weights folder.
6. Install ffmpeg.

Configurations (config.json) 🧙‍♂️

| Key | Description | Can modify |
| --- | --- | --- |
| LANGUAGES_URL | URL for getting the available languages | ❌ |
| DET_TRESH | Face detection threshold [0.0:1.0] | ✅ |
| DIST_TRESH | Face embeddings distance threshold [0.0:1.0] | ✅ |
| DB_NAME | Name of the database for data storage | ✅ |
| HF_TOKEN | Your HuggingFace token (see Installation) | ✅ |
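
Before a long run it can help to sanity-check config.json. The snippet below is a minimal sketch, not part of the project: the key names come from the table above, while `validate_config` and `load_config` are hypothetical helpers and the specific checks are assumptions about what a sane config looks like.

```python
import json
from pathlib import Path

# Key names as listed in the configuration table; the checks are illustrative assumptions.
REQUIRED_KEYS = ("LANGUAGES_URL", "DET_TRESH", "DIST_TRESH", "DB_NAME", "HF_TOKEN")
THRESHOLD_KEYS = ("DET_TRESH", "DIST_TRESH")


def validate_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f"missing key: {key}")
    for key in THRESHOLD_KEYS:
        if key not in config:
            continue
        value = config[key]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not (is_number and 0.0 <= value <= 1.0):
            problems.append(f"{key} must be a number in [0.0, 1.0], got {value!r}")
    if "HF_TOKEN" in config and not str(config["HF_TOKEN"]).strip():
        problems.append("HF_TOKEN is empty")
    return problems


def load_config(path: str = "config.json") -> dict:
    """Read the JSON config from disk (hypothetical helper)."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)
```

Calling `validate_config(load_config())` from the project root would then list anything that looks off, such as a threshold outside the documented range.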

Usage 🤩

  conda activate heygenclone
  cd path_to_project

At the root of the project there is a translate script that translates the video you specify.

video_filename - the filename of your input video (.mp4)
output_language - the code of the language to be translated into (you can find it here)
output_filename - the filename of output video (.mp4)

python translate.py video_filename output_language -o output_filename
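
If you want several target languages for the same video, the command above can be wrapped in a small loop. This is a rough sketch under assumptions: `build_translate_command` and `translate_to_many` are hypothetical helpers, the `<stem>_<language>.mp4` output naming is invented for illustration, and the language codes you pass must be ones the project actually lists as available.

```python
import subprocess
import sys
from pathlib import Path


def build_translate_command(video_filename, output_language, output_filename):
    """Build the argv list for translate.py, mirroring the documented command line."""
    return [sys.executable, "translate.py", video_filename, output_language, "-o", output_filename]


def translate_to_many(video_filename, languages, dry_run=True):
    """Run translate.py once per language; with dry_run=True only return the commands."""
    commands = []
    stem = Path(video_filename).stem
    for lang in languages:
        output = f"{stem}_{lang}.mp4"  # hypothetical naming scheme, not a project convention
        cmd = build_translate_command(video_filename, lang, output)
        commands.append(cmd)
        if not dry_run:
            # check=True stops the loop if one translation fails
            subprocess.run(cmd, check=True)
    return commands
```

With `dry_run=True` (the default here) nothing is executed, so you can inspect the commands first and switch to `dry_run=False` from the project root once they look right.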

I also added a script that overlays a voice on a video with lip sync, which lets you create a video of a person pronouncing your speech. Currently it works only for videos with one person.

voice_filename - the filename of your speech (.wav)
video_filename - the filename of your input video (.mp4)
output_filename - the filename of output video (.mp4)

python speech_changer.py voice_filename video_filename -o output_filename
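
Since the script expects a .wav voice file and .mp4 videos, a quick extension check before launching can save a failed run. The following is a sketch only: `speech_changer_command` is a hypothetical helper, and it checks file extensions, not the actual file contents.

```python
import sys
from pathlib import Path

# Expected extensions, taken from the argument descriptions above.
EXPECTED_SUFFIXES = {
    "voice_filename": ".wav",
    "video_filename": ".mp4",
    "output_filename": ".mp4",
}


def speech_changer_command(voice_filename, video_filename, output_filename):
    """Validate file extensions, then build the argv list for speech_changer.py."""
    given = {
        "voice_filename": voice_filename,
        "video_filename": video_filename,
        "output_filename": output_filename,
    }
    for name, value in given.items():
        expected = EXPECTED_SUFFIXES[name]
        if Path(value).suffix.lower() != expected:
            raise ValueError(f"{name} should end with {expected}, got {value!r}")
    return [sys.executable, "speech_changer.py", voice_filename, video_filename, "-o", output_filename]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the project root.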

How it works 😱

Detecting scenes (PySceneDetect)
Face detection (yolov8-face)
Reidentification (deepface)
Speech enhancement (MDXNet)
Speakers transcriptions and diarization (whisperX)
Text translation (googletrans)
Voice cloning (TTS)
Lip sync (lipsync)
Face restoration (GFPGAN)
[Need to fix] Search for talking faces, determining what this person is saying
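
To give an intuition for how a distance threshold such as DIST_TRESH can drive the reidentification step, here is a self-contained sketch. It is an assumption-heavy illustration, not the project's code: it uses cosine distance on plain Python lists (the metric the project actually uses is not stated here), and `assign_identity` is a hypothetical helper.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors: 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def assign_identity(embedding, known, dist_thresh):
    """Return the index of the closest known identity, or register a new one.

    `known` is a list of reference embeddings, one per identity, and is
    extended in place when no reference is within `dist_thresh`.
    """
    best_idx, best_dist = None, float("inf")
    for idx, reference in enumerate(known):
        dist = cosine_distance(embedding, reference)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    if best_idx is not None and best_dist <= dist_thresh:
        return best_idx
    known.append(embedding)
    return len(known) - 1
```

In this sketch, a lower threshold makes matching stricter, so the same person is more likely to be split into several identities, while a higher one risks merging different people.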

Translation results 🥺

Note that this example was created without GFPGAN usage!

Destination language	Source video	Output video
🇷🇺 (Russian)

Contributing 🫵🏻

Contributions are welcome! I am very glad that so many people are interested in my project, and I will be happy to see your pull requests. In the future, all contributors will be included in a list displayed here!

To-Do List 🤷🏼‍♂️

Full GPU support
Multithreading support (optimizations)
Detecting talking faces (improvement)

Other 🤘🏻

Tested on macOS
⚠️ The project is under development!
