Add lip sync pipeline [50 LPT] #35

rickstaa · 2024-07-12T10:02:57Z

As outlined in our treasury proposal, during phase 2 of the AI SPE roadmap we are collaborating closely with seven startups. These startups, which share our dedication to decentralized AI, serve as design partners, providing valuable feedback to enhance the AI subnet user experience and prepare it for onboarding more mature scale-ups and applications.

One of these startups aims to enable users to bring their static artistic photos to life using AI. To support this, the AI SPE team is working on integrating a new LipSync pipeline into the AI subnet. This pipeline will allow users to provide audio that will be lip-synced to their images.

We are calling on the community to lead the first step of this integration by researching this new pipeline and creating a proof of concept (POC) in the AI worker 🔧. Once this stage is complete, we can begin integrating the pipeline into the go-livepeer subnet and subsequently implement a text-to-speech pipeline to provide the full experience the startup is seeking 🚀.

Bounty Requirements

To successfully complete this bounty, the participant should:

Create a brief report comparing several LipSync models and provide a recommendation on which model to implement first, including a rationale for the choice. The report should be concise, demonstrating a well-informed decision.
Implement a working /lipsync route and pipeline in the AI worker repository, making this capability available on port 8005. This pipeline should accept an audio file and a photo, and provide the user with a talking avatar. While we may expand to support video files in the future, we will start with photos to create talking heads.
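As a rough illustration of the second requirement, the sketch below builds the kind of multipart request a client might send to the new route. It uses only the Python standard library. The form field names `image` and `audio`, the file names, and the `localhost` URL are assumptions made for illustration; the actual request schema is up to the implementer.

```python
import uuid
from typing import Dict, Tuple


def build_multipart(files: Dict[str, Tuple[str, bytes, str]]) -> Tuple[bytes, str]:
    """Encode files as multipart/form-data; returns (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    parts = []
    for field, (filename, data, mime) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        parts.append(header.encode() + data + b"\r\n")
    # The closing boundary marks the end of the form data.
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


# Field names "image" and "audio" are hypothetical, not taken from the AI worker.
body, content_type = build_multipart({
    "image": ("portrait.png", b"<png bytes>", "image/png"),
    "audio": ("speech.wav", b"<wav bytes>", "audio/wav"),
})

# Sending the request (not executed here, since it needs a running worker):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8005/lipsync",  # assumed local URL for the new route
#     data=body,
#     headers={"Content-Type": content_type},
#     method="POST",
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

The response format is left open here; per the tips further down, the pipeline is expected to return frames that can be transcoded into an MP4.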

This bounty does NOT include:

The full end-to-end implementation of this pipeline on the go-livepeer side, including the payment logic and job routing. This will be tackled either by the AI SPE team or in a subsequent bounty.

Required skillset

Bounty Level: Intermediate

Experience in understanding and interpreting generative AI research papers.
Proficiency in implementing generative AI models in Python using pre-trained weights.
Familiarity with FastAPI.
Strong Python programming skills.

Implementation Tips

This section offers some tips to get you started. Because this bounty is more involved, you will also have direct access to the engineering team to ask questions about unclear code or implementation decisions.

Lipsync Pipeline example

You can see the proposed lipsync (or talking heads) pipeline in action by going to:

Example Lipsync Models

A quick search provided us with the following example LipSync models that could be used:

However, feel free to suggest any LipSync model you deem fit.

Tips for creating the AI Worker Pipeline

To understand how to create a new AI worker pipeline, you can refer to recent pull requests where new pipelines were added:

The steps to add a new pipeline on the AI worker side are relatively straightforward:

Add the model weights download command for your chosen model to the dl_models.sh file.
If needed, add the requirements for running the model inference to the AI runner Dockerfile.
Copy the image-to-video route, and implement the required request and response types. Since the LipSync pipeline is also expected to return frames that can be transcoded into an MP4, much of the image-to-video code can be reused.
Copy the image-to-video pipeline, and replace the logic in the __init__ method to load the chosen LipSync model on the GPU.
Replace the logic in the __call__ method with the actual LipSync inference logic.
Ensure your new lipsync route is attached to the FastAPI server in the main.py file.
If everything was done correctly, you can start the FastAPI server, specify the new pipeline and model, and interact with your new pipeline at the /docs path on port 8005.
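The `__init__`/`__call__` split described in the steps above could look roughly like the following skeleton. This is a minimal sketch under stated assumptions, not the AI worker's actual code: the `LipsyncPipeline` class name, the `Frame` type, the `audio_seconds` parameter, the default of 25 frames per second, and the dummy model function are all hypothetical placeholders standing in for a real LipSync model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    """Placeholder for one output video frame (a real pipeline would return images)."""
    index: int
    source_image: bytes


def dummy_lipsync_model(image: bytes, audio: bytes, num_frames: int) -> List[Frame]:
    """Stand-in for real inference: repeats the input image once per output frame."""
    return [Frame(index=i, source_image=image) for i in range(num_frames)]


class LipsyncPipeline:
    """Hypothetical skeleton: load the model in __init__, run inference in __call__."""

    def __init__(self, model_id: str, fps: int = 25):
        self.model_id = model_id
        self.fps = fps
        # A real implementation would load the chosen LipSync weights onto the GPU here.
        self.model: Callable[[bytes, bytes, int], List[Frame]] = dummy_lipsync_model

    def __call__(self, image: bytes, audio: bytes, audio_seconds: float) -> List[Frame]:
        if not image or not audio:
            raise ValueError("both an image and an audio clip are required")
        # One frame per 1/fps seconds of audio, with at least one frame.
        num_frames = max(1, round(audio_seconds * self.fps))
        return self.model(image, audio, num_frames)


pipeline = LipsyncPipeline(model_id="example/lipsync-model")  # hypothetical model ID
frames = pipeline(image=b"fake-image-bytes", audio=b"fake-audio-bytes", audio_seconds=2.0)
print(len(frames))  # 2.0 s of audio at 25 fps gives 50 frames
```

Keeping model loading in `__init__` means the weights are loaded once when the pipeline is created, while `__call__` handles each request; the route can then pass the returned frames on for transcoding.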

The ai-worker repository contains a development guide that can help you get started debugging your changes.

How to Apply

Express Your Interest: Comment on this issue to indicate your interest and explain why you're the ideal candidate for the task.
Wait for Review: Our team will review expressions of interest and select the best candidate.
Get Assigned: If selected, we'll assign the GitHub issue to you.
Start Working: Dive into your task! If you need assistance or guidance, comment on the issue or join the discussions in the #🛋│developer-lounge channel on our Discord server.
Submit Your Work: Create a pull request in the relevant repository and request a review.
Notify Us: Comment on this GitHub issue when your pull request is ready for review.
Receive Your Bounty: We'll arrange the bounty payment once your pull request is approved.
Gain Recognition: Your valuable contributions will be showcased in our project's changelog.

Thank you for your interest in contributing to our project 💛!

Warning

Please wait for the issue to be assigned to you before starting work. To prevent duplication of effort, submissions for unassigned issues will not be accepted.

rickstaa · 2024-07-12T10:07:44Z

This bounty was assigned to @pschroedl and was completed last week. It appears the bounty was not initially posted on this repository, so I have posted it retroactively for visibility.

rickstaa · 2024-07-12T10:08:24Z

The submission can be found here and will be reviewed in the coming week.

rickstaa · 2024-08-16T08:40:25Z

This was implemented in livepeer/ai-worker#120 and has already been paid out on chain 🎉. All bounty transactions can be found on the AI SPE wallet.

rickstaa added the AI AI SPE bounties label Jul 12, 2024

rickstaa added the bounty Software bounies. label Jul 12, 2024

rickstaa closed this as completed Aug 16, 2024
