Audio transcription with SOME #911

Merged: 8 commits, Nov 25, 2023

Conversation

Contributor

@oxygen-dioxide commented Oct 29, 2023

This PR adds integration with SOME, a tool that converts audio to musical notes. Users can right-click a wave part and select "Transcribe" to convert it into a voice part.

Before using this feature, please download the SOME model from https://github.com/xunmengshe/OpenUtau/releases/download/0.0.0.0/some-0.0.1.oudep

@wolfgitpr
Contributor

@maiko3tattun May I ask whether the loading dialog can be used here?

@maiko3tattun
Contributor

maiko3tattun commented Oct 29, 2023

> @maiko3tattun May I ask if the loading dialog can be used here.

Perhaps it could be used.
However, MessageBox.ShowModal() might be a better fit than LoadingNotification (#892). MessageBox.ShowModal() is already used when creating FRQ files.

@yqzhishen
Contributor

@oxygen-dioxide Maybe we should add checks to prevent very long chunks after slicing, for example when someone clicks "Transcribe" on an accompaniment track that has no silent sections at all. The model contains attention operators and can run out of memory (OOM) on very long sequences.
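
One way such a guard could look is sketched below. It is not taken from this PR: `ChunkGuard`, `LimitChunkLength`, the `(Start, Duration)` tuple layout, and the 60-second limit are all hypothetical, and a real limit would depend on the model and the available memory. The sketch splits overlong chunks into equal pieces. Refusing to transcribe them and showing an error would be another option, since a hard cut can land in the middle of a note.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkGuard {
    // Assumed limit for illustration only; not a value from the SOME model.
    public const double MaxChunkSeconds = 60.0;

    // Returns the chunks with every chunk longer than maxSeconds split into
    // equal pieces, each no longer than maxSeconds. Shorter chunks pass through.
    public static List<(double Start, double Duration)> LimitChunkLength(
        IEnumerable<(double Start, double Duration)> chunks,
        double maxSeconds = MaxChunkSeconds) {
        if (maxSeconds <= 0) {
            throw new ArgumentOutOfRangeException(nameof(maxSeconds));
        }
        var result = new List<(double Start, double Duration)>();
        foreach (var (start, duration) in chunks) {
            // Number of pieces needed so that no piece exceeds the limit.
            int pieces = Math.Max(1, (int)Math.Ceiling(duration / maxSeconds));
            double pieceDuration = duration / pieces;
            for (int i = 0; i < pieces; i++) {
                result.Add((start + i * pieceDuration, pieceDuration));
            }
        }
        return result;
    }
}
```

With these assumptions, a 150-second chunk with no silence would be passed to the model as three 50-second pieces instead of one long sequence.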

//msgbox?.SetText($"{text} {part.name}\n{wavPosS}/{wavDurS}");
msgbox.SetText(string.Format("{0} {1}\n{2} / {3}",text, part.name, wavPosS, wavDurS));
});
transcribeTask.Wait();
Owner

Instead of Wait(), use ContinueWith(). See RegenFrq() for an example. Modify the project and show/hide the dialog on the scheduler obtained from TaskScheduler.FromCurrentSynchronizationContext().
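
A minimal sketch of this pattern follows, for illustration only. It is not the code from RegenFrq(): `TranscribeSketch`, `Transcribe`, `Start`, and the `onDone` callback are hypothetical names. The point is that Wait() blocks the calling thread until the task ends, while ContinueWith() returns immediately and runs the follow-up on the scheduler passed to it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TranscribeSketch {
    // Hypothetical stand-in for the long-running transcription work.
    static string Transcribe(string partName) {
        Thread.Sleep(50); // simulate model inference
        return $"notes for {partName}";
    }

    // Starts the work on the thread pool and returns without blocking.
    // onDone runs on uiScheduler after the work has finished or failed,
    // which is where the project would be modified and the dialog closed.
    public static Task Start(string partName, TaskScheduler uiScheduler,
                             Action<string, Exception> onDone) {
        return Task.Run(() => Transcribe(partName))
            .ContinueWith(task => {
                if (task.IsFaulted) {
                    onDone(null, task.Exception?.GetBaseException());
                } else {
                    onDone(task.Result, null);
                }
            }, uiScheduler);
    }
}
```

In UI code the caller would capture `TaskScheduler.FromCurrentSynchronizationContext()` on the UI thread and pass it as `uiScheduler`. That call throws if the current thread has no synchronization context, so a console test would pass `TaskScheduler.Default` instead.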

@@ -8,6 +8,8 @@
<system:String x:Key="context.part.delete">Delete part</system:String>
<system:String x:Key="context.part.rename">Rename part</system:String>
<system:String x:Key="context.part.replaceaudio">Reselect audio file</system:String>
<system:String x:Key="context.part.transcribe">Transcribe</system:String>
Owner

Even though transcribe is correct here, a more specific description like "Transcribe audio to create a note part" could be helpful.

@oxygen-dioxide changed the title from "[Help wanted] Audio transcription with SOME" to "Audio transcription with SOME" on Nov 18, 2023
@stakira merged commit 6912603 into stakira:master on Nov 25, 2023
1 check was pending