Audio Transcript heuristic for dynamic thread allocation on client #2023

Open
wants to merge 20 commits into
base: master

gfd2020
Collaborator

@gfd2020 gfd2020 commented Dec 11, 2023

This PR implements a heuristic that always tries to keep the remote audio transcription server busy while not leaving the client idle. I believe that this heuristic should only be turned on when the remote transcription server is slow. For fast servers, it might be better to leave it turned off.

It is based on 3 principles:

  1. Dynamically adjust the number of remote transcription threads based on server response speed.

  2. Rearrange the audio items so that they are spread across the queue. This helps the client always have something to process, instead of sitting idle or only doing that work at the end.

  3. The client also helps the server with the transcription task, but only when the client has no other tasks to do.
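
As a rough illustration of principles 1 and 3, here is a minimal Python sketch. It is not the code in this PR: the function names, the latency-based rule, the `tolerance` factor and the thread limits are all assumptions made for the example. It only shows the general shape of "add a thread while the server keeps up, remove one when it slows down" and "help locally only when idle".

```python
def adjust_threads(current, latency_s, baseline_s,
                   min_threads=1, max_threads=8, tolerance=1.25):
    """Return the next number of remote transcription threads.

    Hypothetical rule (an assumption, not the PR's actual logic):
    while the last response time stays within `tolerance` times the
    baseline, the server can take more work, so add one thread;
    otherwise it looks saturated, so remove one thread.
    """
    if latency_s <= baseline_s * tolerance:
        proposed = current + 1
    else:
        proposed = current - 1
    # Clamp the result to the allowed range.
    return max(min_threads, min(max_threads, proposed))


def client_should_help(pending_local_tasks, help_enabled):
    """The client transcribes locally only if enabled and otherwise idle."""
    return help_enabled and pending_local_tasks == 0
```

For example, `adjust_threads(2, 1.0, 1.0)` proposes 3 threads, while `adjust_threads(2, 2.0, 1.0)` drops back to 1, and `client_should_help(3, True)` is false because the client still has its own work queued.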

This heuristic must be configured in the 'AudioTranscriptConfig.txt' configuration file:


#Performs a heuristic for dynamic thread allocation and spaced requeue. Helps improve performance of slow transcription servers.
clientDynamicThreadRequeueHeuristics = true

#If active, the client will also help the server with the transcription task, only if the client has no other tasks to do. The heuristic must be turned on
clientTranscriptHelp = true

#Defines the implementation class for client help, must be a local implementation ( not remote transcript task )
clientTranscriptHelpImplementationClass = iped.engine.task.transcript.Wav2Vec2TranscriptTask

#Advanced Parameter. Defines which part of the queue the items will be sent to. 4 = 1/4 size. Values greater than or equal to 1
clientSplitQueueRatio = 4

#Advanced Parameter. Sets the delta time in milliseconds when consecutive items are requested to be requeued, provides better spacing.
clientRequeueDeltaTime = 5000
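
To make the `clientSplitQueueRatio` setting more concrete, the following Python sketch parses `key = value` lines like the ones above and computes a requeue position. The helper names `parse_config` and `requeue_index` are hypothetical, and reading "4 = 1/4 size" as "reinsert at one quarter of the current queue size" is an assumption about the parameter, not a description of the PR's implementation.

```python
def parse_config(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def requeue_index(queue_size, split_ratio):
    """Assumed meaning of clientSplitQueueRatio: position = size / ratio."""
    if split_ratio < 1:
        raise ValueError("clientSplitQueueRatio must be >= 1")
    return queue_size // split_ratio


sample = """
#Advanced Parameter.
clientSplitQueueRatio = 4
clientRequeueDeltaTime = 5000
"""
cfg = parse_config(sample)
ratio = int(cfg["clientSplitQueueRatio"])
print(requeue_index(100, ratio))  # prints 25 under the assumption above
```

Under this reading, a ratio of 1 would send items to the end of the queue, and larger ratios would bring them closer to the front.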


To test the PR, the parameters must be uncommented; by default they are turned off.
Audio transcription must be turned on and configured as usual.

Test cases: any UFDR report with multiple processing items in addition to the audio to be transcribed.

@lfcnassif
Member

Thank you @gfd2020! I think I'll only have time to review this when I return from vacation next year, in the second half of January, unless another dev reviews it before me.

2 participants