wakeword accuracy issue #207

Open
sangheonEN opened this issue Oct 8, 2024 · 0 comments

I trained a custom wake word model and used it with STT, but the prediction scores it outputs fall in a range too narrow to be usable.
thomas_oww.zip

I am Korean, so the test voice is a Korean male speaker.

When I trained a model for a specific wake word with the model generation code provided in the Colab notebook and applied it, the prediction scores stayed in the 0.00x range, so I couldn't set a meaningful wake threshold. Has anyone had a similar experience?

In my experiment, with only silence, the oww prediction score ranged from 0.0008 to 0.00099.

When I shouted the wake word, it went up to 0.0015.

But when I made a loud noise without saying the wake word, it went up to 0.002.
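
To make the overlap concrete, here is a minimal sketch that checks whether any single threshold could separate the scores reported above. The helper name find_threshold is hypothetical and not part of openWakeWord; it only works on the numbers from this post.

```python
def find_threshold(positive_scores, negative_scores):
    """Return a threshold separating the two score sets, or None if they overlap.

    A usable threshold must sit above every negative (non-wake-word) score
    and at or below every positive (wake word) score.
    """
    lowest_positive = min(positive_scores)
    highest_negative = max(negative_scores)
    if highest_negative >= lowest_positive:
        return None  # the ranges overlap, so no threshold works
    # Midpoint between the two ranges
    return (lowest_positive + highest_negative) / 2


# Scores reported in this issue
wake_word = [0.0015]                      # shouting the wake word
not_wake_word = [0.0008, 0.00099, 0.002]  # silence (low/high) and loud noise

print(find_threshold(wake_word, not_wake_word))  # None
```

Because the loud-noise score (0.002) is higher than the wake word score (0.0015), the function returns None: no threshold value can accept the wake word while rejecting the noise.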

So my conclusion is that it isn't usable yet: the oww scores don't distinguish the wake word from ordinary noise.

Is it because it only supports English?

The scores above were printed by the line print(f"idx = {idx}, scores[-1] = {scores[-1]}"), which is commented out in the code below.

def _process_wakeword(self, data):
    """
    Processes audio data to detect wake words.
    """
    if self.wakeword_backend in {'pvp', 'pvporcupine'}:
        pcm = struct.unpack_from(
            "h" * self.buffer_size,
            data
        )
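        porcupine_index = self.porcupine.process(pcm)
        if self.debug_mode:
            print(f"wake words porcupine_index: {porcupine_index}")
        # Return the result computed above; calling process() a second time
        # would run the detector again on the same frame.
        return porcupine_index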

    elif self.wakeword_backend in {'oww', 'openwakeword', 'openwakewords'}:
        pcm = np.frombuffer(data, dtype=np.int16)
        prediction = self.owwModel.predict(pcm)
        # print(f"prediction = {prediction}\n")
        max_score = -1
        max_index = -1
        wake_words_in_prediction = len(self.owwModel.prediction_buffer.keys())
        if wake_words_in_prediction:
            for idx, mdl in enumerate(self.owwModel.prediction_buffer.keys()):
                scores = list(self.owwModel.prediction_buffer[mdl])
                # print(f"idx = {idx}, scores[-1] = {scores[-1]}")
                if scores[-1] >= self.wake_words_sensitivity and scores[-1] > max_score:
                    max_score = scores[-1]
                    max_index = idx
            if self.debug_mode:
                print (f"wake words oww max_index, max_score: {max_index} {max_score}")
            # print(f"max_index = {max_index}\n")
            return max_index  
        else:
            if self.debug_mode:
                print (f"wake words oww_index: -1")
            return -1

    if self.debug_mode:        
        print("wake words no match")
    return -1
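
For testing the threshold logic without a microphone, the selection step in the oww branch can be pulled out into a standalone function. This is only a sketch under assumptions: pick_wake_word is a hypothetical name, the "thomas" key is a made-up model name, and the plain dict of deques stands in for self.owwModel.prediction_buffer as it is read in the code above.

```python
from collections import deque


def pick_wake_word(prediction_buffer, sensitivity):
    """Return the index of the highest-scoring model at or above `sensitivity`.

    `prediction_buffer` maps model name -> sequence of recent scores, mirroring
    how the code above reads self.owwModel.prediction_buffer.
    Returns -1 when no model's latest score reaches the threshold.
    """
    max_score = -1.0
    max_index = -1
    for idx, scores in enumerate(prediction_buffer.values()):
        if not scores:
            continue  # no predictions recorded yet for this model
        latest = scores[-1]
        if latest >= sensitivity and latest > max_score:
            max_score = latest
            max_index = idx
    return max_index


# Hypothetical buffer holding the scores from this issue
buffer = {"thomas": deque([0.0009, 0.0015], maxlen=30)}

print(pick_wake_word(buffer, 0.5))    # -1
print(pick_wake_word(buffer, 0.001))  # 0
```

With scores this low, a threshold of 0.5 never triggers, while a threshold of 0.001 would also trigger on the loud noise (0.002) described above.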