Wake Word Unit Testing

Introduction

Wake Word Unit Testing is a crucial process for validating the reliability of voice recognition systems in real-world scenarios. At its core, this approach involves batch-comparing audio file transcriptions against the recognized wake words defined by your model.

Ensuring Accuracy in Real-World Environments

The objective is clear: to verify that your Wake Word model can accurately interpret user or employee inputs, even under conditions that mimic production environments. This ensures that the system performs as expected when deployed, minimizing errors and maximizing efficiency.

The key to effective Wake Word Unit Testing lies in the quality and relevance of the audio data used. You’re not just testing with any audio—you’re using production-quality recordings that reflect the actual conditions in which the system will operate.

This means using the same microphone setups, capturing specific voice accents, and even simulating noisy environments where commands may be issued. By doing so, you create a robust testing framework that accounts for variables like background noise, speech patterns, and hardware differences, all of which can impact the accuracy of voice recognition.

Beyond technical validation, this method bridges the gap between controlled lab testing and the unpredictable nature of live environments, ultimately leading to a more seamless and trustworthy development process.

How to proceed with unit testing?

Access unit tests through the dedicated button in the Wake Words widget. Note that you must have access to the Wake Words technology to view this page.

Unit Test 72.png — Unit Testing Wake Word models: **Available Unit Tests**

Use the New Unit Tests button (2) to start creating a Unit Test.

Test the Selected Model (1a) using the Run the unit test using selected model button (3). This creates a pending process and a Background Task. See the Background Tasks documentation for details.

You can attach an optional Speech Enhancement Model using the Speech Enhancement select (1b). This applies audio modifiers to each test file, enabling use of noisy files that reflect real work situations. See the Speech Enhancement Widget for details about how to set up a Speech Enhancement model.

The Edit this unit test button (4) allows you to fix transcriptions, add or remove tests, or adjust the confidence threshold. You can Remove Tests (5) without affecting result history. Test results remain independent of the Unit Tests used.

To verify that everything is set up correctly inside the test, use the Verify this unit test using selected model button (6). It checks that every audio transcription within the test is a possible voice command in your selected model’s grammar.

In this example, several audio files do not match the model’s possibilities.

Grammar Verification checks that the grammar contains a solution matching each audio transcription of the Unit Test. If no solution can be found, the result is Not In Grammar. Partial means that the transcription can be included in a solution, but not as an exact match. When an exact solution is found, the grammar matches the audio transcription: the result is In Grammar and the test is ready to be executed.
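
The three verification outcomes can be illustrated with a minimal Python sketch. It assumes the grammar can be flattened into a plain list of accepted phrases and that Partial means the transcription appears as a contiguous run of words inside a longer phrase. Both points are assumptions made for illustration, and the names GrammarMatch and verify_transcription are hypothetical; the actual verification used by the product may work differently.

```python
from enum import Enum


class GrammarMatch(Enum):
    IN_GRAMMAR = "In Grammar"
    PARTIAL = "Partial"
    NOT_IN_GRAMMAR = "Not In Grammar"


def verify_transcription(transcription: str, grammar_phrases: list[str]) -> GrammarMatch:
    """Classify a transcription against a flat list of phrases (hypothetical grammar model)."""
    words = transcription.split()
    if not words:
        return GrammarMatch.NOT_IN_GRAMMAR

    partial = False
    n = len(words)
    for phrase in grammar_phrases:
        phrase_words = phrase.split()
        # Exact solution: the transcription is one of the accepted phrases.
        if words == phrase_words:
            return GrammarMatch.IN_GRAMMAR
        # Partial (assumed meaning): the transcription is a contiguous
        # run of words inside a longer accepted phrase.
        if any(phrase_words[i:i + n] == words
               for i in range(len(phrase_words) - n + 1)):
            partial = True

    return GrammarMatch.PARTIAL if partial else GrammarMatch.NOT_IN_GRAMMAR


# Illustrative phrases only, not taken from a real model.
grammar = ["hey assistant", "ok assistant start"]
print(verify_transcription("hey assistant", grammar).value)
print(verify_transcription("assistant start", grammar).value)
print(verify_transcription("hello world", grammar).value)
```

In this sketch the comparison is strict on case and spelling, so a transcription written as "Hey Assistant" would not be In Grammar for the phrase "hey assistant".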

Unit Test 73.png — Unit Testing Wake Words models: **Existing Results of selected model**

Click Refresh Results (2) once the task is complete to display the Test Conclusion and enable the See results in details button (3), even if the test failed. Test failures are valuable feedback, and guidance on interpreting them is provided in the second part of this documentation: Understand Results.

You can Remove tests results (4) from the history if desired. This action does not affect your tests, models, or other project data. Results can be filtered using the Test Conclusion filter (5).

How to create a Unit Test?

After completing the Create Unit Test Form, you reach the page where you edit the test content. While browsing your Audio Asset Library, ensure every file you add to the test is transcribed.

Unit Test 2.png — Edit transcribed Audio Files in the Test

Your Unit Tests need at least one Audio Asset, which you can obtain by using the record or import (1) features. Refer to our dedicated documentation for more details.

Your Audio Asset must be transcribed before it can be used in a Unit Test. Hover over the table row to reveal the transcription button (3). Transcriptions should contain only one sentence and no punctuation.
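
If you prepare transcriptions outside the interface, a small pre-check can catch rule violations before files are added to a test. The Python sketch below is only an illustration: the function name transcription_problems is hypothetical, and it treats every Unicode punctuation character, including apostrophes and hyphens, as forbidden, which is an assumption since the exact rule for those characters is not specified here.

```python
import unicodedata


def transcription_problems(transcription: str) -> list[str]:
    """Return a list of rule violations; an empty list means the text looks acceptable."""
    problems = []
    text = transcription.strip()

    if not text:
        problems.append("transcription is empty")
    if "\n" in text:
        # Several lines suggest more than one sentence.
        problems.append("transcription spans several lines")

    # Unicode categories starting with "P" are punctuation (assumed all forbidden).
    found = sorted({ch for ch in text if unicodedata.category(ch).startswith("P")})
    if found:
        problems.append("contains punctuation: " + " ".join(found))

    return problems


print(transcription_problems("hey assistant"))
print(transcription_problems("Hey, assistant!"))
```

An empty list means the transcription passes these checks; it does not guarantee that the phrase exists in the model’s grammar.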

You can then add your files to the test using the Add to test button (4a), or select specific files (2) and add them all at once with the Add selected files button (4b). If you organized your audio asset files within folders, you can add a folder directly to your test using the Add Folder to Test button (4c). Remember that every file in this folder must be transcribed to be used in the test.

Once your files have been added to the test, you can continue editing it by playing the audio (1) to verify that it exactly matches the transcription (2).

Understand Results

When a Test leaves the pending state, it becomes either a success or a failure. A success means all Unit Tests inside it pass. If any Unit Test fails, the entire Test fails.
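
The all-or-nothing rule can be written as a short Python sketch. The function name overall_conclusion and the representation of individual results as booleans are assumptions made for illustration.

```python
def overall_conclusion(unit_test_passed: list[bool]) -> str:
    """Return "success" only if every individual result passed, otherwise "failure"."""
    if not unit_test_passed:
        # A Test needs at least one audio file, so an empty list is treated as an error here.
        raise ValueError("at least one individual result is required")
    return "success" if all(unit_test_passed) else "failure"


print(overall_conclusion([True, True, True]))
print(overall_conclusion([True, False, True]))
```

A single failing file is therefore enough to mark the whole Test as failed, which is why the per-file details matter when diagnosing a failure.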

Below are the details of these tests, to help you understand Unit Test Results in depth:

This page displays the Unit Test Results, which include:

Test parameters at the top of the screen:
- Case-sensitive comparison setting
- Confidence threshold
- Date and time when the test execution completed

The first tab, featuring a table for each individual test:
- A comparison of the Expected Result and the Test Result
- A Confidence Score associated with the Test Result, indicating the recognizer’s confidence that the Test Result matches what was actually said
- A Conclusion for each test: Fail or Success, depending on the comparison strategy
- A Grammar Match indication, which helps explain why a test is failing

Grammar Match: The Test includes a verification that the grammar contains a solution matching the expected result of the Test. If no solution can be found, the result is Not In Grammar. Partial means that the transcription can be included in a solution, but not as an exact match. When an exact solution is found, the grammar matches the audio transcription and the result is In Grammar; the test conclusion then depends only on the recognizer.

Case-Sensitive Comparison

The voice command recognizer returns the most probable result from the grammar options. This means it can only answer with commands defined in the grammar. Consequently, unit tests will strictly compare audio transcriptions to the recognized commands, including case and orthography.

This strict comparison can sometimes cause a test to fail for the wrong reason: the command may be recognized correctly, but a poorly written transcription can trigger a failure.
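
The effect of the comparison setting can be shown with a minimal Python sketch. It assumes that disabling case sensitivity simply ignores letter case and nothing else; the function name results_match is hypothetical, and the product’s actual comparison may differ.

```python
def results_match(expected: str, recognized: str, case_sensitive: bool = True) -> bool:
    """Compare the transcription (expected) with the recognized command."""
    if case_sensitive:
        # Strict comparison: case and spelling must be identical.
        return expected == recognized
    # Relaxed comparison (assumed behavior): ignore letter case only.
    return expected.casefold() == recognized.casefold()


# The transcription was typed with a capital letter; the grammar uses lowercase.
expected = "hey Assistant"
recognized = "hey assistant"
print(results_match(expected, recognized, case_sensitive=True))
print(results_match(expected, recognized, case_sensitive=False))
```

Here the recognizer answered correctly, yet the strict comparison reports a mismatch because of the capital letter in the transcription. Spelling differences would still fail in both modes, so correcting the transcription remains the reliable fix.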