Wake Word Widget

A Wake Word is a specific phrase that activates a device which has been listening passively in low-power mode, enabling efficient and private hands-free interaction.

Widget Navigation

Let’s start by identifying the widget's main options.

Wake Word 1.png

Navigate between your models

  • Navigate back (1) to the Project Hub

  • Change the Selected Model (2) you are editing

  • The widget global editing tools (3):

    • Add this model to Favorites

    • Create a new model

    • Rename the model

    • Configure the recognizer

    • Delete the model

  • Change the Model’s Language (4)

  • Quick Test (5) your model in real time by speaking within the widget

Wake Word Editor

Wake Word 2.png

Wake Words now have their own editor!

  • Enter a Wake Word or Anti Wake Word in the dedicated input fields (1), then press Enter to create a Word (2). Select a Word by clicking on its row.

  • Edit a Wake Word using the Rename Wake Word buttons (3) or remove it with the Delete Wake Word buttons (4).

  • When a Wake Word is selected, you can optionally specify one or more pronunciation overrides using the add pronunciation button (5).

  • Add optional phonemes for a selected Wake Word to resolve recognizer ambiguities or override recognition. Use the add phoneme button (6) or our GenAI tool to generate phonemes based on provided context.

  • Use the dedicated buttons to edit (8) or delete (9) phonemes or pronunciations.

Please note that you cannot use pronunciations and phonemes at the same time.
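
The exclusivity rule above can be pictured with a minimal sketch. This is not the widget's code: `WakeWordEntry`, its fields, and the sample pronunciation string are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WakeWordEntry:
    """Hypothetical in-memory model of one row in the Wake Word editor."""
    text: str
    pronunciations: List[str] = field(default_factory=list)  # pronunciation overrides
    phonemes: List[str] = field(default_factory=list)        # explicit phoneme strings

    def add_pronunciation(self, value: str) -> None:
        # Assumed rule: a Word holds pronunciations OR phonemes, never both.
        if self.phonemes:
            raise ValueError("Remove the phonemes before adding a pronunciation.")
        self.pronunciations.append(value)

    def add_phoneme(self, value: str) -> None:
        if self.pronunciations:
            raise ValueError("Remove the pronunciations before adding a phoneme.")
        self.phonemes.append(value)


entry = WakeWordEntry("Vivoka")
entry.add_pronunciation("vee-vo-ka")  # illustrative spelling, not a real override
```

With this model, calling `entry.add_phoneme(...)` afterwards raises a `ValueError`, mirroring the restriction in the editor.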

Why use phonemes?

Explicit phonemes are useful (and sometimes necessary) in the following scenarios:

  1. Proper Names & Specialized Jargon:

    • Brand names: "Vivoka" might be guessed as "Vi-vo-ka" or "Vai-vo-ka". Explicit phonemes ensure consistency.

    • Acronyms: "SaaS" (/sæs/) vs "SAAS" (/ɛs eɪ eɪ ɛs/).

    • Medical/Technical terms: Complex words that do not follow standard pronunciation rules.

  2. Ambiguous Pronunciations:

    • Homographs: "Lead" (to guide) vs "Lead" (the metal). Phonemes allow you to disambiguate pronunciation depending on context.

    • Foreign words: Using a French word inside an English grammar (e.g., "croissant"). An English recognizer might mangle it; explicit phonemes let you map it to the closest English sounds.

  3. Accent Adaptation:

    • If you want "tomato" to strictly match /təˈmeɪtoʊ/ (US) or /təˈmɑːtəʊ/ (UK) regardless of the speaker’s accent, you can enforce it (though usually it's better to let the acoustic model handle this).

Summary: You usually do not need phonemes for standard dictionary words. You do need them for made-up words (like "Leef"), names, acronyms, or whenever testing shows that the recognizer mishears a specific command.

Testing The Wake Word Model

Wake Word 3.png

The test panel

When the test panel is open and ready, click Start recording (1) to run the model for one minute. Adjust the confidence threshold (2) in real time during the test to display only hypotheses that meet the minimum confidence score. Speak the wake word or anti wake word aloud to see the corresponding hypothesis appear (3), depending on your model's recognizer parameters.

How to read a hypothesis?

image-20251210-134006.png

Zoom on hypothesis

The detected commands are Wake Words, marked with the POSITIVE tag, which is automatically added to each Wake Word. Anti Wake Words bear the NEGATIVE tag. In this example, the confidence score is above 6000, which exceeds the 4000 threshold, so the hypothesis is displayed. Alternative hypotheses contain the same information as the main hypothesis.
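
As a rough illustration of how the tags and the threshold combine, here is a minimal sketch. The `Hypothesis` structure, the sample phrases, and the use of `>=` for the comparison are assumptions made for this example; they do not describe the widget's actual output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    text: str        # recognized Wake Word or Anti Wake Word
    tag: str         # "POSITIVE" for Wake Words, "NEGATIVE" for Anti Wake Words
    confidence: int  # confidence score reported for the hypothesis


def visible_hypotheses(hyps: List[Hypothesis], threshold: int) -> List[Hypothesis]:
    """Keep only the hypotheses that meet the minimum confidence score."""
    return [h for h in hyps if h.confidence >= threshold]


def is_wake_word(h: Hypothesis) -> bool:
    """A hypothesis is a Wake Word detection when it carries the POSITIVE tag."""
    return h.tag == "POSITIVE"


# Illustrative values only.
sample = [
    Hypothesis("hey device", "POSITIVE", 6200),
    Hypothesis("hey advice", "NEGATIVE", 3500),
]
shown = visible_hypotheses(sample, threshold=4000)
```

Here only the first hypothesis passes the 4000 threshold, and `is_wake_word` confirms it is a Wake Word rather than an Anti Wake Word.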

What are the recognizer options?

image-20251210-141453.png

Wake Word Recognizer parameters

Based on the Recognizer parameters documentation page, the parameters below are the most commonly used ones and have a real impact on Quick Testing. Remember that you can still change them after downloading the project.

TSILENCE: The period of silence that must elapse before the Wake Word is considered detected. Any speech recognized afterward is treated as a new Wake Word.

STREAM_RESULT_MODE: Results appear while you speak. Even if the TSILENCE period has not yet elapsed, "partial" results are still displayed.

MAXNBEST: The maximum number of hypotheses considered and displayed in the test.
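
To make the effect of TSILENCE and MAXNBEST more concrete, the following sketch simulates them on plain Python data. It is not the recognizer's implementation: the function names, the millisecond unit, and the ordering by confidence are assumptions made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start_ms, end_ms) of one stretch of speech


def split_on_silence(segments: List[Segment], tsilence_ms: int) -> List[List[Segment]]:
    """Group speech segments; a gap of at least tsilence_ms starts a new group."""
    groups: List[List[Segment]] = []
    for start, end in segments:
        if groups and start - groups[-1][-1][1] < tsilence_ms:
            groups[-1].append((start, end))  # silence too short: same utterance
        else:
            groups.append([(start, end)])    # silence long enough: new utterance
    return groups


def n_best(scored: List[Tuple[str, int]], maxnbest: int) -> List[Tuple[str, int]]:
    """Return at most maxnbest hypotheses, highest confidence first."""
    return sorted(scored, key=lambda h: h[1], reverse=True)[:maxnbest]


utterances = split_on_silence([(0, 400), (500, 900), (2000, 2400)], tsilence_ms=800)
top = n_best([("a", 4100), ("b", 6200), ("c", 5000)], maxnbest=2)
```

In this example, the 100 ms pause keeps the first two segments together, while the 1100 ms pause starts a new utterance. With `maxnbest=2`, only the two highest-scoring hypotheses are kept.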
