
Voice Error Correction (VEC)

This page describes how to use Vivoka's VEC (Voice Error Correction) module.

Automatic Speech Recognition (ASR) engines often struggle with alphanumeric input such as serial codes. Even modern models confuse similar-sounding letters and digits ("b" vs "d", "m" vs "n", "o" vs "0" in English for example). These errors can frustrate users and make ASR unusable for tasks where a single wrong character invalidates the entire sequence.

The VEC (Voice Error Correction) module addresses this problem by acting as a post-processor on top of the ASR system. It analyzes the ASR output, applies targeted corrections, and delivers more reliable results without requiring changes to application code. Importantly, VEC is structure-preserving: the JSON schema of the ASR result remains identical, only values are adjusted.

VEC currently supports two operating modes:

  • Alphanumeric sequences (available today and documented on this page)

  • Free-speech (planned for future release)

Because the module is language-dependent, it must be paired with the correct acoustic model and lexicon on the ASR side to function properly.

System Requirements

To function correctly, VEC depends on the following conditions:

  • Language resources
    VEC is language-dependent. It must run with the matching acoustic model and lexicon used by the ASR engine. Using mismatched resources will reduce or eliminate its effectiveness.

  • ASR output requirements
    VEC operates on the ASR result and relies on N-best hypotheses being available (see the Configuration changes section).

  • Runtime compatibility
    VEC runs in the same pipeline as the ASR engine and has no additional platform dependencies beyond those already required for the ASR.

  • Performance

    • Latency: typically under 10 ms per recognition result.

    • Memory: stable even with large context lists (up to several hundred thousand entries).

    • Throughput: scales linearly with ASR output; no bottleneck in real-time use.

How to enable it?

To activate VEC in your project, three things must be done:

  1. Place the VEC add-on library in the same directory as the other VSDK libraries.

  2. Update the grammar, so the system knows which parts can be corrected by VEC.

  3. Update the configuration, so the runtime knows which post-processor, models and context list (if any) to use.

Install the add-on library

VEC is delivered as a shared library. It must be placed next to the other ASR libraries (e.g. in the same lib/ or bin/ directory as the existing recognizer components).

  • If the library is missing or misplaced, the recognizer will fail to start.

  • No additional environment variables are required if the library is in the standard location.

Grammar changes

VEC does not apply globally — you must mark the grammar regions that can be corrected by VEC. This prevents commands such as help or yes from being altered.

To do this, wrap the alphanumeric span in !tag(vec, …) and limit its length with !repeat (1–7 characters, the range supported by VEC; see Limitations). Note that this range can be narrower but not wider. If you need a larger range, feel free to contact us.

Example before (without VEC):

CODE
<main>: <commands> | ["ok"] !repeat(<alphanum>, 1, 7);

Example after (VEC-enabled):

CODE
<main>: <commands> | ["ok"] !tag(vec, !repeat(<alphanum>, 1, 7));

If your grammar contains more than one alphanumeric input, each must be tagged with a different name. The only requirement is that the tag name starts with vec; beyond that it can be anything, such as vec-1, vec-2 or vec-postprocess. Tag names just have to be unique.
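For example, a grammar with two alphanumeric inputs could tag them as follows (a sketch; the "serial" and "pin" command words and the second span's length are hypothetical):

CODE
<main>: <commands>
      | ["serial"] !tag(vec-serial, !repeat(<alphanum>, 1, 7))
      | ["pin"] !tag(vec-pin, !repeat(<alphanum>, 1, 4));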

On top of this required change, it is highly recommended to remove any custom pronunciations you may have added to alphanumeric sequences in your grammar via the !pronounce directive. The VEC model was trained with the default pronunciations, and adding custom ones will degrade its performance.

Configuration changes

On the configuration side, add a post_processing section to the recognizer config. This tells the runtime to activate VEC and load its model. It is also possible to specify an initial context and accent if needed.

In addition to this new section in the configuration, you must also update the model settings used by the recognizer: set the parameter LH_SEARCH_PARAM_MAXNBEST to at least 10.

This controls how many alternative hypotheses the ASR engine produces for each recognition.

VEC uses these multiple hypotheses to improve correction accuracy — a value of 10 has shown the best performance in internal benchmarks, but you may adjust it as needed.

JSON
{
  "version": "2.0",
  "csdk": {
    "paths": {
      "data_root": "../data"
    },
    "asr": {
      "recognizers": {
        "rec": {
          "acmods": ["am_enu_vocon_car_202312090302.dat"],
          "post_processing": {
            "type": "vec",
            "model": "VEC-250901-eng-US.vec",
            "context": "len6_100k.txt",
            "accent": "eni"
          }
        }
      },
      "models": {
        "grm": {
          "type": "static",
          "file": "alphanum.fcf",
          "acmod": "am_enu_vocon_car_202312090302.dat",
          "lexicon": {
            "clc": "clc_enu_cfg3_v14_8_000000.dat"
          },
          "settings": {
            "LH_SEARCH_PARAM_MAXNBEST": 10
          }
        }
      }
    }
  }
}
  • type — must be "vec" to activate VEC.

  • model — (Optional) path to the VEC correction model (.vec). Relative paths are resolved against the configuration file's directory. If not provided, VEC works only with the provided context.

  • context — (Optional) path to a context list file containing valid sequences. Large files (hundreds of thousands of entries) are supported with minimal overhead. More on this in the Context list section.

  • accent — (Optional) default accent used by the model. It defaults to unknown and can be changed at any point at runtime using the invoke interface.

Context list

One of VEC’s most effective features is its ability to use a context list: a predefined set of valid alphanumeric sequences.

Why Context Helps

ASR errors are often phonetically plausible but semantically invalid. By checking results against a context list, VEC can prefer corrections that yield sequences known to be valid in your application (e.g. product codes, customer IDs).

Format

  • Flat list of strings

  • Each string is a space-separated alphanumeric sequence (lowercase letters and digits only)

CODE
6 i 
j y 
v 7 
0 c 0 
s h z 
8 1 8
5 j g
e c 8 8
7 5 k p 3 a
1 l k 0 z w
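As a sketch of producing this format, the following helper (hypothetical, not part of the VSDK) converts a raw code such as "7G9" into the expected space-separated lowercase form:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of the VSDK): converts a raw code such as
// "7G9" into the space-separated lowercase form expected in a context list
// ("7 g 9"). Non-alphanumeric characters (e.g. '-') are dropped.
std::string toContextEntry(const std::string& code) {
    std::string out;
    for (char c : code) {
        if (!std::isalnum(static_cast<unsigned char>(c)))
            continue;                        // skip separators such as '-'
        if (!out.empty())
            out += ' ';                      // one space between characters
        out += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return out;
}
```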

Size & Performance

  • Supports hundreds of thousands of entries without significant memory growth or latency impact

  • Lookup is optimized for large sets

  • Works equally well with small lists (dozens of entries)

Behavior

  • If the ASR output matches a sequence in the list → passed through unchanged

  • If the ASR output is close to one or more entries → VEC corrects towards the best candidate

  • If a non-alphanumeric value is inserted into the context, an exception is thrown.
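The acceptance conditions above can be checked up front with a small validator sketch (a hypothetical helper, not part of the SDK), which mirrors the format rules: 1–7 space-separated tokens, each a single lowercase letter or digit:

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical pre-check (not part of the VSDK): returns true when an entry
// matches the context-list format (1-7 space-separated lowercase
// alphanumeric characters), i.e. when VEC would accept it without throwing.
bool isValidContextEntry(const std::string& entry) {
    std::istringstream tokens(entry);
    std::string tok;
    int count = 0;
    while (tokens >> tok) {
        if (tok.size() != 1)
            return false;                    // exactly one character per token
        unsigned char c = static_cast<unsigned char>(tok[0]);
        if (!std::isdigit(c) && !std::islower(c))
            return false;                    // lowercase letters and digits only
        ++count;
    }
    return count >= 1 && count <= 7;         // length range supported by VEC
}
```

Running such a check before set-context or add-context avoids runtime exceptions from malformed entries.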

Changing Context / Accent at Runtime

While VEC can load a static context list from configuration, applications may also modify the context or accent at runtime using the recognizer’s post-processor invoke interface.

This makes it possible to adapt dynamically to user input, environment changes, or session-specific requirements without restarting the recognizer.

API Usage

The invoke method accepts two arguments:

C++
CPP
auto rec = engine->recognizer("rec");
auto pp  = rec->postProcessor();

pp->invoke(command, params);
Android
JAVA
Recognizer rec = com.vivoka.vsdk.asr.csdk.Engine.getInstance().getRecognizer("rec", recognizerListener);
IEntryPoint pp = rec.postProcessor();

pp.invoke(command, params);
  • command: the operation to perform (string). The full list of possible actions for this add-on is given in the next section. If an unknown command is given, an exception is thrown.

  • params: a JSON object whose structure depends on the command.

Supported Commands

1. set-context

Replaces the current runtime context with the provided list of sequences.

Parameters: JSON array of strings.
Return: none.

Example:

C++
CPP
pp->invoke("set-context", {{ "context", {"a 1 3", "b 1 4"} }});
Android
JAVA
JSONObject params = new JSONObject();
params.put("context", com.vivoka.vsdk.util.JsonUtils.makeJsonArray("a 1 3", "b 1 4"));
pp.invoke("set-context", params);

2. add-context

Adds new entries to the current runtime context without removing existing ones.

Parameters: JSON array of strings.
Return: none.

Example:

C++
CPP
pp->invoke("add-context", {{ "context", {"c 2 7", "x 9 9"} }});
Android
JAVA
JSONObject params = new JSONObject();
params.put("context", com.vivoka.vsdk.util.JsonUtils.makeJsonArray("c 2 7", "x 9 9"));
pp.invoke("add-context", params);

3. remove-context

Removes the specified entries from the current runtime context.

Parameters: JSON array of strings.
Return: none.

Example:

C++
CPP
pp->invoke("remove-context", {{ "context", {"a 1 3"} }});
Android
JAVA
JSONObject params = new JSONObject();
params.put("context", com.vivoka.vsdk.util.JsonUtils.makeJsonArray("a 1 3"));
pp.invoke("remove-context", params);

4. clear-context

Clears the current runtime context completely.

Parameters: unused.
Return: none.

Example:

C++
CPP
pp->invoke("clear-context", {});
Android
JAVA
pp.invoke("clear-context", new JSONObject());

5. load-context

Loads a context list from a file and replaces the current runtime context.

Parameters: JSON string with the path to the file, relative to the location of the configuration file.
Return: none.

Example:

C++
CPP
pp->invoke("load-context", {{ "file", "context.txt" }});
Android
JAVA
JSONObject params = new JSONObject();
params.put("file", "context.txt");
pp.invoke("load-context", params);

6. set-accent

Sets the user’s accent at runtime.

Parameters: JSON string with the accent name. Use an empty string if unknown.
Return: none.

Example:

C++
CPP
pp->invoke("set-accent", {{ "accent", "eng" }});
pp->invoke("set-accent", {{ "accent", "" }});  // reset to unknown
Android
JAVA
JSONObject params = new JSONObject();
params.put("accent", "eng");
pp.invoke("set-accent", params);

params.put("accent", ""); // reset to unknown
pp.invoke("set-accent", params);

7. get-accent

Retrieves the currently configured accent.

Parameters: unused.
Return: JSON string with the accent name, or empty string if unknown.

Example:

C++
CPP
auto const accent = pp->invoke("get-accent", {})["accent"].get<std::string>();
Android
JAVA
JSONObject result = pp.invoke("get-accent", new JSONObject());
String accent = result.getString("accent");

8. list-accent

Retrieves available accents for this model.

Parameters: unused.
Return: JSON array of strings.

Example:

C++
CPP
auto const availableAccents = pp->invoke("list-accent", {})["accents"];
Android
JAVA
JSONObject result = pp.invoke("list-accent", new JSONObject());
JSONArray accents = result.getJSONArray("accents");

9. version

Retrieves the version of the VEC module.

Parameters: unused.
Return: JSON string with the version in the form of X.Y.Z.

Example:

C++
CPP
auto const version = pp->invoke("version", {})["version"].get<std::string>();
Android
JAVA
JSONObject result = pp.invoke("version", new JSONObject());
String version = result.getString("version");

Behavior Notes

  • All operations are lightweight; context updates and accent changes take effect immediately.

  • Changes remain until explicitly overridden or until the recognizer is destroyed.

Limitations

VEC is powerful but has clear boundaries:

  • Sequence length: Only supports 1–7 alphanumeric characters. Longer sequences are not handled. If you need to go beyond this limit, feel free to contact us.

  • Letter case: you must use lowercase letters, never uppercase ones. We enforce this because the underlying engine may only understand a specific pronunciation when uppercase letters are used. For example, in English you may have to say 'Capital A' instead of just 'A', and in French 'A majuscule'. To avoid any misconfiguration, VEC only accepts lowercase letters.

  • Mode availability: Currently limited to alphanumeric mode. Free-speech mode is planned but not yet available.

  • Language dependence: Requires the correct acoustic model and lexicon. Using mismatched resources reduces or eliminates accuracy gains.

  • Error types: Optimized for typical letter/digit confusions. It does not correct arbitrary word errors.
