Configuration file

The table below describes the VSDK configuration file located at config/vsdk.json:

Field

Description/Notes

Optional

Default Value

Type

Possible Values

version

Version of the whole document


String

2.0

csdk



Object


csdk/paths



Object


csdk/paths/cache

Absolute or relative to vsdk.json

cache

Path


csdk/paths/data_root

Absolute or relative to vsdk.json

.

Path


csdk/paths/acmod

Absolute or relative to data_root

acmod

Path


csdk/paths/asr

Absolute or relative to data_root

asr

Path


csdk/paths/clc

Absolute or relative to data_root

clc

Path


csdk/paths/clc_ruleset

Absolute or relative to data_root

clc

Path


csdk/paths/dictionary

Absolute or relative to data_root

dictionaries

Path


csdk/paths/search

Absolute or relative to data_root

ctx

Path


csdk/paths/sem3

Absolute or relative to data_root

ctx

Path


csdk/paths/users

Absolute or relative to data_root

users

Path


csdk/paths/audio_based_classifier_model

Absolute or relative to data_root

abc

Path


csdk/paths/confusion_dictionary

Absolute or relative to data_root

dictionaries

Path


csdk/paths/language_model

Absolute or relative to data_root

lm

Path


csdk/tts



Object


csdk/tts/channels



Object


csdk/tts/channels/<channel_name_1>

Name of the channel, used in code


String


csdk/tts/channels/<channel_name_1>/voices



Array


csdk/tts/channels/<channel_name_1>/voices/0



String

<speaker>,<lang>,<quality>

csdk/asr



Object


csdk/asr/recognizers



Object


csdk/asr/recognizers/<recognizer_name_1>

Name of the recognizer, used in code


String


csdk/asr/recognizers/<recognizer_name_1>/acmods

Recognizers accept multiple acoustic models


Array


csdk/asr/recognizers/<recognizer_name_1>/acmods/0



String


csdk/asr/recognizers/<recognizer_name_1>/conformer

Name of the conformer (Neural ASR only)


String


csdk/asr/recognizers/<recognizer_name_1>/settings

Recognizer settings


Array


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_TRAILINGSILENCE

[Conformer only] Minimum amount of trailing silence, in milliseconds, before the end of an utterance is detected.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_MAXLEADINGSILENCE

[Conformer only] Maximum amount of leading silence, in milliseconds, to process at the beginning of an utterance.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_INTERMEDIATERESULTSINTERVAL

[Conformer only] The interval, in milliseconds, at which intermediate results are checked and possibly sent.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_POSTFLUENTINTERMEDIATERESULTDURATION

[Conformer only] Specifies the duration of a period following LH_END2ENDSERVICESSTREAMING_PARAM_FLUENTINTERMEDIATERESULTDURATION


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_POSTFLUENTINTERMEDIATERESULTINTERVAL

[Conformer only] The interval, in milliseconds, at which intermediate results are checked and possibly sent.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_MAXNBEST

[Conformer only] The maximum number of hypotheses that should be returned in a result


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_EAGERRESULTMINSILENCEMS

[Conformer only] Minimum amount of trailing silence, in milliseconds, before an early result is triggered.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_EAGERRESULTADDITIONALAUDIOMS

[Conformer only] Amount of audio signal, in milliseconds, added after the eager result trigger.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_SHOWCANDIDATERESULTS

[Conformer only] If set to LH_TRUE, candidate results will be generated.


String


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_STRONGEAGERRESULTMINSILENCEMS

[Conformer only] Minimum amount of trailing silence, in milliseconds, before a strong early result is triggered.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_SPEECH_TIMEOUT

[Conformer only] Speech duration timeout in milliseconds.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_AUTOMATICRESTART

[Conformer only] Defines whether recognition resumes automatically after a final result is produced.


String


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_FLUENTINTERMEDIATERESULTDURATION

[Conformer only] Specifies the duration about intermediate results.


Int


csdk/asr/recognizers/<recognizer_name_1>/settings/LH_END2ENDSERVICESSTREAMING_PARAM_EAGERRESULTEXTRASILENCEMS

[Conformer only]


Int


csdk/asr/models



Object


csdk/asr/models/<model_name_1>

Name of the model, used in code


String


csdk/asr/models/<model_name_1>/type



String

static, dynamic, free-speech, neural

csdk/asr/models/<model_name_1>/conformer

Name of the corresponding conformer (Neural model only)


String


csdk/asr/models/<model_name_1>/file

Compiled model file name, extension is .fcf


File


csdk/asr/models/<model_name_1>/sem3

Compiled semantic model file name, extension is .s3c

""

String


csdk/asr/models/<model_name_1>/settings



Object


csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_STREAM_RESULT_MODE

The mode in which intermediate results are displayed during recognition. 1 means partial results are enabled

0

Int

0, 1

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_ACCURACY

Trade-off between CPU load, memory requirements, and search accuracy

10000

Int

[100 ; 50000]

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_MAXNBEST

Maximum number of hypotheses returned in a result

3

Int

[1 ; 1000]

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_TSILENCE

Minimum amount of trailing silence, in milliseconds. Use a higher value for non-WUW models

100

Int

[100 ; 10000]

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_IG_LOWCONF

Maximum confidence level that indicates that a spoken utterance is out of grammar

5000

Int

[0 ; 10000]

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_IG_HIGHCONF

Minimum confidence level that indicates that a spoken utterance is in grammar

5000

Int

[0 ; 10000]

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_INITBEAMWIDTH

Initial beam width. This parameter affects low-level behavior of the algorithm

2500

Int

[0 ; 10000]

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_TANYSPEECH

Allows the recognizer to stop the recognition process during the trailing AnySpeech state

LH_FALSE

String

LH_TRUE, LH_FALSE

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_NBESTRESULT_SETHIDDENKEYS

When enabled, additional information that can be used for the FM use case is included in the ASR result

LH_FALSE

String

LH_TRUE, LH_FALSE

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_ONDEMANDLOADING

Context on-demand loading

LH_FALSE

String

LH_TRUE, LH_FALSE

csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_SPEECH_TIMEOUT

Speech duration timeout in milliseconds

0

Int

0, [100 ; 60000]

csdk/asr/models/<model_name_1>/acmod

∗Only for dynamic models. Must match the one configured on the recognizer that will apply this model

✘∗


String


csdk/asr/models/<model_name_1>/slots

∗Only for dynamic models.

✘∗


Object


csdk/asr/models/<model_name_1>/slots/<slot_name_1>

Name of the slot, used in the code


Object


csdk/asr/models/<model_name_1>/slots/<slot_name_1>/values

Values of the given slot


Array


csdk/asr/models/<model_name_1>/slots/<slot_name_1>/category


normal

String

normal, name, artist

csdk/asr/models/<model_name_1>/slots/<slot_name_1>/allow_custom_phonetic

Setting to true will allow for custom phonetics to be provided for this slot

false

Bool


csdk/asr/models/<model_name_1>/lexicon

∗Only for dynamic models.

✘∗


String


csdk/asr/models/<model_name_1>/lexicon/clc

Used during runtime compilation. Use a language that matches the rest of the grammar and the recognizer this model will be applied to


File


csdk/asr/models/<model_name_1>/lexicon/settings



Object


csdk/asr/models/<model_name_1>/extra_models

∗Only for free-speech models. All models for a given language must be listed or the program won't function properly

✘∗


Object


csdk/asr/models/<model_name_1>/extra_models/<name>



File


vnlu



Object


vnlu/paths

Object that contains the paths required by the engine to load specific resources


Object


vnlu/paths/data_root

Base location where the engine will load resources from. If relative, it will be based on the configuration file base path

.

String


vnlu/paths/parser

Directory where NLU models (.vum) will be loaded from. If relative, it will be based on the data_root path

parsers

String


vnlu/paths/log

Directory where logs will be placed. If relative, it will be based on the data_root path

log

String


vnlu/log

Object that contains all the logging configuration of VASR


Object


vnlu/log/<name>

Configuration for the logger <name>. Possible logger names are [vasr (the default logger), perf (the performance logger), recognizer:<name> (the recognizer named name), model:<name> (the model named name)]. Wildcards can also be used in specific places to configure multiple loggers at once. The allowed wildcard values are: [*, recognizer:*, model:*]

vasr

Object

[vasr, perf, recognizer:<name>, model:<name>]

vnlu/log/<name>/level

Sets the minimum log level for this logger. Messages at the specified level and above are printed; for example, setting the level to warning still prints error and critical messages. warn and warning are equivalent, as are err and error

info

String

[trace, debug, info, warning, warn, error, err, critical, off]

vnlu/log/<name>/sink

Set the destination of log messages for this logger

stdout-color

String

[stdout, stdout-color, stderr, stderr-color, file]

vnlu/log/<name>/sink_options

Contains all options specific to the type of sink


Object


vnlu/log/<name>/sink_options/path

(For file sink type only) Set the output file path. If relative, it will be based on the paths/log path


String


vnlu/log/<name>/sink_options/max_size

(For file sink type only) Specify the maximum file size, in kilobytes, before the file is rotated


Int

[1024 - inf]

vnlu/log/<name>/sink_options/max_file

(For file sink type only) Specify the maximum number of log files that will be kept


Int

[1 - inf]

vnlu/log/<name>/sink_options/truncate

(For file sink type only) If set to true, the file is truncated or rotated (depending on whether the max_size / max_file options are specified) when the engine is created

false

Bool


vnlu/nlu

Object that contains all the configuration of VNLU


Object


vnlu/nlu/parsers

List all parsers that will be created at run time here


Object


vnlu/nlu/parsers/<name>

Specify the definition of a parser. The name can be chosen freely and is used in the code to create the Parser object


Object


vnlu/nlu/parsers/<name>/model

Path to the NLU model file (.vum) that will be loaded by the Parser. If relative, it will be based on the paths/parser path


String


tssv



Object


tssv/biometrics



Object


tssv/biometrics/generated_models_path

Absolute or relative to the program's working directory


Path


tssv/biometrics/background_model_TD

Absolute or relative to the program's working directory


File


tssv/biometrics/background_model_TI

Absolute or relative to the program's working directory


File


idvoice



Object


idvoice/biometrics



Object


idvoice/biometrics/generated_models_path

Absolute or relative to the program's working directory


Path


idvoice/biometrics/background_model_TD

Absolute or relative to the program's working directory


File


idvoice/biometrics/background_model_TI

Absolute or relative to the program's working directory


File


s2c



Object


s2c/speech_enhancement



Object


s2c/speech_enhancement/speech_enhancers



Object


s2c/speech_enhancement/speech_enhancers/<name>

Enhancer name. The name can be chosen freely and is used in the code to create the Enhancer object


Object


s2c/speech_enhancement/speech_enhancers/<name>/configuration

A JSON object that holds the Speech Enhancement configuration generated by vdk-studio


Object
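
The path rules stated in the csdk/paths rows of the table (cache and data_root are relative to vsdk.json, every other entry is relative to data_root, and an absolute value is used as-is) can be illustrated with a short Python sketch. This is not part of VSDK: the helper name resolve_csdk_paths and the sample values are hypothetical, only a subset of the keys is listed, and the sketch assumes that an omitted key falls back to the default shown in the table and that "relative to vsdk.json" means relative to the directory containing that file.

```python
from pathlib import PurePosixPath

# Defaults copied from the csdk/paths rows of the table (a subset, for brevity).
DEFAULTS = {
    "cache": "cache",
    "data_root": ".",
    "acmod": "acmod",
    "asr": "asr",
    "clc": "clc",
    "dictionary": "dictionaries",
    "language_model": "lm",
}

# Per the table, these two keys are relative to vsdk.json;
# every other csdk/paths key is relative to data_root.
RELATIVE_TO_CONFIG = {"cache", "data_root"}


def resolve_csdk_paths(config_file, paths):
    """Hypothetical helper: map each csdk/paths key to its resolved path.

    Assumption: "relative to vsdk.json" means relative to the directory
    that contains the configuration file.
    """
    config_dir = PurePosixPath(config_file).parent
    merged = {**DEFAULTS, **paths}  # explicit values override the defaults
    # The "/" operator returns the right-hand operand unchanged when it is
    # absolute, which matches the "absolute or relative" wording of the table.
    data_root = config_dir / merged["data_root"]
    resolved = {}
    for key, value in merged.items():
        base = config_dir if key in RELATIVE_TO_CONFIG else data_root
        resolved[key] = base / value
    return resolved


if __name__ == "__main__":
    # Sample values are made up for illustration.
    sample = {"data_root": "../data", "asr": "/opt/asr"}
    for key, path in resolve_csdk_paths("config/vsdk.json", sample).items():
        print(f"{key}: {path}")
```

The sketch only joins path components; it does not touch the file system or normalize ".." segments, so how the engine itself canonicalizes paths is left open.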