Configuration file

The table below describes the VSDK configuration file located at config/vsdk.json:

Field	Description/Notes	Optional	Default Value	Type	Possible Values
version	Version of the whole document	✘		String	2.0
csdk		✘		Object
csdk/paths		✔		Object
csdk/paths/cache	Absolute or relative to vsdk.json	✔	cache	Path
csdk/paths/data_root	Absolute or relative to vsdk.json	✔	.	Path
csdk/paths/acmod	Absolute or relative to data_root	✔	acmod	Path
csdk/paths/asr	Absolute or relative to data_root	✔	asr	Path
csdk/paths/clc	Absolute or relative to data_root	✔	clc	Path
csdk/paths/clc_ruleset	Absolute or relative to data_root	✔	clc	Path
csdk/paths/dictionary	Absolute or relative to data_root	✔	dictionaries	Path
csdk/paths/search	Absolute or relative to data_root	✔	ctx	Path
csdk/paths/sem3	Absolute or relative to data_root	✔	ctx	Path
csdk/paths/users	Absolute or relative to data_root	✔	users	Path
csdk/paths/audio_based_classifier_model	Absolute or relative to data_root	✔	abc	Path
csdk/paths/confusion_dictionary	Absolute or relative to data_root	✔	dictionaries	Path
csdk/paths/language_model	Absolute or relative to data_root	✔	lm	Path
csdk/tts		✘		Object
csdk/tts/channels		✘		Object
csdk/tts/channels/<channel_name_1>	Name of the channel, used in code	✘		String
csdk/tts/channels/<channel_name_1>/voices		✘		Array
csdk/tts/channels/<channel_name_1>/voices/0		✘		String	<speaker>,<lang>,<quality>
csdk/asr		✘		Object
csdk/asr/recognizers		✘		Object
csdk/asr/recognizers/<recognizer_name_1>	Name of the recognizer, used in code	✘		String
csdk/asr/recognizers/<recognizer_name_1>/acmods	Recognizers accept multiple acoustic models	✘		Array
csdk/asr/recognizers/<recognizer_name_1>/acmods/0		✘		String
csdk/asr/models		✘		Object
csdk/asr/models/<model_name_1>	Name of the model, used in code	✘		String
csdk/asr/models/<model_name_1>/type		✘		String	static, dynamic, free-speech
csdk/asr/models/<model_name_1>/file	Compiled model file name, extension is .fcf	✘		File
csdk/asr/models/<model_name_1>/sem3	Compiled semantic model file name, extension is .s3c	✔	""	String
csdk/asr/models/<model_name_1>/settings		✔		Object
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_STREAM_RESULT_MODE	The mode in which intermediate results are displayed during recognition. 1 activates partial results	✔	0	Int	0, 1
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_ACCURACY	Trade-off between CPU load, memory requirements, and the obtained accuracy of the search	✔	10000	Int	[100 ; 50000]
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_MAXNBEST	Maximum number of hypotheses returned in a result	✔	3	Int	[1 ; 1000]
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_TSILENCE	Minimum amount of trailing silence, in milliseconds. Use a higher value for non-WUW models	✔	100	Int	[100 ; 10000]
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_IG_LOWCONF	Maximum confidence level that indicates a spoken utterance is out of grammar	✔	5000	Int	[0 ; 10000]
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_IG_HIGHCONF	Minimum confidence level that indicates a spoken utterance is in grammar	✔	5000	Int	[0 ; 10000]
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_INITBEAMWIDTH	Init beam width. This parameter affects low-level behavior of the algorithm	✔	2500	Int	[0 ; 10000]
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_TANYSPEECH	Allows the recognizer to stop the recognition process during the trailing AnySpeech state	✔	LH_FALSE	String	LH_TRUE, LH_FALSE
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_NBESTRESULT_SETHIDDENKEYS	When enabled, additional information is included in the ASR result that can be used for the FM use case	✔	LH_FALSE	String	LH_TRUE, LH_FALSE
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_ONDEMANDLOADING	Context on-demand loading	✔	LH_FALSE	String	LH_TRUE, LH_FALSE
csdk/asr/models/<model_name_1>/settings/LH_SEARCH_PARAM_SPEECH_TIMEOUT	Speech duration timeout in milliseconds	✔	0	Int	0, [100 ; 60000]
csdk/asr/models/<model_name_1>/acmod	∗Only for dynamic models. Must match the one configured on the recognizer that will apply this model	✘∗		String
csdk/asr/models/<model_name_1>/slots	∗Only for dynamic models.	✘∗		Object
csdk/asr/models/<model_name_1>/slots/<slot_name_1>	Name of the slot, used in the code	✘		Object
csdk/asr/models/<model_name_1>/slots/<slot_name_1>/values	Values of the given slot	✘		Array
csdk/asr/models/<model_name_1>/slots/<slot_name_1>/category		✔	normal	String	normal, name, artist
csdk/asr/models/<model_name_1>/slots/<slot_name_1>/allow_custom_phonetic	Setting to true will allow for custom phonetics to be provided for this slot	✔	false	Bool
csdk/asr/models/<model_name_1>/lexicon	∗Only for dynamic models.	✘∗		String
csdk/asr/models/<model_name_1>/lexicon/clc	Used during runtime compilation. Use a language that matches the rest of the grammar and the recognizer this model will be applied to	✘		File
csdk/asr/models/<model_name_1>/lexicon/settings		✔		Object
csdk/asr/models/<model_name_1>/extra_models	∗Only for free-speech models. All models for a given language must be listed or the program won't function properly	✘∗		Object
csdk/asr/models/<model_name_1>/extra_models/<name>		✘		File
vasr		✘		Object
vasr/paths		✔		Object
vasr/paths/data_root	Absolute or relative to vsdk.json	✔	.	Path
vasr/paths/acmod	Absolute or relative to data_root	✔	acmod	Path
vasr/paths/graph	Absolute or relative to data_root	✔	graph	Path
vasr/paths/log	Directory where logs will be placed. If relative, it will be based on the data_root path.	✔	log	String
vasr/log		✔		Object
vasr/log/<logger_name>		✔		Object	*, perf
vasr/log/<logger_name>/level	Level of the debugging information printed	✔		String	info, debug
vasr/log/<name>/sink	Set the destination of log messages for this logger. (Ignored on Android)	✔	stdout-color	String	stdout, stdout-color, stderr, stderr-color, file
vasr/log/<name>/sink_options	Contains all options specific to the type of sink. (Ignored on Android)	✔	N/A	Object
vasr/log/<name>/sink_options/path	(For file sink type only) Set the output file path. If relative, it will be based on the paths/log path.	✔		String
vasr/log/<name>/sink_options/max_size	(For file sink type only) Specify the maximum file size (in kbytes) before rotating the file.	✔		Int	[1'024 - inf]
vasr/log/<name>/sink_options/max_file	(For file sink type only) Specify the maximum number of log files that will be kept.	✔		Int	[1 - inf]
vasr/log/<name>/sink_options/truncate	(For file sink type only) If set to true, the file will be truncated or rotated (depending on whether the max_size / max_file options have been specified) when the engine is created.	✔	false	Boolean
vasr/asr	Object that contains all the configuration of VASR	✘		Object
vasr/asr/recognizers	List here all recognizers that will be created at run time	✘		Object
vasr/asr/recognizers/<recognizer_name>	Name of the recognizer, used in the code	✘		Object
vasr/asr/recognizers/<recognizer_name>/acmods	Array containing all acmods used	✘		Array
vasr/asr/recognizers/<recognizer_name>/acmods/0	Acmod file name	✘		String
vasr/asr/recognizers/<recognizer_name>/settings	Object containing optional settings	✔		Object
vasr/asr/recognizers/<recognizer_name>/settings/min_speech_duration	Amount of time (in milliseconds) that speech needs to be detected before triggering a recognition. Note that after a successful trigger, the audio covering this duration is also passed on to the recognizer. Must be a multiple of 100	✔	500	Int	[100 - 10'000]
vasr/asr/recognizers/<recognizer_name>/settings/min_silence_duration	Minimum amount of trailing silence, in milliseconds. Use a higher value for non-WUW models. Must be a multiple of 100	✔	700	Int	[100 - 10'000]
vasr/asr/recognizers/<recognizer_name>/settings/padding_size	Amount of perfect silence (in milliseconds) that will be artificially added to the audio at its beginning and its end. This improves performance at almost no cost. Rarely needs to be modified.	✔		Int	[0 - 1'000]
vasr/asr/recognizers/<recognizer_name>/settings/speech_probability_threshold	Set the sensitivity of the VAD. A lower value makes it more prone to false positives, a higher value more prone to false negatives	✔	0.5	Float	]0.0 - 1.0[
vasr/asr/recognizers/<recognizer_name>/settings/left_padding_size	Amount of perfect silence (in milliseconds) that will be artificially added to the audio at its beginning. This improves performance at almost no cost. Rarely needs to be modified. Note that this setting is only for free-speech mode. This setting takes priority over the `padding_size` one. Since VASR version 2.0.0 (VSDK-VASR version 4.0.0)	✔		Int	[0 - 1'000]
vasr/asr/recognizers/<recognizer_name>/settings/right_padding_size	Amount of perfect silence (in milliseconds) that will be artificially added to the audio at its end. This improves performance at almost no cost. Rarely needs to be modified. Note that this setting is only for free-speech mode. This setting takes priority over the `padding_size` one. Since VASR version 2.0.0 (VSDK-VASR version 4.0.0)	✔		Int	[0 - 1'000]
vasr/asr/recognizers/<recognizer_name>/settings/audio_cache_size	Specify the maximum amount of audio (in milliseconds) that a recognizer will keep in its internal buffer. The more audio it holds, the higher the memory consumption. The audio buffering is used for gapless applications. Must be a multiple of 100. Since VASR version 2.1.0 (VSDK-VASR version 4.2.0)	✔	10'000	Int	[0 - inf]
vasr/asr/recognizers/<recognizer_name>/settings/invalid_start_time_strategy	Defines how the engine reacts when an invalid start time (e.g. greater than the amount of audio present in the internal buffer) is passed as the 2nd parameter of the `Recognizer::setModels` function. With `warn_and_clamp`, a warning is output and the actual startTime used is the best one possible given the circumstances. With `error_and_unset`, an error message is output and the model is not set in this recognizer. Since VASR version 2.2.0 (VSDK-VASR version 5.2.0)	✔	warn_and_clamp	String	[warn_and_clamp, error_and_unset]
vasr/asr/recognizers/<name>/settings/intermediate_result_frequency	Amount of audio (in milliseconds) that needs to be sent to the engine before it returns an intermediate result. A value of 0 completely disables intermediate results. Note that this value is only an approximation.	✔	750	Int	0 or [450 - 10'000]
vasr/asr/models		✘		Object
vasr/asr/models/<model_name_1>	Name of the model, used in the code	✘		String
vasr/asr/models/<model_name_1>/type		✘		String	static, dynamic, free-speech
vasr/asr/models/<model_name_1>/file	Absolute or relative to paths/models	✘		Path
vasr/asr/models/<model_name_1>/recognizer	∗Only for dynamic models. Absolute or relative to recognizer	✘∗		File
vasr/asr/models/<model_name_1>/slots	∗Only for dynamic models.	✘∗		Array
vasr/asr/models/<model_name_1>/slots/<slot_name_1>	Name of the slot, used in the code	✘		String
vasr/asr/models/<name>/settings/max_hypothesis	Set the number of hypotheses that will be returned by the engine for this model.	✔	3	Int	[1 - 10]
vnlu		✘		Object
vnlu/paths	Object that contains the different paths required by the engine to load specific resources	✔		Object
vnlu/paths/data_root	Base location where the engine will load resources from. If relative, it will be based on the configuration file base path	✔	`.`	String
vnlu/paths/parser	Directory where NLU models (.vum) will be loaded from. If relative, it will be based on the data_root path	✔	parsers	String
vnlu/paths/log	Directory where logs will be placed. If relative, it will be based on the data_root path	✔	log	String
vnlu/log	Object that contains all the logging configuration of VASR	✔		Object
vnlu/log/<name>	Configuration for the logger <name>. Possible logger names are [vasr (the default logger), perf (the performance logger), recognizer:<name> (the recognizer named name), model:<name> (the model named name)]. Wildcards can also be used at specific places to configure multiple loggers at once. The allowed values are: [*, recognizer:*, model:*]	✔	vasr	Object	[vasr, perf, recognizer:<name>, model:<name>]
vnlu/log/<name>/level	Set the minimal log level for this logger. Messages at the specified level and above are printed, e.g. setting the level to warning still prints error and critical messages. warn and warning are equivalent, as are err and error	✔	info	String	[trace, debug, info, warning, warn, error, err, critical, off]
vnlu/log/<name>/sink	Set the destination of log messages for this logger	✔	stdout-color	String	[stdout, stdout-color, stderr, stderr-color, file]
vnlu/log/<name>/sink_options	Contains all options specific to the type of sink	✔		Object
vnlu/log/<name>/sink_options/path	(For file sink type only) Set the output file path. If relative, it will be based on the paths/log path	✔		String
vnlu/log/<name>/sink_options/max_size	(For file sink type only) Specify the maximum file size (in kbytes) before rotating the file	✔		Int	[1024 - inf]
vnlu/log/<name>/sink_options/max_file	(For file sink type only) Specify the maximum number of log files that will be kept	✔		Int	[1 - inf]
vnlu/log/<name>/sink_options/truncate	(For file sink type only) If set to true, the file will be truncated or rotated (depending on whether the max_size / max_file options have been specified) when the engine is created	✔	false	Boolean
vnlu/nlu	Object that contains all the configuration of VNLU	✘		Object
vnlu/nlu/parsers	List here all parsers that will be created at run time	✘		Object
vnlu/nlu/parsers/<name>	Specify the definition of a parser. The name is completely free and will be used in the code to create the Parser object	✘		Object
vnlu/nlu/parsers/<name>/model	Path to the NLU model file (.vum) that will be loaded by the Parser. If relative, it will be based on the paths/parser path	✘		String
baratinoo		✘		Object
baratinoo/paths		✔		Object
baratinoo/paths/data_root	Absolute or relative to vsdk.json	✔		Path
baratinoo/tts		✘		Object
baratinoo/tts/channels		✘		Object
baratinoo/tts/channels/<channel_name_1>		✘		Object
baratinoo/tts/channels/<channel_name_1>/voices		✘		Array
baratinoo/tts/channels/<channel_name_1>/voices/0		✘		String	<speaker>
vtapi		✘		Object
vtapi/paths		✔		Object
vtapi/paths/data_root	Absolute or relative to vsdk.json	✔		Path
vtapi/tts		✘		Object
vtapi/tts/channels		✘		Object
vtapi/tts/channels/<channel_name_1>		✘		Object
vtapi/tts/channels/<channel_name_1>/voices		✘		Array
vtapi/tts/channels/<channel_name_1>/voices/0		✘		String	<speaker>,<quality>
tssv		✘		Object
tssv/biometrics		✘		Object
tssv/biometrics/generated_models_path	Absolute or relative to the program's working directory	✘		Path
tssv/biometrics/background_model_TD	Absolute or relative to the program's working directory	✘		File
tssv/biometrics/background_model_TI	Absolute or relative to the program's working directory	✘		File
idvoice		✘		Object
idvoice/biometrics		✘		Object
idvoice/biometrics/generated_models_path	Absolute or relative to the program's working directory	✘		Path
idvoice/biometrics/background_model_TD	Absolute or relative to the program's working directory	✘		File
idvoice/biometrics/background_model_TI	Absolute or relative to the program's working directory	✘		File
s2c		✘		Object
s2c/speech_enhancement		✘		Object
s2c/speech_enhancement/speech_enhancers		✘		Object
s2c/speech_enhancement/speech_enhancers/<name>	Enhancer name. The name is completely free and will be used in the code to create the Enhancer object	✘		Object
s2c/speech_enhancement/speech_enhancers/<name>/configuration	A JSON object that holds the Speech Enhancement configuration generated by vdk-studio	✘		Object
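
As an illustration of how the slash-separated field paths in the table might map onto nested JSON objects, here is a minimal Python sketch. It builds a small `vasr` section and checks a few recognizer settings against the ranges documented above. Several things are assumptions rather than documented facts: the exact nesting is inferred from the field paths, the recognizer name `rec`, the model name `commands`, and both file names are hypothetical placeholders, and `check_recognizer_settings` is an illustrative helper that is not part of the VSDK.

```python
import json

# Ranges copied from the vasr recognizer settings rows of the table:
# (minimum, maximum, must_be_multiple_of_100)
RANGES = {
    "min_speech_duration": (100, 10_000, True),
    "min_silence_duration": (100, 10_000, True),
    "padding_size": (0, 1_000, False),
}


def check_recognizer_settings(settings):
    """Illustrative helper (not a VSDK API): return a list of problems found."""
    problems = []
    for key, (low, high, multiple_of_100) in RANGES.items():
        if key not in settings:
            continue  # every setting in this object is optional
        value = settings[key]
        if not low <= value <= high:
            problems.append(f"{key}={value} is outside [{low} - {high}]")
        elif multiple_of_100 and value % 100 != 0:
            problems.append(f"{key}={value} is not a multiple of 100")
    # intermediate_result_frequency accepts 0 (disabled) or a value in [450 - 10'000]
    freq = settings.get("intermediate_result_frequency")
    if freq is not None and freq != 0 and not 450 <= freq <= 10_000:
        problems.append(f"intermediate_result_frequency={freq} must be 0 or in [450 - 10000]")
    # speech_probability_threshold is documented with an open interval ]0.0 - 1.0[
    threshold = settings.get("speech_probability_threshold")
    if threshold is not None and not 0.0 < threshold < 1.0:
        problems.append(f"speech_probability_threshold={threshold} must be strictly between 0.0 and 1.0")
    return problems


# Hypothetical configuration: "rec", "commands", and the two file names are
# placeholders, and the nesting is an assumption derived from the field paths.
config = {
    "version": "2.0",
    "vasr": {
        "paths": {"data_root": ".", "acmod": "acmod", "graph": "graph"},
        "asr": {
            "recognizers": {
                "rec": {
                    "acmods": ["acoustic-model-file"],
                    "settings": {"min_speech_duration": 500, "min_silence_duration": 700},
                }
            },
            "models": {"commands": {"type": "static", "file": "model-file"}},
        },
    },
}

# Report any out-of-range setting, then print the JSON that would go in vsdk.json.
for name, recognizer in config["vasr"]["asr"]["recognizers"].items():
    for problem in check_recognizer_settings(recognizer.get("settings", {})):
        print(f"{name}: {problem}")
print(json.dumps(config, indent=2))
```

Running such a check before shipping a configuration can catch values that the table rules out, such as a `min_speech_duration` of 550, which lies within range but is not a multiple of 100.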