Files
Vosklet/API.md
2024-08-19 18:53:52 -07:00

5.9 KiB

API reference

JS window object

Function/Object Description
Promise<Module> loadVosklet() Load Vosklet module interface

Shared interface

Function/Object Description
delete() Delete this object, see why this is neccessary.

Module object

Function/Object Description
Promise<Model> createModel(url: string, path: string, id: string)

Promise<SpkModel> createSpkModel(url: string, path: string, id: string)
Create a Model or SpkModel, model files must be directly under the model root, and compressed model must be in .tar.gz/.tgz format. Tar format must be USTAR. If:
- path contains valid model files and id is the same, there will not be a fetch from url.
- path doesn't contain valid model files, or if it contains valid model files but id is different, there will be a fetch from url, and the model is stored with id. Models are thread-safe and reusable across recognizers.
Promise<Recognizer> createRecognizer(model: Model, sampleRate: float)

Promise<Recognizer> createRecognizerWithSpkModel(model: Model, spkModel: spkModel, sampleRate: float)

Promise<Recognizer> createRecognizerWithGrm(model: Model, grammar: string, sampleRate: float)
Create a Recognizer, it will reuse the thread from model if it's the first user of model, else it will use a new thread.
setLogLevel(lvl: int) Set log level for Kaldi messages (default: 0: Info)
-2: Error
-1: Warning
1: Verbose
2: More verbose
3: Debug
Promise<AudioWorkletNode> createTransferer(ctx: AudioContext, bufferSize: int) Create a node that transfer its inputs back to the main thread with custom buffer size (must be multiple of 128). Its port's onmessage handler can be set to get audio data. Has 1 input with 1 channel and no output. The the higher the size, the lesser the audio breaks up, but the higher the latency. Recomended value is around 128 * 150.
cleanUp() A convenience function that call delete() on all objects and revoke all URLs. Put this at the end of your code!
EpMode Enum for endpointer modes

Model object

Function/Object Description
int findWord(word: string) Check if a word can be recognized by the model, return the word symbol if word exists inside the model or -1 otherwise. Word symbol 0 is for epsilon.

Recognizer object

Function/Object Description
acceptWaveform(audioData: Float32Array) Accept voice data in a Float32Array with elements from -1.0 to 1.0.
setWords(words: bool) Enables words with times in the output (default: false)
setPartialWords(partialWords: bool) Like above return words and confidences in partial results (default: false)
setNLSML(nlsml: bool) Set NLSML output (default: false)
setMaxAlternatives(alts: int) Configures recognizer to output n-best results (default: 0)
setGrm(grm: string) Reconfigures recognizer to use grammar
setSpkModel(mdl: SpkModel) Adds speaker model to already initialized recognizer
setEndpointerMode(mode: EpMode) Set endpointer scaling factor (default: ANSWER_DEFAULT)
setEndpointerDelays(tStartMax: float, tEnd: float, tMax: float) Set endpointer delays
Event Description
partialResult There is a partial recognition result, check the event's detail property
result There is a full recognition result, check the event's detail property

Response headers

SharedArrayBuffer

SharedArrayBuffer is necessary to share data between threads, so these response headers must be set:

  • Cross-Origin-Embedder-Policyrequire-corp
  • Cross-Origin-Opener-Policysame-origin If you can't set them, you may use a hacky workaround in AddCOI.js.

CSP headers

Pthread worker construction must be from a blob (see Emscripten issue), so the CSP:

  • worker-src must include blob:

Model headers

Model response from fetch() must be an uncompressed model. Set your Content-Encoding response header and Accept-Encoding request header appropriately so browers can decompress.

Compilation

  • Requires all Autotools commands in PATH, make, and pkg-config. For example, installing with apt would be:

    sudo apt install autotools-dev autoconf libtool make pkg-config

  • Changing any option to non-default values requires recompilation

  • To remake a specific target, erase its directory in the repo root and run ./make again. Doing this will also remake the final JS

git clone --depth=1 https://github.com/msqr1/Vosklet &&
cd Vosklet/src &&
[Options] ./make
# Example: INITIAL_MEMORY=350mb MAX_THREADS=3 ./make
Option Description Default value
INITIAL_MEMORY Set inital memory, valid suffixes: kb, mb, gb, tb or none (bytes) 300mb as recommended. This memory will grow if usage exceeds this value, but this may affect performance.
MAX_THREADS Set the max number of threads (>=1), this should be equal to the number of model and speaker model that is used in the program 1 (1 recognizer, 1 model, no speaker model)
JOBS Set the number of jobs (threads) when building $(nproc)
EMSDK Set EMSDK's path (will install EMSDK in root folder if unset) ../emsdk