A speech recognizer built on Vosk that runs in the browser. It is inspired by vosk-browser, but written from scratch with no code taken from it.
Browser-recognizer can run both on the browser's main thread and in web workers.
The API is also designed with strong exception safety.

Global interface and interface common to all objects

Function signature (global)	Description
`Promise<Model> makeModel(path: string, url: string, id: string)` `Promise<SpkModel> makeSpkModel(path: string, url: string, id: string)`	Make a `Model` or `SpkModel`. If path contains valid model files and id is the same, nothing is fetched from url. If path doesn't contain valid model files, or it does but id is different, the model is fetched from url and stored with id.
`Promise<Recognizer> makeRecognizer(model: Model, sampleRate: float)`	Make a `Recognizer`; it uses a separate thread for recognition
`setLogLevel(lvl: int)`	Set Vosk's log level (default: -1). Levels: -2 = Error, -1 = Warning, 0 = Info, 1 = Verbose, 2 = More verbose, 3 = Debug
`deleteAll()`	Call `delete()` on all objects. Running this once you are done with the API is recommended, since it cleans up everything automatically.
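The caching rule described for `makeModel` and `makeSpkModel` can be read as a small decision function. The sketch below only illustrates that rule; it is not the library's implementation, and `shouldFetch`, `hasValidFiles` and `storedId` are hypothetical names.

```javascript
// Illustration only: mirrors the documented rule for when makeModel/makeSpkModel
// downloads from `url`. All names here are hypothetical, not part of the API.
function shouldFetch({ hasValidFiles, storedId }, requestedId) {
  // No valid model files at `path`: a download is always needed.
  if (!hasValidFiles) return true;
  // Valid files stored under a different id: fetch again and store with the new id.
  return storedId !== requestedId;
}

console.log(shouldFetch({ hasValidFiles: true, storedId: "v1" }, "v1"));   // false
console.log(shouldFetch({ hasValidFiles: true, storedId: "v1" }, "v2"));   // true
console.log(shouldFetch({ hasValidFiles: false, storedId: null }, "v1"));  // true
```

Under this reading, changing the id is the way to force a model update for returning users.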

Function signature (all objects)	Description
`delete()`	Delete this object
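One way to make sure cleanup always happens is to wrap the work in `try`/`finally`. This is a minimal sketch, assuming `api` is the object returned by `loadBR()`; `withCleanup` is a hypothetical helper, not part of the library.

```javascript
// Sketch: run some work against the API and always call deleteAll() afterwards.
// `withCleanup` is a hypothetical helper; only deleteAll() comes from the API.
async function withCleanup(api, work) {
  try {
    return await work(api);
  } finally {
    // Runs whether `work` succeeded or threw, so no object is left undeleted.
    api.deleteAll();
  }
}
```

It could be used as `await withCleanup(BrRec, async api => { /* make models, recognize */ })`.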

`Recognizer` object

Function signature	Description
`Promise<AudioWorkletNode> getNode(ctx: AudioContext, channelIndex = 0: int)`	Get a pass-through node that recognizes audio and can be connected to a processing graph. It has 1 input and 1 output; channelIndex must point to a 16-bit mono channel of the input
`recognize(buf: AudioBuffer, channelIndex = 0: int)`	Recognize an AudioBuffer, usually obtained from something like `BaseAudioContext.decodeAudioData()`; channelIndex must point to a 16-bit mono channel of buf
`setPartialWords(partialWords: bool)`	Return words' information in a partialResult event (default: false)
`setWords(words: bool)`	Return words' information in a result event (default: false)
`setNLSML(nlsml: bool)`	Return result and partialResult in NLSML form (default: false)
`setMaxAlternatives(alts: int)`	Set the max number of alternatives for the result event (default: 0)
`setGrm(grm: string)`	Add grammar to the recognizer (default: none)
`setSpkModel(mdl: SpkModel)`	Set the speaker model of the recognizer (default: none)
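The setters above can be applied together from one options object. This is only a sketch: `configureRecognizer` and its option names are hypothetical, and only the setter methods themselves come from the table.

```javascript
// Sketch: apply several documented setters from a single options object.
// `configureRecognizer` and the option names are hypothetical.
function configureRecognizer(recognizer, options = {}) {
  const { words, partialWords, maxAlternatives, grammar } = options;
  // Only call a setter when the caller supplied a value, so defaults stay untouched.
  if (words !== undefined) recognizer.setWords(words);
  if (partialWords !== undefined) recognizer.setPartialWords(partialWords);
  if (maxAlternatives !== undefined) recognizer.setMaxAlternatives(maxAlternatives);
  if (grammar !== undefined) recognizer.setGrm(grammar);
  return recognizer;
}
```

For example, `configureRecognizer(recognizer, { words: true, maxAlternatives: 3 })` would enable word information in result events and request up to three alternatives.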

Event	Description
`partialResult`	A partial recognition result is available; check the event's "details" property
`result`	A full recognition result is available; check the event's "details" property
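A listener usually wants just the recognized text. The sketch below assumes that "details" carries Vosk's JSON output, either as a string or as an already-parsed object, with a `text` field for full results and a `partial` field for partial results. That shape is an assumption about this library, and `extractText` is a hypothetical helper.

```javascript
// Sketch: pull the recognized text out of an event's `details` value.
// Assumption: `details` is Vosk's JSON output (string or parsed object) with
// a "text" field (result event) or a "partial" field (partialResult event).
function extractText(details) {
  const data = typeof details === "string" ? JSON.parse(details) : details;
  if (data == null) return "";
  return data.text ?? data.partial ?? "";
}
```

It could be used as `recognizer.addEventListener("result", e => console.log(extractText(e.details)))`. It would not apply after `setNLSML(true)`, since NLSML output is not JSON.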

Compilation

Changing any setting to non-default values requires recompilation

git clone --depth=1 https://github.com/msqr1/Browser-recognizer &&
cd Browser-recognizer &&
[Options] ./compile.sh

Option	Description	Default value
MAX_MEMORY	Set max memory, valid suffixes: kb, mb, gb, tb or none (bytes)	`300mb`, as recommended
MAX_THREADS	Set the max number of threads (minimum 2)	`2` (1 OPFS thread + 1 recognizer thread)
COMPILE_JOBS	Set the number of jobs (threads) when compiling	`$(nproc)`
EMSDK	Set EMSDK's path (will install EMSDK in root folder if unset)	`.`
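To make the MAX_MEMORY suffix format concrete, here is a small sketch that converts such a value to bytes. It assumes binary multiples (1 kb = 1024 bytes), which the build script may or may not use; `parseMemory` is a hypothetical helper and not part of the build.

```javascript
// Sketch: convert a MAX_MEMORY-style value ("300mb", "1gb", "1048576") to bytes.
// Assumption: suffixes are binary multiples (1 kb = 1024 bytes).
function parseMemory(value) {
  const match = /^(\d+)(kb|mb|gb|tb)?$/i.exec(String(value).trim());
  if (!match) throw new Error(`Invalid memory value: ${value}`);
  const exponents = { kb: 1, mb: 2, gb: 3, tb: 4 };
  // No suffix means the number is already in bytes.
  const exponent = match[2] ? exponents[match[2].toLowerCase()] : 0;
  return Number(match[1]) * 1024 ** exponent;
}

console.log(parseMemory("300mb")); // 314572800
```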

Response headers

Browser-recognizer requires SharedArrayBuffer, so these response headers must be set:

Cross-Origin-Embedder-Policy ---> require-corp
Cross-Origin-Opener-Policy ---> same-origin
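If the page is served from a Node.js server, the two headers can be set on each response. This is a minimal sketch: `setIsolationHeaders` is a hypothetical helper, while `res.setHeader` is the standard method on Node's `http.ServerResponse` (Express responses expose it too).

```javascript
// Sketch: set the two required headers on a Node.js-style response object.
// `setIsolationHeaders` is a hypothetical helper name.
function setIsolationHeaders(res) {
  res.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
  res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  return res;
}
```

Call it before sending the response body. In the page, the standard `crossOriginIsolated` global reports `true` once both headers are in effect.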

If you can't set them, you may use a VERY HACKY workaround at src/addCOI.js.

Additions to vosk-browser:

Download multiple models
Model storage path management (when many models are required)
Model ID management (when model updates are required)

Usage

<!-- Load the library from a script tag -->
<script src="BrowserRecognizer.js"></script>
<!-- type="module" is needed for top-level await -->
<script type="module">
  // Pick any name for the loaded API object
  const BrRec = await loadBR()

  // Prepare the model and recognizer
  const model = await BrRec.makeModel(/* path, url, id */)
  // The sample rate matches the audio constraints requested below
  const recognizer = await BrRec.makeRecognizer(model, 16000)
  recognizer.addEventListener("result", e => {
    console.log("Result: ", e.details)
  })
  recognizer.addEventListener("partialResult", e => {
    console.log("Partial result: ", e.details)
  })

  // Process audio: request a mono 16 kHz microphone stream
  const media = await navigator.mediaDevices.getUserMedia({
    video: false,
    audio: {
      echoCancellation: true,
      noiseSuppression: true,
      channelCount: 1,
      sampleRate: 16000
    },
  });

</script>
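The usage snippet stops once the microphone stream is obtained. A possible next step, sketched under assumptions, is to route that stream into the recognizer's pass-through node. `attachMicrophone` is a hypothetical helper; `getNode()` is the method documented above, and `createMediaStreamSource()` and `connect()` are standard Web Audio API calls.

```javascript
// Sketch: feed a microphone MediaStream into the recognizer's pass-through node.
// `attachMicrophone` is a hypothetical helper, not part of the library.
async function attachMicrophone(recognizer, ctx, stream) {
  // Wrap the MediaStream so it can act as a source in the audio graph.
  const source = ctx.createMediaStreamSource(stream);
  // channelIndex is left at its documented default of 0.
  const node = await recognizer.getNode(ctx);
  source.connect(node);
  // Returned so the caller can connect the node's output or disconnect later.
  return { source, node };
}
```

In the page this might look like `const ctx = new AudioContext({ sampleRate: 16000 })` followed by `await attachMicrophone(recognizer, ctx, media)`. Whether the node's output must also be connected onward for processing to run is not stated in this document, so that choice is left to the caller.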
