From f2584323cba5aa8b3fcd1121d5b5b02201a6d214 Mon Sep 17 00:00:00 2001
From: msqr1
Date: Sun, 1 Sep 2024 12:31:03 -0700
Subject: [PATCH] Update documentation

---
 API.md => Documentation.md | 26 ++++++++++++++------------
 Examples/README.md         |  4 +++-
 Examples/fromMic.html      | 13 ++++---------
 Examples/fromWav.html      |  8 ++------
 README.md                  | 27 +++++++++++++--------------
 5 files changed, 36 insertions(+), 42 deletions(-)
 rename API.md => Documentation.md (87%)

diff --git a/API.md b/Documentation.md
similarity index 87%
rename from API.md
rename to Documentation.md
index c82cac4..de8a463 100644
--- a/API.md
+++ b/Documentation.md
@@ -1,4 +1,5 @@
-# API reference
+

API Reference

+ ## JS ```window``` object | Function/Object | Description | |---|---| @@ -13,7 +14,7 @@ | Function/Object | Description | |---|---| | ```Promise createModel(url: string, path: string, id: string)```

```Promise createSpkModel(url: string, path: string, id: string)``` | Create a ```Model``` or ```SpkModel```, model files must be directly under the model root, and compressed model must be in ```.tar.gz```/```.tgz``` format. Tar format must be USTAR. If:
- ```path``` contains valid model files and ```id``` is the same, there will not be a fetch from ```url```.
- ```path``` doesn't contain valid model files, or if it contains valid model files but ```id``` is different, there will be a fetch from ```url```, and the model is stored with ```id```. Models are thread-safe and reusable across recognizers. | -| ```Promise createRecognizer(model: Model, sampleRate: float)```

```Promise createRecognizerWithSpkModel(model: Model, spkModel: spkModel, sampleRate: float)```

```Promise createRecognizerWithGrm(model: Model, grammar: string, sampleRate: float)``` | Create a ```Recognizer```, it will reuse the thread from ```model``` if it's the first user of ```model```, else it will use a new thread. | +| ```Promise createRecognizer(model: Model, sampleRate: float)```

```Promise createRecognizerWithSpkModel(model: Model, spkModel: spkModel, sampleRate: float)```

```Promise createRecognizerWithGrm(model: Model, grammar: string, sampleRate: float)``` | Create a ```Recognizer``` | | ```setLogLevel(lvl: int)``` | Set log level for Kaldi messages (default: ```0```: Info)
```-2```: Error
```-1```: Warning
```1```: Verbose
```2```: More verbose
```3```: Debug | | ```Promise createTransferer(ctx: AudioContext, bufferSize: int)``` | Create a node that transfers its inputs back to the main thread with a custom buffer size (must be a multiple of 128). Its port's ```onmessage``` handler can be set to get audio data. Has 1 input with 1 channel and no output. The larger the size, the less the audio breaks up, but the higher the latency. Recommended value is around ```128 * 150```. | | ```cleanUp()``` | A convenience function that calls ```delete()``` on all objects and revokes all URLs. **Put this at the end of your code!** | @@ -41,22 +42,23 @@ |---|---| | ```partialResult``` | There is a partial recognition result, check the event's ```detail``` property | | ```result``` | There is a full recognition result, check the event's ```detail``` property | +
+

HTTP Remarks

-# Response headers +## HTTPS +Vosklet is available only in [secure contexts](https://developer.mozilla.org/en-US/docs/Web/Security/Secure_Contexts) (HTTPS) ## SharedArrayBuffer -SharedArrayBuffer is necessary to share data between threads, so these response headers must be set: +SharedArrayBuffer is necessary to share data between workers, so these response headers must be set: - ```Cross-Origin-Embedder-Policy``` ⟶ ```require-corp``` - ```Cross-Origin-Opener-Policy``` ⟶ ```same-origin``` -If you can't set them, you may use a hacky workaround in *AddCOI.js*. +
If you can't set them, you may use a hacky workaround in *AddCOI.js*. -## CSP headers -Pthread worker construction must be from a blob (see [Emscripten issue](https://github.com/emscripten-core/emscripten/issues/21937)), so the CSP: +## Content Security Policy (CSP) +Wasm worker construction will be from a blob, so the CSP: - ```worker-src``` must include ```blob:``` -## Model headers -Model response from ```fetch()``` must be an uncompressed model. Set your ```Content-Encoding``` response header and ```Accept-Encoding``` request header appropriately so browsers can decompress. - -# Compilation +
+

Compilation

- Requires all Autotools commands in PATH, ```make```, and ```pkg-config```. For example, installing with ```apt``` would be: ```sudo apt install autotools-dev autoconf libtool make pkg-config``` @@ -70,7 +72,7 @@ cd Vosklet/src && ``` | Option | Description | Default value | |---|---|---| -| INITIAL_MEMORY | Set initial memory, valid suffixes: kb, mb, gb, tb or none (bytes) | ```300mb``` as [recommended](https://alphacephei.com/vosk/models). This memory will grow if usage exceeds this value, but this may [affect performance](https://github.com/WebAssembly/design/issues/1271). | +| INITIAL_MEMORY | Set initial memory, valid suffixes: kb, mb, gb, tb or none (bytes) | ```300mb``` as [recommended](https://alphacephei.com/vosk/models). This memory will grow if usage exceeds this value. | | MAX_THREADS | Set the max number of threads (>=1), this should be equal to the number of recognizers used in the program | ```1``` | | JOBS | Set the number of jobs (threads) when building | ```$(nproc)``` | | EMSDK | Set EMSDK's path (will install EMSDK in root folder if unset) | ```../emsdk``` | diff --git a/Examples/README.md b/Examples/README.md index 115f7d7..434f136 100644 --- a/Examples/README.md +++ b/Examples/README.md @@ -1 +1,3 @@ -**Note: Examples in this folder use their own *Vosklet.js* because I can't set the Response headers for my model for browsers to decompress correctly. Instead, I used DecompressionStream to decompress manually, so this *Vosklet.js* only works for the examples. In production, please use the top-level Vosklet.js instead.** \ No newline at end of file +#### The file Vosklet.js in this folder, used by the examples and the outer [README.md](../README.md), has been set to decompress manually using ```DecompressionStream``` because I can't set a third-party (GitHub's) server response header. You can use it if you run into the same situation. Otherwise, please use the outer Vosklet.js instead.
+ +#### The motivation is that it will work right away when put into an HTML file. You can just make a local copy and try everything out quickly \ No newline at end of file diff --git a/Examples/fromMic.html b/Examples/fromMic.html index 8dc5b43..85f4f98 100644 --- a/Examples/fromMic.html +++ b/Examples/fromMic.html @@ -24,20 +24,15 @@ let recognizer = await module.createRecognizer(model, 16000) // Listen for result and partial result - recognizer.addEventListener("result", ev => { - console.log("Result: ", ev.detail) - }) - recognizer.addEventListener("partialResult", ev => { - console.log("Partial result: ", ev.detail) - }) + recognizer.addEventListener("result", ev => console.log("Result: ", ev.detail)) + recognizer.addEventListener("partialResult", ev => console.log("Partial result: ", ev.detail)) // Create a transferer node to get audio data on the main thread let transferer = await module.createTransferer(ctx, 128 * 150) // Recognize data on arrival - transferer.port.onmessage = ev => { - recognizer.acceptWaveform(ev.data) - } + transferer.port.onmessage = ev => recognizer.acceptWaveform(ev.data) + // Connect to microphone micNode.connect(transferer) } diff --git a/Examples/fromWav.html b/Examples/fromWav.html index 3c3a826..9bef99a 100644 --- a/Examples/fromWav.html +++ b/Examples/fromWav.html @@ -11,12 +11,8 @@ let recognizer = await module.createRecognizer(model, 16000) // Listen for result and partial result - recognizer.addEventListener("result", ev => { - console.log("Result: ", ev.detail) - }) - recognizer.addEventListener("partialResult", ev => { - console.log("Partial result: ", ev.detail) - }) + recognizer.addEventListener("result", ev => console.log("Result: ", ev.detail)) + recognizer.addEventListener("partialResult", ev => console.log("Partial result: ", ev.detail)) // Fetch, decode, and recognize .wav let wav = await fetch("https://cdn.jsdelivr.net/gh/msqr1/Vosklet/examples/example.wav") diff --git a/README.md b/README.md index 76c9e92..f0a13c9
100644 --- a/README.md +++ b/README.md @@ -1,18 +1,22 @@ # Overview - A lightweight, up-to-date speech recognizer in the browser with total gzipped size of **under a megabyte** (725 KB) -- Built from scratch, inspired by [vosk-browser](https://github.com/ccoreilly/vosk-browser) +- Demo: +- Inspired by [vosk-browser](https://github.com/ccoreilly/vosk-browser) + +# Documentation +- See [Documentation.md](Documentation.md) # Vosklet ... - Is regularly maintained - Supports multiple models -- Includes model storage management -- Includes model ID management (for updates) +- Includes model cache path management +- Includes model cache ID management (for updates) - Wraps all of Vosk's functionality # Basic usage (microphone recognition in English) - Results are logged to the console. -- Copied from *examples/fromMic.html* -- **Note: The example folder and this piece of code use *Examples/Vosklet.js* because I can't set the Response headers for my model for browsers to decompress correctly. Instead, I used DecompressionStream to decompress manually, so *Examples/Vosklet.js* only works for the examples.
In production, use the top-level Vosklet.js instead.** +- Copied from [Examples/fromMic.html](Examples/fromMic.html) +- **IMPORTANT:** Please see [Examples/README.md](Examples/README.md) ```html @@ -40,20 +44,15 @@ let recognizer = await module.createRecognizer(model, 16000) // Listen for result and partial result - recognizer.addEventListener("result", ev => { - console.log("Result: ", ev.detail) - }) - recognizer.addEventListener("partialResult", ev => { - console.log("Partial result: ", ev.detail) - }) + recognizer.addEventListener("result", ev => console.log("Result: ", ev.detail)) + recognizer.addEventListener("partialResult", ev => console.log("Partial result: ", ev.detail)) // Create a transferer node to get audio data on the main thread let transferer = await module.createTransferer(ctx, 128 * 150) // Recognize data on arrival - transferer.port.onmessage = ev => { - recognizer.acceptWaveform(ev.data) - } + transferer.port.onmessage = ev => recognizer.acceptWaveform(ev.data) + // Connect to microphone micNode.connect(transferer) }
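
Reviewer note on the HTTP remarks in Documentation.md: the two SharedArrayBuffer headers and the `worker-src blob:` CSP rule could be set by a small static server along the lines below. This is only a sketch under stated assumptions, not something taken from the Vosklet repository: it assumes Node.js, and the names `ISOLATION_HEADERS`, `resolveFile`, and `createIsolatedServer` are hypothetical helpers invented for illustration.

```javascript
// Sketch only: a static-file server that sends the response headers listed in
// Documentation.md. Node.js and every helper name here are assumptions.
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

// COEP/COOP enable SharedArrayBuffer; the CSP allows workers built from blobs.
const ISOLATION_HEADERS = {
  "Cross-Origin-Embedder-Policy": "require-corp",
  "Cross-Origin-Opener-Policy": "same-origin",
  "Content-Security-Policy": "worker-src 'self' blob:",
};

// Minimal content-type table for the file kinds a demo page is likely to serve.
const MIME_TYPES = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".wasm": "application/wasm",
  ".wav": "audio/wav",
};

// Map a request URL to a file under root; return null if it is malformed
// or would resolve outside root.
function resolveFile(root, requestUrl) {
  try {
    const pathname = decodeURIComponent(new URL(requestUrl, "http://localhost").pathname);
    const relative = pathname.endsWith("/") ? pathname + "index.html" : pathname;
    const rootResolved = path.resolve(root);
    const resolved = path.resolve(rootResolved, "." + relative);
    return resolved.startsWith(rootResolved + path.sep) ? resolved : null;
  } catch {
    return null;
  }
}

// Create (but do not start) a server that attaches the headers to every response.
function createIsolatedServer(root) {
  return http.createServer((req, res) => {
    for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
      res.setHeader(name, value);
    }
    const filePath = resolveFile(root, req.url);
    if (filePath === null) {
      res.writeHead(400);
      res.end("Bad request");
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      const type = MIME_TYPES[path.extname(filePath)] || "application/octet-stream";
      res.writeHead(200, { "Content-Type": type });
      res.end(data);
    });
  });
}
```

Calling `createIsolatedServer("./public").listen(8080)` would serve a hypothetical `./public` folder. Browsers treat `http://localhost` as a secure context, so this should be enough for local testing; a real deployment still needs HTTPS.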