Important changes
66
README.md
@@ -1,7 +1,7 @@
|
||||
# Overview
|
||||
- A speech recognizer built on Vosk that runs in the browser, inspired by [vosk-browser](https://github.com/ccoreilly/vosk-browser), but built from scratch with no code taken!
|
||||
- Designed with strong exception safety
|
||||
- See the examples folder for ways to use the API
|
||||
- See the usage folder's README.md for API documentation and important notes
|
||||
- See the devel folder for the newest build (not guaranteed to work) and the JS build script
|
||||
|
||||
# Additions to vosk-browser:
|
||||
@@ -11,67 +11,5 @@
|
||||
- Smaller JS size
|
||||
- Doesn't need another file when using AudioWorkletNode
|
||||
|
||||
# User agent notes
|
||||
## SharedArrayBuffer
|
||||
Browser-recognizer requires SharedArrayBuffer to share data between threads, so these response headers must be set:
|
||||
- ***Cross-Origin-Embedder-Policy*** ---> ***require-corp***
|
||||
- ***Cross-Origin-Opener-Policy*** ---> ***same-origin***
|
||||
|
||||
If you can't set them, you may use a HACKY workaround at *src/addCOI.js*.
|
||||
|
||||
## Origin Private Filesystem (OPFS)
|
||||
Browser-recognizer needs Emscripten WASMFS's OPFS backend to store its model. IDBFS was considered but dropped, because there is no direct way to read from IDBFS in C++ without copying to MEMFS (basically RAM). For safety, always:
|
||||
- Wrap ```window.loadBR()``` in a try/catch to check for OPFS availability.
|
||||
- Check via ```navigator.storage.estimate()``` that there is enough space for TWICE THE MODEL SIZE before calling ```Module.makeModel()```.
|
||||
|
||||
# API interface
|
||||
## JS ```window``` object
|
||||
| Function signature | Description |
|
||||
|---|---|
|
||||
|```Promise<Module> loadBR()``` | Load Emscripten's Module |
|
||||
|
||||
## Shared interface
|
||||
| Function signature | Description |
|
||||
|---|---|
|
||||
| ```delete()``` | Delete this object, see [why](https://emscripten.org/docs/getting_started/FAQ.html#what-does-exiting-the-runtime-mean-why-don-t-atexit-s-run) this is necessary. |
|
||||
|
||||
## ```Module``` object
|
||||
| Function signature | Description |
|
||||
|---|---|
|
||||
| ```Promise<Model> makeModel(url: string, path: string, id: string)```<br><br>```Promise<SpkModel> makeSpkModel(url: string, path: string, id: string)``` | Make a ```Model``` or ```SpkModel```.<br>- If **path** contains valid model files and **id** is the same, nothing is fetched from **url**.<br>- If **path** doesn't contain valid model files, or it does but **id** is different, the model is fetched from **url** and stored with **id**. Model files must be directly under the model root folder, and a compressed model must be in .tgz format. |
|
||||
| ```Promise<Recognizer> makeRecognizer(model: Model, sampleRate: float)``` | Make a ```Recognizer```. It uses **model**'s thread if it is the first user of **model**; otherwise it uses a new thread. |
|
||||
| ```setLogLevel(lvl: int)``` | Set Vosk's log level (default: ```0```: Info) <br>```-2```: Error<br>```-1```: Warning<br>```1```: Verbose<br>```2```: More verbose<br>```3```: Debug |
|
||||
| ```revokeURLs()``` | Revoke the Blob URLs of pthread worker and worklet processor |
|
||||
| ```cleanUp()``` | A convenience function that calls ```revokeURLs()``` and ```delete()``` on all objects. You should put this at the end of your program! |
|
||||
|
||||
## ```Recognizer``` object
|
||||
| Function signature | Description |
|
||||
|---|---|
|
||||
| ```Promise<AudioWorkletNode> getNode(ctx: AudioContext, channelIndex = 0: int)``` | Get a pass-through node that recognizes audio and can be connected to a processing graph. It has 1 input and 1 output; **channelIndex** must point to a 16-bit mono channel of the input. |
|
||||
| ```recognize(buf: AudioBuffer, channelIndex = 0: int)``` | Recognize an AudioBuffer, usually obtained from something like ```BaseAudioContext.decodeAudioData()```; **channelIndex** must point to a 16-bit mono channel of **buf**. |
|
||||
| ```setPartialWords(partialWords: bool)``` | Return words' information in a partialResult event (default: false) |
|
||||
| ```setWords(words: bool)``` | Return words' information in a result event (default: false) |
|
||||
| ```setNLSML(nlsml: bool)``` | Return result and partialResult in NLSML form (default: false) |
|
||||
| ```setMaxAlternatives(alts: int)``` | Set the max number of alternatives for the result event (default: 0) |
|
||||
| ```setGrm(grm: string)``` | Add grammar to the recognizer (default: none) |
|
||||
| ```setSpkModel(mdl: SpkModel)``` | Set the speaker model of the recognizer (default: none) |
|
||||
|
||||
| Event | Description |
|
||||
|---|---|
|
||||
| ```partialResult``` | There is a partial recognition result; check the event's ```detail``` property |
|
||||
| ```result``` | There is a full recognition result; check the event's ```detail``` property |
|
||||
|
||||
# Compilation
|
||||
Changing any option to a non-default value requires recompilation.
|
||||
```
|
||||
git clone --depth=1 https://github.com/msqr1/Browser-recognizer &&
|
||||
cd Browser-recognizer &&
|
||||
[Options] make
|
||||
```
|
||||
| Option | Description | Default value |
|
||||
|---|---|---|
|
||||
| MAX_MEMORY | Set max memory, valid suffixes: kb, mb, gb, tb or none (bytes) | ```300mb```, as [recommended](https://alphacephei.com/vosk/models) |
|
||||
| MAX_THREADS | Set the max number of threads (minimum 2) | ```2``` (1 OPFS thread + 1 model/recognizer thread) |
|
||||
| COMPILE_JOBS | Set the number of jobs (threads) when compiling | ```$(nproc)``` |
|
||||
| EMSDK | Set EMSDK's path (will install EMSDK in root folder if unset) | ```../emsdk``` |
|
||||
# Basic usage
|
||||
|
||||
|
||||
@@ -1,18 +1,22 @@
|
||||
#include "genericModel.h"
|
||||
|
||||
genericModel::genericModel(const std::string& storepath, const std::string &id, int index) : storepath(storepath), id(id), index(index) {
|
||||
if(!OPFSOk) {
|
||||
fireEv("_continue", "OPFS hasn't been initialized or not available", index);
|
||||
return;
|
||||
}
|
||||
fs::current_path("/opfs", tank);
|
||||
if(tank.value() != 0) {
|
||||
throwJS("Unable to cd OPFS root");
|
||||
fireEv("_continue","Unable to cd OPFS root",index);
|
||||
return;
|
||||
}
|
||||
fs::create_directories(storepath, tank);
|
||||
if(tank.value() != 0) {
|
||||
throwJS("Unable to create storepath");
|
||||
fireEv("_continue","Unable to create storepath", index);
|
||||
}
|
||||
fs::current_path(storepath, tank);
|
||||
if(tank.value() != 0) {
|
||||
throwJS("Unable to cd storepath");
|
||||
fireEv("_continue", "Unable to cd storepath", index);
|
||||
}
|
||||
}
|
||||
bool genericModel::checkModel() {
|
||||
@@ -27,7 +31,7 @@ bool genericModel::checkModel() {
|
||||
return id.compare(oldid) == 0 ? true : false;
|
||||
}
|
||||
void genericModel::afterFetch() {
|
||||
thrd.setTask1([this](){
|
||||
thrd.addTask([this](){
|
||||
if(!extractModel()) {
|
||||
fs::remove("/opfs/m0dEl.tar",tank);
|
||||
fs::current_path("/opfs", tank);
|
||||
@@ -39,8 +43,8 @@ void genericModel::afterFetch() {
|
||||
fs::remove("README",tank);
|
||||
std::ofstream idFile("id");
|
||||
if(!idFile.is_open()) {
|
||||
fs::current_path("/opfs");
|
||||
fs::remove_all(storepath);
|
||||
fs::current_path("/opfs", tank);
|
||||
fs::remove_all(storepath, tank);
|
||||
fireEv("_continue", "Unable to write model ID", index);
|
||||
return;
|
||||
}
|
||||
|
||||
@@ -14,7 +14,8 @@ namespace fs = std::filesystem;
|
||||
struct genericModel {
|
||||
const std::string storepath{};
|
||||
const std::string id{};
|
||||
twiceThrd thrd{};
|
||||
reusableThrd thrd{};
|
||||
bool recognizerUsedThrd{};
|
||||
int index{};
|
||||
static bool extractModel();
|
||||
virtual bool checkModelFiles() = 0;
|
||||
|
||||
@@ -1,13 +1,5 @@
|
||||
#include "global.h"
|
||||
void throwJS(const char* msg, bool err) {
|
||||
EM_ASM({
|
||||
if($1) {
|
||||
throw Error(UTF8ToString($0));
|
||||
return;
|
||||
}
|
||||
throw UTF8ToString($0);
|
||||
},msg, err);
|
||||
}
|
||||
|
||||
void fireEv(const char *type, const char *content, int index) {
|
||||
static ProxyingQueue pq{};
|
||||
auto proxy{[index, type, content](){
|
||||
@@ -23,22 +15,26 @@ void fireEv(const char *type, const char *content, int index) {
|
||||
}
|
||||
int main() {
|
||||
std::thread t{[](){
|
||||
wasmfs_create_directory("/opfs", 0777, wasmfs_create_opfs_backend());
|
||||
OPFSOk = (wasmfs_create_directory("/opfs", 0777, wasmfs_create_opfs_backend()) == 0 ? true : false);
|
||||
}};
|
||||
t.detach();
|
||||
emscripten_exit_with_live_runtime();
|
||||
}
|
||||
void twiceThrd::setTask1(std::function<void()> task1) {
|
||||
blocker.lock();
|
||||
std::thread t{[this, task1](){
|
||||
task1();
|
||||
blocker.lock();
|
||||
task2();
|
||||
ProxyingQueue reusableThrd::pq{};
|
||||
reusableThrd::reusableThrd() {
|
||||
thrd = std::thread{[this](){
|
||||
while(!done.test()) {
|
||||
static ProxyingQueue pq{};
|
||||
pq.execute();
|
||||
blocker.wait(done.test(), std::memory_order_relaxed);
|
||||
}
|
||||
}};
|
||||
t.detach();
|
||||
thrd.detach();
|
||||
}
|
||||
void twiceThrd::setTask2(std::function<void()> task2) {
|
||||
this->task2 = task2;
|
||||
blocker.unlock();
|
||||
reusable = false;
|
||||
void reusableThrd::addTask(std::function<void()> task) {
|
||||
pq.proxyAsync(thrd.native_handle(), std::move(task));
|
||||
}
|
||||
reusableThrd::~reusableThrd() {
|
||||
done.test_and_set(std::memory_order_relaxed);
|
||||
done.notify_one();
|
||||
}
|
||||
|
||||
16
src/global.h
@@ -10,14 +10,16 @@ using namespace emscripten;
|
||||
|
||||
static pthread_t selfTID{pthread_self()};
|
||||
static std::error_code tank{};
|
||||
void throwJS(const char* msg, bool err = false);
|
||||
static bool OPFSOk{};
|
||||
void fireEv(const char *type, const char *content, int index);
|
||||
int main();
|
||||
|
||||
struct twiceThrd { // A minimal std::thread wrapper to run exactly 2 tasks
|
||||
bool reusable{true};
|
||||
std::mutex blocker{};
|
||||
std::function<void()> task2{};
|
||||
void setTask1(std::function<void()> task1);
|
||||
void setTask2(std::function<void()> task2);
|
||||
struct reusableThrd { // A minimal std::thread wrapper that runs queued tasks on one reusable thread
|
||||
static ProxyingQueue pq;
|
||||
std::thread thrd;
|
||||
std::atomic_flag blocker{};
|
||||
std::atomic_flag done{};
|
||||
reusableThrd();
|
||||
void addTask(std::function<void()> task);
|
||||
~reusableThrd();
|
||||
};
|
||||
|
||||
@@ -23,7 +23,7 @@ void model::load(bool newThrd) {
|
||||
main();
|
||||
return;
|
||||
}
|
||||
thrd.setTask1(main);
|
||||
thrd.addTask(main);
|
||||
}
|
||||
bool model::checkModelFiles() {
|
||||
return fs::exists("am/final.mdl", tank) &&
|
||||
|
||||
35
src/pre.js
@@ -11,11 +11,7 @@ Module.cleanUp = () => {
|
||||
class Recognizer extends EventTarget {
|
||||
constructor() {
|
||||
super()
|
||||
}
|
||||
_init(model, sampleRate) {
|
||||
this.obj = new Module.recognizer(model, sampleRate, objs.length)
|
||||
objs.push(this)
|
||||
this.ptr = Module._malloc(512)
|
||||
}
|
||||
async getNode(ctx, channelIndex = 0) {
|
||||
if(typeof this.node === "undefined") {
|
||||
@@ -59,9 +55,8 @@ class Recognizer extends EventTarget {
|
||||
}
|
||||
}
|
||||
class Model extends EventTarget {
|
||||
constructor(storepath, id) {
|
||||
constructor(d) {
|
||||
super()
|
||||
this.obj = new Module.model(storepath, id, objs.length)
|
||||
objs.push(this)
|
||||
}
|
||||
delete() {
|
||||
@@ -69,9 +64,8 @@ class Model extends EventTarget {
|
||||
}
|
||||
}
|
||||
class SpkModel extends EventTarget {
|
||||
constructor(storepath, id) {
|
||||
constructor() {
|
||||
super()
|
||||
this.obj = new Module.spkModel(storepath, id, objs.length)
|
||||
objs.push(this)
|
||||
}
|
||||
delete() {
|
||||
@@ -79,7 +73,7 @@ class SpkModel extends EventTarget {
|
||||
}
|
||||
}
|
||||
Module.makeModel = async (url, storepath, id) => {
|
||||
let mdl = new Model(storepath, id)
|
||||
let mdl = new Model()
|
||||
return new Promise((resolve, reject) => {
|
||||
mdl.addEventListener("_continue", (ev) => {
|
||||
if(ev.detail === ".") {
|
||||
@@ -88,6 +82,7 @@ Module.makeModel = async (url, storepath, id) => {
|
||||
mdl.delete()
|
||||
return reject(ev.detail)
|
||||
}, {once : true})
|
||||
mdl.obj = new Module.model(storepath, id, objs.length)
|
||||
if(mdl.obj.checkModel()) {
|
||||
mdl.obj.load(true)
|
||||
return;
|
||||
@@ -110,7 +105,7 @@ Module.makeModel = async (url, storepath, id) => {
|
||||
})
|
||||
}
|
||||
Module.makeSpkModel = async (url, storepath, id) => {
|
||||
let mdl = new SpkModel(storepath, id)
|
||||
let mdl = new SpkModel()
|
||||
return new Promise((resolve, reject) => {
|
||||
mdl.addEventListener("_continue", (ev) => {
|
||||
if(ev.detail === ".") {
|
||||
@@ -119,6 +114,7 @@ Module.makeSpkModel = async (url, storepath, id) => {
|
||||
mdl.delete()
|
||||
reject(ev.detail)
|
||||
}, {once : true})
|
||||
mdl.obj = new Module.spkModel(storepath, id, objs.length)
|
||||
if(mdl.obj.checkModel()) {
|
||||
mdl.obj.load(true)
|
||||
return
|
||||
@@ -128,16 +124,21 @@ Module.makeSpkModel = async (url, storepath, id) => {
|
||||
if(!res.ok) {
|
||||
return reject("Unable to download model")
|
||||
}
|
||||
let arr = await res.arrayBuffer()
|
||||
let mdlMem = Module._malloc(arr.byteLength) // Will free in C++
|
||||
Module.HEAP8.set(new Int8Array(arr), mdlMem)
|
||||
mdl.obj.afterFetch(mdlMem, arr.byteLength)
|
||||
let wStream = await (await (await navigator.storage.getDirectory()).getFileHandle("m0dEl.tar", {create : true})).createWritable()
|
||||
let tarReader = res.body.pipeThrough(dStream).getReader()
|
||||
while(true) {
|
||||
let readRes = await tarReader.read()
|
||||
if(!readRes.done) await wStream.write(readRes.value)
|
||||
else break
|
||||
}
|
||||
await wStream.close()
|
||||
mdl.obj.afterFetch()
|
||||
})()
|
||||
})
|
||||
}
|
||||
Module.makeRecognizer = (model, sampleRate) => {
|
||||
let rec = new Recognizer()
|
||||
let retval = new Promise((resolve, reject) => {
|
||||
return new Promise((resolve, reject) => {
|
||||
rec.addEventListener("_continue", (ev) => {
|
||||
if(ev.detail == ".") {
|
||||
objs.push(rec)
|
||||
@@ -146,9 +147,9 @@ Module.makeRecognizer = (model, sampleRate) => {
|
||||
rec.delete()
|
||||
reject(ev.detail)
|
||||
}, {once : true})
|
||||
rec.obj = new Module.recognizer(model, sampleRate, objs.length)
|
||||
rec.ptr = Module._malloc(512)
|
||||
})
|
||||
rec._init(model.obj, sampleRate)
|
||||
return retval
|
||||
}
|
||||
let processorUrl = URL.createObjectURL(new Blob(['(',
|
||||
(() => {
|
||||
|
||||
@@ -1,5 +1,9 @@
|
||||
#include "recognizer.h"
|
||||
recognizer::recognizer(model* mdl, float sampleRate, int index) : index(index) {
|
||||
if(!OPFSOk) {
|
||||
fireEv("_continue", "OPFS hasn't been initialized or not available", index);
|
||||
return;
|
||||
}
|
||||
auto main{[this, mdl, sampleRate](){
|
||||
rec = vosk_recognizer_new(mdl->mdl,sampleRate);
|
||||
if(rec == nullptr) {
|
||||
@@ -21,8 +25,9 @@ recognizer::recognizer(model* mdl, float sampleRate, int index) : index(index) {
|
||||
}
|
||||
}
|
||||
}};
|
||||
if(mdl->thrd.reusable) {
|
||||
mdl->thrd.setTask2(main);
|
||||
if(!mdl->recognizerUsedThrd) { // Only the first recognizer shares the model's thread
|
||||
mdl->thrd.addTask(main);
|
||||
mdl->recognizerUsedThrd = true;
|
||||
return;
|
||||
}
|
||||
std::thread t{main};
|
||||
|
||||
@@ -27,7 +27,7 @@ void spkModel::load(bool newThrd) {
|
||||
main();
|
||||
return;
|
||||
}
|
||||
thrd.setTask1(main);
|
||||
thrd.addTask(main);
|
||||
}
|
||||
bool spkModel::checkModelFiles() {
|
||||
return fs::exists("mfcc.conf", tank) &&
|
||||
|
||||
63
usage/README.md
Normal file
@@ -0,0 +1,63 @@
|
||||
# API interface
|
||||
## JS ```window``` object
|
||||
| Function signature | Description |
|
||||
|---|---|
|
||||
|```Promise<Module> loadBR()``` | Load Emscripten's Module |
|
||||
|
||||
## Shared interface
|
||||
| Function signature | Description |
|
||||
|---|---|
|
||||
| ```delete()``` | Delete this object, see [why](https://emscripten.org/docs/getting_started/FAQ.html#what-does-exiting-the-runtime-mean-why-don-t-atexit-s-run) this is necessary. |
|
||||
|
||||
## ```Module``` object
|
||||
| Function signature | Description |
|
||||
|---|---|
|
||||
| ```Promise<Model> makeModel(url: string, path: string, id: string)```<br><br>```Promise<SpkModel> makeSpkModel(url: string, path: string, id: string)``` | Make a ```Model``` or ```SpkModel```.<br>- If **path** contains valid model files and **id** is the same, nothing is fetched from **url**.<br>- If **path** doesn't contain valid model files, or it does but **id** is different, the model is fetched from **url** and stored with **id**. Model files must be directly under the model root folder, and a compressed model must be in .tgz format. |
|
||||
| ```Promise<Recognizer> makeRecognizer(model: Model, sampleRate: float)``` | Make a ```Recognizer```. It uses **model**'s thread if it is the first user of **model**; otherwise it uses a new thread. |
|
||||
| ```setLogLevel(lvl: int)``` | Set Vosk's log level (default: ```0```: Info) <br>```-2```: Error<br>```-1```: Warning<br>```1```: Verbose<br>```2```: More verbose<br>```3```: Debug |
|
||||
| ```revokeURLs()``` | Revoke the Blob URLs of pthread worker and worklet processor |
|
||||
| ```cleanUp()``` | A convenience function that calls ```revokeURLs()``` and ```delete()``` on all objects. You should put this at the end of your program! |
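
The call order implied by the table above can be sketched as follows. This is only an illustration, not the project's own example: `withRecognizer` and its option names are hypothetical, `Module` is assumed to be whatever ```loadBR()``` resolves to, and the `(url, path, id)` argument order follows `Module.makeModel = async (url, storepath, id)` in src/pre.js.

```javascript
// Hypothetical helper: build a model and a recognizer, hand the recognizer
// to `use`, then release everything with cleanUp().
// `Module` is assumed to be the object that loadBR() resolves to.
async function withRecognizer(Module, { url, path, id, sampleRate }, use) {
  try {
    // Argument order (url, path, id) follows src/pre.js.
    const model = await Module.makeModel(url, path, id);
    const recognizer = await Module.makeRecognizer(model, sampleRate);
    return await use(recognizer);
  } finally {
    // cleanUp() revokes the Blob URLs and calls delete() on every object,
    // so this helper suits a one-shot program only.
    Module.cleanUp();
  }
}
```

Because `cleanUp()` deletes every object, a long-running page would more likely keep the model and recognizer around and call `cleanUp()` once on shutdown.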
|
||||
|
||||
## ```Recognizer``` object
|
||||
| Function signature | Description |
|
||||
|---|---|
|
||||
| ```Promise<AudioWorkletNode> getNode(ctx: AudioContext, channelIndex = 0: int)``` | Get a pass-through node that recognizes audio and can be connected to a processing graph. It has 1 input and 1 output; **channelIndex** must point to a 16-bit mono channel of the input. |
|
||||
| ```recognize(buf: AudioBuffer, channelIndex = 0: int)``` | Recognize an AudioBuffer, usually obtained from something like ```BaseAudioContext.decodeAudioData()```; **channelIndex** must point to a 16-bit mono channel of **buf**. |
|
||||
| ```setPartialWords(partialWords: bool)``` | Return words' information in a partialResult event (default: false) |
|
||||
| ```setWords(words: bool)``` | Return words' information in a result event (default: false) |
|
||||
| ```setNLSML(nlsml: bool)``` | Return result and partialResult in NLSML form (default: false) |
|
||||
| ```setMaxAlternatives(alts: int)``` | Set the max number of alternatives for the result event (default: 0) |
|
||||
| ```setGrm(grm: string)``` | Add grammar to the recognizer (default: none) |
|
||||
| ```setSpkModel(mdl: SpkModel)``` | Set the speaker model of the recognizer (default: none) |
|
||||
|
||||
| Event | Description |
|
||||
|---|---|
|
||||
| ```partialResult``` | There is a partial recognition result; check the event's ```detail``` property |
|
||||
| ```result``` | There is a full recognition result; check the event's ```detail``` property |
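
Since `Recognizer` extends `EventTarget` (see src/pre.js), the two events can be consumed with ordinary listeners. The sketch below assumes the payload is exposed on the standard `detail` property, as it is for the internal `_continue` event handled in src/pre.js; `collectResults` is a hypothetical helper name.

```javascript
// Hypothetical helper: subscribe to both recognizer events.
// Full results are accumulated in the returned array; partial results are
// forwarded to the optional `onPartial` callback.
// Assumes the payload is on `ev.detail`.
function collectResults(recognizer, onPartial = () => {}) {
  const results = [];
  recognizer.addEventListener("partialResult", (ev) => onPartial(ev.detail));
  recognizer.addEventListener("result", (ev) => results.push(ev.detail));
  return results;
}
```

The returned array fills in as ```recognize()``` or the node from ```getNode()``` produces results, so it should be read after recognition has had time to run.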
|
||||
|
||||
# User agent notes
|
||||
## SharedArrayBuffer
|
||||
Browser-recognizer requires SharedArrayBuffer to share data between threads, so these response headers must be set:
|
||||
- ***Cross-Origin-Embedder-Policy*** ---> ***require-corp***
|
||||
- ***Cross-Origin-Opener-Policy*** ---> ***same-origin***
|
||||
|
||||
If you can't set them, you may use a HACKY workaround at *src/addCOI.js*.
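
Whether the two headers took effect can be checked at runtime before loading anything. The sketch below relies on the standard `crossOriginIsolated` global; the names `canUseSharedMemory` and `isolationHeaders` are hypothetical and not part of this project.

```javascript
// The two response headers listed above, as a plain object that a server
// could merge into its responses.
const isolationHeaders = {
  "Cross-Origin-Embedder-Policy": "require-corp",
  "Cross-Origin-Opener-Policy": "same-origin",
};

// Hypothetical helper: true when the page is cross-origin isolated and
// SharedArrayBuffer is exposed. `scope` defaults to the global object and
// is a parameter only so the check can be exercised outside a browser.
function canUseSharedMemory(scope = globalThis) {
  return scope.crossOriginIsolated === true &&
         typeof scope.SharedArrayBuffer === "function";
}
```

A page could call `canUseSharedMemory()` first and show a clear error, or fall back to the workaround mentioned above, when it returns `false`.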
|
||||
|
||||
## Origin Private Filesystem (OPFS)
|
||||
Browser-recognizer needs Emscripten WASMFS's OPFS backend to store its model. IDBFS was considered but dropped, because there is no direct way to read from IDBFS in C++ without copying to MEMFS (basically RAM). For safety, always:
|
||||
- Wrap ```window.loadBR()``` in a try/catch to check for OPFS availability.
|
||||
- Check via ```navigator.storage.estimate()``` that there is enough space for TWICE THE MODEL SIZE before calling ```Module.makeModel()```.
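
The space check from the list above might look like the sketch below. The helper names are hypothetical, and the model size in bytes is assumed to be known to the caller (for example from the download's Content-Length). The factor of two plausibly reflects that the downloaded archive and the extracted files sit in OPFS at the same time, though that reasoning is an assumption.

```javascript
// Hypothetical pure helper: does a StorageEstimate leave room for twice
// the model size? Missing fields are treated as 0.
function hasRoomForModel(estimate, modelBytes) {
  const free = (estimate.quota ?? 0) - (estimate.usage ?? 0);
  return free >= 2 * modelBytes;
}

// Browser-side wrapper around navigator.storage.estimate().
// `storage` is injectable only so the function can be tested with a stub.
async function checkModelSpace(modelBytes, storage = navigator.storage) {
  return hasRoomForModel(await storage.estimate(), modelBytes);
}
```

Note that `estimate()` returns approximate figures, so leaving some margin on top of the doubled size is a reasonable precaution.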
|
||||
|
||||
# Compilation
|
||||
Changing any option to a non-default value requires recompilation.
|
||||
```
|
||||
git clone --depth=1 https://github.com/msqr1/Browser-recognizer &&
|
||||
cd Browser-recognizer &&
|
||||
[Options] make
|
||||
```
|
||||
| Option | Description | Default value |
|
||||
|---|---|---|
|
||||
| MAX_MEMORY | Set max memory, valid suffixes: kb, mb, gb, tb or none (bytes) | ```300mb```, as [recommended](https://alphacephei.com/vosk/models) |
|
||||
| MAX_THREADS | Set the max number of threads (minimum 2) | ```2``` (1 OPFS thread + 1 model/recognizer thread) |
|
||||
| COMPILE_JOBS | Set the number of jobs (threads) when compiling | ```$(nproc)``` |
|
||||
| EMSDK | Set EMSDK's path (will install EMSDK in root folder if unset) | ```../emsdk``` |
|
||||