Remove single file deployment for streaming instantiation and smaller size.

msqr1
2024-08-28 19:15:56 -07:00
parent 01068a2c6c
commit 0166c153bd
20 changed files with 87 additions and 86 deletions
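
For context on the commit message: with `-sSINGLE_FILE` the WebAssembly binary is embedded in the generated JS, so it cannot be compiled while it downloads. A separate `Vosklet.wasm` makes streaming instantiation possible. The sketch below is not the loader Emscripten generates; `instantiateWasm` is a hypothetical helper that only illustrates the general pattern with the standard `WebAssembly` API, assuming the server sends `Content-Type: application/wasm`.

```javascript
// Hypothetical helper, not part of Vosklet or of the Emscripten output.
// Streams the module when the runtime and the response allow it,
// otherwise buffers the whole body first.
async function instantiateWasm(response, imports = {}) {
  const type = (response.headers.get("Content-Type") || "").toLowerCase();
  const canStream =
    typeof WebAssembly.instantiateStreaming === "function" &&
    type === "application/wasm";
  if (canStream) {
    // Compilation starts while bytes are still arriving.
    return WebAssembly.instantiateStreaming(response, imports);
  }
  // Fallback: download everything, then compile and instantiate.
  const bytes = await response.arrayBuffer();
  return WebAssembly.instantiate(bytes, imports);
}
```

In a page this would be called roughly as `instantiateWasm(await fetch("Vosklet.wasm"), imports)`, where `imports` stands for whatever the real module expects. In practice the generated `Vosklet.js` handles this step itself.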

3
.gitignore vendored

@@ -5,4 +5,5 @@ clapack-wasm
openfst
emsdk
index.html
test.js
test.js
test.wasm

8
API.md

@@ -9,7 +9,7 @@
|---|---|
| ```delete()``` | Delete this object (call C++ destructor), see [why](https://emscripten.org/docs/getting_started/FAQ.html#what-does-exiting-the-runtime-mean-why-don-t-atexit-s-run) this is necessary. For recognizers, make sure they have finished recognizing before deleting them. |
## ```Module``` object
## ```Module``` object
| Function/Object | Description |
|---|---|
| ```Promise<Model> createModel(url: string, path: string, id: string)```<br><br>```Promise<SpkModel> createSpkModel(url: string, path: string, id: string)``` | Create a ```Model``` or ```SpkModel```, model files must be directly under the model root, and compressed model must be in ```.tar.gz```/```.tgz``` format. Tar format must be USTAR. If:<br>- ```path``` contains valid model files and ```id``` is the same, there will not be a fetch from ```url```.<br>- ```path``` doesn't contain valid model files, or if it contains valid model files but ```id``` is different, there will be a fetch from ```url```, and the model is stored with ```id```. Models are thread-safe and reusable across recognizers. |
@@ -24,7 +24,7 @@
|---|---|
| ```int findWord(word: string)``` | Check if a word can be recognized by the model, return the word symbol if ```word``` exists inside the model or ```-1``` otherwise. Word symbol ```0``` is for epsilon. |
## ```Recognizer``` object
## ```Recognizer``` object
| Function/Object | Description |
|---|---|
| ```acceptWaveform(audioData: Float32Array)``` | Accept voice data in a ```Float32Array``` with elements from ```-1.0``` to ```1.0```. |
@@ -35,7 +35,7 @@
| ```setGrm(grm: string)``` | Reconfigures recognizer to use grammar |
| ```setSpkModel(mdl: SpkModel)``` | Adds speaker model to already initialized recognizer |
| ```setEndpointerMode(mode: EpMode)``` | Set endpointer scaling factor (default: ```ANSWER_DEFAULT```) |
| ```setEndpointerDelays(tStartMax: float, tEnd: float, tMax: float)``` | Set endpointer delays |
| ```setEndpointerDelays(tStartMax: float, tEnd: float, tMax: float)``` | Set endpointer delays |
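
`acceptWaveform` in the table above expects a `Float32Array` with samples between `-1.0` and `1.0`. When the audio arrives as 16-bit PCM instead, it has to be rescaled first. A minimal sketch, assuming signed 16-bit input; `int16ToFloat32` is a hypothetical helper and not part of the Vosklet API:

```javascript
// Hypothetical helper: rescale signed 16-bit PCM to floats in [-1.0, 1.0).
function int16ToFloat32(pcm) {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    // 32768 is the magnitude of the most negative int16 value,
    // so -32768 maps to exactly -1.0 and 32767 to just under 1.0.
    out[i] = pcm[i] / 32768;
  }
  return out;
}
```

The result could then be passed on as `recognizer.acceptWaveform(int16ToFloat32(samples))`, where `samples` is an `Int16Array` obtained elsewhere.
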
| Event | Description |
|---|---|
@@ -50,7 +50,7 @@ SharedArrayBuffer is necessary to share data between threads, so these response
If you can't set them, you may use a hacky workaround in *AddCOI.js*.
## CSP headers
Pthread worker construction must be from a blob (see [Emscripten issue](https://github.com/emscripten-core/emscripten/issues/21937)), so the CSP:
Pthread worker construction must be from a blob (see [Emscripten issue](https://github.com/emscripten-core/emscripten/issues/21937)), so the CSP:
- ```worker-src``` must include ```blob:```
## Model headers

File diff suppressed because one or more lines are too long

BIN
Examples/Vosklet.wasm Executable file

Binary file not shown.


@@ -1,7 +1,7 @@
<!DOCTYPE html>
<html>
<head>
<script src="https://cdn.jsdelivr.net/gh/msqr1/Vosklet@1.1.1/examples/Vosklet.min.js" async defer></script>
<script src="https://cdn.jsdelivr.net/gh/msqr1/Vosklet@1.1.1/Examples/Vosklet.min.js" async defer></script>
<script>
async function start() {
// Make sure sample rate matches that in the training data


@@ -1,7 +1,7 @@
<!DOCTYPE html>
<html>
<head>
<script src="https://cdn.jsdelivr.net/gh/msqr1/Vosklet@1.1.1/examples/Vosklet.min.js" async defer></script>
<script src="https://cdn.jsdelivr.net/gh/msqr1/Vosklet@1.1.1/Examples/Vosklet.min.js" async defer></script>
<script>
async function start() {
// Make sure sample rate matches that in the training data


@@ -173,4 +173,4 @@
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
END OF TERMS AND CONDITIONS

3
NOTICE

@@ -11,5 +11,4 @@ Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
limitations under the License.


@@ -9,7 +9,7 @@
- Support multiple models
- Has models' storage path management
- Has models' ID management (for model updates)
- Has smaller JS size (>3.1MB vs 1.2MB gzipped)
- Has smaller JS size (>3.1MB vs 904KB gzipped)
- Has all related files (pthread worker, audio worklet processor,...) merged
- Has faster processing time
- Has shorter from-scratch build time
@@ -18,12 +18,12 @@
# Basic usage (microphone recognition in English)
- Results are logged to the console.
- Copied from *examples/fromMic.html*
- **Note: The example folder and this piece of code use *examples/Vosklet.js* because I can't set the response headers for my model so that browsers decompress it correctly. Instead, I used DecompressionStream to decompress manually, so *examples/Vosklet.js* only works for the examples. In production, use the top-level Vosklet.js instead.**
- **Note: The example folder and this piece of code use *Examples/Vosklet.js* because I can't set the response headers for my model so that browsers decompress it correctly. Instead, I used DecompressionStream to decompress manually, so *Examples/Vosklet.js* only works for the examples. In production, use the top-level Vosklet.js instead.**
```html
<!DOCTYPE html>
<html>
<head>
<script src="https://cdn.jsdelivr.net/gh/msqr1/Vosklet@1.1.1/examples/Vosklet.min.js" async defer></script>
<script src="https://cdn.jsdelivr.net/gh/msqr1/Vosklet@1.1.1/Examples/Vosklet.min.js" async defer></script>
<script>
async function start() {
// Make sure sample rate matches that in the training data
@@ -44,7 +44,7 @@
let module = await loadVosklet()
let model = await module.createModel("https://ccoreilly.github.io/vosk-browser/models/vosk-model-small-en-us-0.15.tar.gz","model","ID")
let recognizer = await module.createRecognizer(model, 16000)
// Listen for result and partial result
recognizer.addEventListener("result", ev => {
console.log("Result: ", ev.detail)

File diff suppressed because one or more lines are too long

BIN
Vosklet.wasm Executable file

Binary file not shown.


@@ -6,7 +6,7 @@ using namespace emscripten;
EMSCRIPTEN_BINDINGS() {
function("setLogLevel", &vosk_set_log_level, allow_raw_pointers());
enum_<VoskEndpointerMode>("EpMode")
.value("ANSWER_DEFAULT", VOSK_EP_ANSWER_DEFAULT)
.value("ANSWER_SHORT", VOSK_EP_ANSWER_SHORT)
@@ -16,8 +16,8 @@ EMSCRIPTEN_BINDINGS() {
class_<CommonModel>("CommonModel")
.constructor<int, bool, std::string, std::string, int, int>(allow_raw_pointers())
.function("findWord", &CommonModel::findWord, allow_raw_pointers());
class_<Recognizer>("Recognizer")
class_<Recognizer>("Recognizer")
.constructor<int, float, CommonModel*>(allow_raw_pointers())
.constructor<int, float, CommonModel*, CommonModel*>(allow_raw_pointers())
.constructor<int, float, CommonModel*, std::string, int>(allow_raw_pointers())


@@ -1,9 +1,9 @@
#include "CommonModel.h"
CommonModel::CommonModel(int index, bool normalMdl, std::string storepath, std::string id, int tarStart, int tarSize) :
normalMdl{normalMdl}, index{index},
storepath{std::move(storepath)},
id{std::move(id)}
CommonModel::CommonModel(int index, bool normalMdl, std::string storepath, std::string id, int tarStart, int tarSize) :
normalMdl{normalMdl}, index{index},
storepath{std::move(storepath)},
id{std::move(id)}
{
globalPool.exec([this, tarStart, tarSize]{
extractAndLoad(reinterpret_cast<unsigned char*>(tarStart), tarSize);
@@ -22,12 +22,12 @@ void CommonModel::extractAndLoad(unsigned char* tar, int tarSize) {
case FailedOpen:
fireEv(index, "Untar: Unable to open file for write");
return;
case FailedWrite:
case FailedWrite:
fireEv(index, "Untar: Unable to write file");
return;
case FailedClose:
fireEv(index, "Untar: Unable to close file after write");
return;
return;
};
if(normalMdl) mdl = vosk_model_new(storepath.c_str());
else mdl = vosk_spk_model_new(storepath.c_str());


@@ -2,6 +2,7 @@
#include "Util.h"
#include <vosk_api.h>
struct CommonModel {
bool normalMdl;
int index;
@@ -12,5 +13,4 @@ struct CommonModel {
int findWord(std::string word);
CommonModel(int index, bool normalMdl, std::string storepath, std::string id, int tarStart, int tarSize);
~CommonModel();
};
};


@@ -1,18 +1,18 @@
#include "Recognizer.h"
#include "Recognizer.h"
#include "emscripten/atomic.h"
Recognizer::Recognizer(int index, float sampleRate, CommonModel* model) :
index{index},
Recognizer::Recognizer(int index, float sampleRate, CommonModel* model) :
index{index},
rec{vosk_recognizer_new(std::get<VoskModel*>(model->mdl), sampleRate)}
{
finishConstruction(model);
}
Recognizer::Recognizer(int index, float sampleRate, CommonModel* model, CommonModel* spkModel) :
Recognizer::Recognizer(int index, float sampleRate, CommonModel* model, CommonModel* spkModel) :
index{index},
rec{vosk_recognizer_new_spk(std::get<VoskModel*>(model->mdl), sampleRate, std::get<VoskSpkModel*>(spkModel->mdl))} {
finishConstruction(model, spkModel);
}
Recognizer::Recognizer(int index, float sampleRate, CommonModel* model, const std::string& grm, int) :
index{index},
Recognizer::Recognizer(int index, float sampleRate, CommonModel* model, const std::string& grm, int) :
index{index},
rec{vosk_recognizer_new_grm(std::get<VoskModel*>(model->mdl), sampleRate, grm.c_str())} {
finishConstruction(model);
}


@@ -13,7 +13,7 @@ int untar(unsigned char* tar, int tarSize, const std::string& storepath) {
path.reserve(100); // Max length
unsigned char* end = tar + tarSize;
while(tar <= end) {
if(tar[156] != '5' && tar[156] != 0 &&
if(tar[156] != '5' && tar[156] != 0 &&
tar[156] != '0') {
return IncorrectFiletype;
}
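
The check above reads byte 156 of each 512-byte header, which is the USTAR type flag: `'5'` marks a directory, `'0'` or NUL a regular file, and anything else is rejected as `IncorrectFiletype`. The same rule, restated in JavaScript as an illustration only (`classifyTarEntry` is a hypothetical name and does not exist in the repository):

```javascript
// Offset of the type flag inside a 512-byte USTAR header block.
const TYPEFLAG_OFFSET = 156;

// Hypothetical helper mirroring the C++ check: returns "directory",
// "file", or "unsupported" for one header block (a Uint8Array).
function classifyTarEntry(header) {
  const flag = header[TYPEFLAG_OFFSET];
  if (flag === 0x35) return "directory";          // ASCII '5'
  if (flag === 0x30 || flag === 0) return "file"; // ASCII '0' or NUL
  return "unsupported";                           // links, devices, etc.
}
```
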
@@ -64,7 +64,7 @@ void Thread::startup(ThreadPool* pool) {
emscripten_atomic_store_u32(&pool->qLock, false);
emscripten_atomic_notify(&pool->qLock, 1);
fn();
}
}
}
ThreadPool::ThreadPool() {
for(Thread& thrd : threads) {


@@ -8,7 +8,7 @@
#include <emscripten/atomic.h>
#include <emscripten/console.h>
namespace fs = std::filesystem;
struct AudioData {
float* data;
int len;


@@ -1,4 +1,4 @@
let objs = []
let objs = []
let processorURL = URL.createObjectURL(new Blob(['(', (() => {
registerProcessor("VoskletTransferer", class extends AudioWorkletProcessor {
constructor(opts) {
@@ -28,8 +28,8 @@ Module.cleanUp = () => {
Module.createTransferer = async (ctx, bufferSize) => {
await ctx.audioWorklet.addModule(processorURL)
return new AudioWorkletNode(ctx, "VoskletTransferer", {
channelCountMode : "explicit",
return new AudioWorkletNode(ctx, "VoskletTransferer", {
channelCountMode : "explicit",
numberOfInputs : 1,
numberOfOutputs : 0,
channelCount : 1,
@@ -37,7 +37,7 @@ Module.createTransferer = async (ctx, bufferSize) => {
})
}
async function getFileHandle(path, create = false) {
getFileHandle = async (path, create = false) => {
let components = path.split("/")
let prevDir = await navigator.storage.getDirectory()
for(let component of components.slice(0, -1)) prevDir = await prevDir.getDirectoryHandle(component, { create : create })
@@ -65,7 +65,7 @@ class CommonModel extends EventTarget {
let dataFile = await (await getFileHandle(storepath + "/model.tgz")).getFile()
let idFile = await (await getFileHandle(storepath + "/id")).getFile()
if(await idFile.text() != id) throw ""
tar = await new Response(dataFile.stream().pipeThrough(new DecompressionStream("gzip"))).arrayBuffer()
tar = await new Response(dataFile.stream().pipeThrough(new DecompressionStream("gzip"))).arrayBuffer()
}
catch {
try {
@@ -105,7 +105,7 @@ class Recognizer extends EventTarget {
super()
objs.push(this)
return new Proxy(this, {
get(self, prop, receiver) {
get(self, prop, _) {
return self.obj && Object.keys(Object.getPrototypeOf(self.obj)).includes(prop) ? self.obj[prop].bind(self.obj) : self[prop] ? self[prop].bind ? self[prop].bind(self) : self[prop] : undefined
}
})
@@ -123,13 +123,13 @@ class Recognizer extends EventTarget {
rec.obj = new Module.Recognizer(objs.length - 1, sampleRate, model)
break
case 2:
rec.obj = new Module.Recognizer(objs.length -1, sampleRate, model, spkModel)
rec.obj = new Module.Recognizer(objs.length -1, sampleRate, model, spkModel)
break
default:
rec.obj = new Module.Recognizer(objs.length - 1, sampleRate, model, grammar, 0)
rec.obj = new Module.Recognizer(objs.length - 1, sampleRate, model, grammar, 0)
}
return result
}
}
acceptWaveform(audioData) {
let start = Module._malloc(audioData.length * 4)
Module.HEAPF32.set(audioData, start / 4)


@@ -4,15 +4,15 @@ MAX_THREADS=${MAX_THREADS:-1}
EMSDK=${EMSDK:-../emsdk}
JOBS=${JOBS:-$(nproc)}
if [ $EMSDK != ../emsdk ] && [ ! -d $EMSDK ]; then
if [ "$EMSDK" != ../emsdk ] && [ ! -d "$EMSDK" ]; then
echo "Invalid emsdk path"
exit 1
fi
if [ $MAX_THREADS -lt 1 ]; then
fi
if [ "$MAX_THREADS" -lt 1 ]; then
echo "MAX_THREADS must be greater than or equal to 1"
exit 1
fi
if [ $JOBS -lt 1 ]; then
if [ "$JOBS" -lt 1 ]; then
echo "JOBS must be greater than or equal to 1"
exit 1
fi
@@ -20,15 +20,15 @@ if ! [[ $INITIAL_MEMORY =~ ^[0-9]+([kmgt]b)?$ ]]; then
echo "INITIAL_MEMORY valid suffixes are kb, mb, gb, tb, none (bytes)"
exit 1
fi
if [ $EMSDK = ../emsdk ] && [ ! -d $EMSDK ]; then
if [ "$EMSDK" = ../emsdk ] && [ ! -d "$EMSDK" ]; then
echo "Installing emsdk + Emscripten..."
git clone --depth=1 https://github.com/emscripten-core/emsdk.git ../emsdk &&
cd ../emsdk &&
./emsdk install 3.1.65 &&
./emsdk activate 3.1.65
fi
. $(realpath $EMSDK)/emsdk_env.sh &&
export PATH=:$PATH:$(realpath $EMSDK)/upstream/bin &&
. $(realpath "$EMSDK")/emsdk_env.sh &&
export PATH=:$PATH:$(realpath "$EMSDK")/upstream/bin &&
cd .. &&
SRC=$(realpath src)
@@ -37,47 +37,48 @@ VOSK=$(realpath vosk)
OPENFST=$(realpath openfst)
CLAPACK_WASM=$(realpath clapack-wasm)
if [ ! -d $OPENFST ]; then
if [ ! -d "$OPENFST" ]; then
rm -rf /tmp/openfst &&
git clone --depth=1 https://github.com/alphacep/openfst /tmp/openfst &&
cd /tmp/openfst &&
git apply $SRC/Openfst.patch
git apply "$SRC"/Openfst.patch
autoreconf -is &&
CXXFLAGS="-r -O3 -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals" LDFLAGS="-O3 -flto" emconfigure ./configure --prefix=$OPENFST --enable-static --disable-shared --enable-lookahead-fsts --enable-ngram-fsts --disable-bin &&
emmake make -j$JOBS install &&
echo "PACKAGE_VERSION = 1.8.0" >> $OPENFST/Makefile
CXXFLAGS="-r -O3 -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals" LDFLAGS="-O3 -flto" emconfigure ./configure --prefix="$OPENFST" --enable-static --disable-shared --enable-lookahead-fsts --enable-ngram-fsts --disable-bin &&
emmake make -j"$JOBS" install &&
echo "PACKAGE_VERSION = 1.8.0" >> "$OPENFST"/Makefile
fi
if [ ! -d $CLAPACK_WASM ]; then
git clone --depth=1 https://gitlab.inria.fr/multispeech/kaldi.web/clapack-wasm.git $CLAPACK_WASM &&
cd $CLAPACK_WASM &&
git apply $SRC/Clapack-wasm.patch &&
if [ ! -d "$CLAPACK_WASM" ]; then
git clone --depth=1 https://gitlab.inria.fr/multispeech/kaldi.web/clapack-wasm.git "$CLAPACK_WASM" &&
cd "$CLAPACK_WASM" &&
git apply "$SRC"/Clapack-wasm.patch &&
bash install_repo.sh emcc
fi
if [ ! -d $KALDI ]; then
git clone -b vosk --depth=1 https://github.com/alphacep/kaldi $KALDI &&
cd $KALDI/src &&
git apply $SRC/Kaldi.patch &&
CXXFLAGS="-O3 -UHAVE_EXECINFO_H -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals -Wno-unused-variable -Wno-unused-but-set-variable -g0" LDFLAGS="-O3 -lembind -flto -g0" emconfigure ./configure --use-cuda=no --with-cudadecoder=no --static --static-math=yes --static-fst=yes --debug-level=0 --fst-root=$OPENFST --clapack-root=$CLAPACK_WASM --host=WASM &&
emmake make -j$JOBS online2 rnnlm
if [ ! -d "$KALDI" ]; then
git clone -b vosk --depth=1 https://github.com/alphacep/kaldi "$KALDI" &&
cd "$KALDI"/src &&
git apply "$SRC"/Kaldi.patch &&
CXXFLAGS="-O3 -UHAVE_EXECINFO_H -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals -Wno-unused-variable -Wno-unused-but-set-variable -g0" LDFLAGS="-O3 -lembind -flto -g0" emconfigure ./configure --use-cuda=no --with-cudadecoder=no --static --static-math=yes --static-fst=yes --debug-level=0 --fst-root="$OPENFST" --clapack-root="$CLAPACK_WASM" --host=WASM &&
emmake make -j"$JOBS" online2 rnnlm
fi
if [ ! -d $VOSK ]; then
git clone -b v0.3.50 --depth=1 https://github.com/alphacep/vosk-api $VOSK &&
cd $VOSK/src &&
git apply $SRC/Vosk.patch &&
if [ ! -d "$VOSK" ]; then
git clone -b v0.3.50 --depth=1 https://github.com/alphacep/vosk-api "$VOSK" &&
cd "$VOSK"/src &&
git apply "$SRC"/Vosk.patch &&
VOSK_FILES="Recognizer.cc language_model.cc model.cc spk_model.cc vosk_api.cc" &&
em++ -O3 -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals -Wno-deprecated -I. -I$KALDI/src -I$OPENFST/include $VOSK_FILES -c &&
emar -rcs vosk.a ${VOSK_FILES//.cc/.o}
em++ -O3 -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals -Wno-deprecated -I. -I"$KALDI"/src -I"$OPENFST"/include $VOSK_FILES -c &&
emar -rcs vosk.a ${VOSK_FILES//.cc/.o}
fi
cd $SRC &&
em++ Util.cc CommonModel.cc Recognizer.cc Bindings.cc -O3 -Wno-pthreads-mem-growth -DEMSCRIPTEN_HAS_UNBOUND_TYPE_NAMES=0 -fno-rtti -DMAX_THREADS=$MAX_THREADS -sWASMFS -sWASM_BIGINT -sSINGLE_FILE -sMODULARIZE -sEMBIND_STD_STRING_IS_UTF8 -sPTHREAD_POOL_DELAY_LOAD -sTEXTDECODER=2 -sPTHREAD_POOL_SIZE_STRICT=2 -sINITIAL_MEMORY=$INITIAL_MEMORY -sALLOW_MEMORY_GROWTH -sPTHREAD_POOL_SIZE=$MAX_THREADS -sPOLYFILL=0 -sEXIT_RUNTIME=0 -sINVOKE_RUN=0 -sSUPPORT_LONGJMP=0 -sALLOW_BLOCKING_ON_MAIN_THREAD=0 -sEXPORTED_FUNCTIONS=_malloc -sEXPORT_NAME=loadVosklet -sMALLOC=emmalloc -sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack -sENVIRONMENT=web,worker -I. -I$VOSK/src -L$KALDI/src -l:online2/kaldi-online2.a -l:decoder/kaldi-decoder.a -l:ivector/kaldi-ivector.a -l:gmm/kaldi-gmm.a -l:tree/kaldi-tree.a -l:feat/kaldi-feat.a -l:cudamatrix/kaldi-cudamatrix.a -l:lat/kaldi-lat.a -l:lm/kaldi-lm.a -l:rnnlm/kaldi-rnnlm.a -l:hmm/kaldi-hmm.a -l:nnet3/kaldi-nnet3.a -l:transform/kaldi-transform.a -l:matrix/kaldi-matrix.a -l:fstext/kaldi-fstext.a -l:util/kaldi-util.a -l:base/kaldi-base.a -L$OPENFST/lib -l:libfst.a -l:libfstngram.a -L$CLAPACK_WASM -l:CBLAS/lib/cblas.a -l:CLAPACK-3.2.1/lapack.a -l:CLAPACK-3.2.1/libcblaswr.a -l:f2c_BLAS-3.8.0/blas.a -l:libf2c/libf2c.a -L$VOSK/src -l:vosk.a -lembind -pthread -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals --pre-js Wrapper.js -o ../Vosklet.js &&
cd "$SRC" &&
em++ Util.cc CommonModel.cc Recognizer.cc Bindings.cc -O3 -Wno-pthreads-mem-growth -DEMSCRIPTEN_HAS_UNBOUND_TYPE_NAMES=0 -fno-rtti -DMAX_THREADS="$MAX_THREADS" -sWASMFS -sWASM_BIGINT -sMODULARIZE -sEMBIND_STD_STRING_IS_UTF8 -sPTHREAD_POOL_DELAY_LOAD -sTEXTDECODER=2 -sPTHREAD_POOL_SIZE_STRICT=2 -sINITIAL_MEMORY="$INITIAL_MEMORY" -sALLOW_MEMORY_GROWTH -sPTHREAD_POOL_SIZE="$MAX_THREADS" -sPOLYFILL=0 -sEXIT_RUNTIME=0 -sINVOKE_RUN=0 -sSUPPORT_LONGJMP=0 -sALLOW_BLOCKING_ON_MAIN_THREAD=0 -sEXPORTED_FUNCTIONS=_malloc -sEXPORT_NAME=loadVosklet -sMALLOC=emmalloc -sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack -sENVIRONMENT=web,worker -I. -I"$VOSK"/src -L"$KALDI"/src -l:online2/kaldi-online2.a -l:decoder/kaldi-decoder.a -l:ivector/kaldi-ivector.a -l:gmm/kaldi-gmm.a -l:tree/kaldi-tree.a -l:feat/kaldi-feat.a -l:cudamatrix/kaldi-cudamatrix.a -l:lat/kaldi-lat.a -l:lm/kaldi-lm.a -l:rnnlm/kaldi-rnnlm.a -l:hmm/kaldi-hmm.a -l:nnet3/kaldi-nnet3.a -l:transform/kaldi-transform.a -l:matrix/kaldi-matrix.a -l:fstext/kaldi-fstext.a -l:util/kaldi-util.a -l:base/kaldi-base.a -L"$OPENFST"/lib -l:libfst.a -l:libfstngram.a -L"$CLAPACK_WASM" -l:CBLAS/lib/cblas.a -l:CLAPACK-3.2.1/lapack.a -l:CLAPACK-3.2.1/libcblaswr.a -l:f2c_BLAS-3.8.0/blas.a -l:libf2c/libf2c.a -L"$VOSK"/src -l:vosk.a -lembind -pthread -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals --pre-js Wrapper.js -o ../Vosklet.js &&
cd .. &&
rm -f Vosklet.worker.js
cp Vosklet.js Examples/Vosklet.js &&
cp Vosklet.wasm Examples/Vosklet.wasm &&
# Can't serve files from raw.githubusercontent with Content-Encoding: gzip header so the browser won't decompress automatically. Manually decompressing instead.
sed -i 's/res.body/new Response(res.body.pipeThrough(new DecompressionStream("gzip"))).body/' Examples/Vosklet.js &&

22
test

@@ -4,15 +4,15 @@ MAX_THREADS=${MAX_THREADS:-1}
EMSDK=${EMSDK:-emsdk}
JOBS=${JOBS:-$(nproc)}
if [ $EMSDK != emsdk ] && [ ! -d $EMSDK ]; then
if [ "$EMSDK" != emsdk ] && [ ! -d "$EMSDK" ]; then
echo "Invalid emsdk path"
exit 1
fi
if [ $MAX_THREADS -lt 1 ]; then
fi
if [ "$MAX_THREADS" -lt 1 ]; then
echo "MAX_THREADS must be greater than or equal to 1"
exit 1
fi
if [ $JOBS -lt 1 ]; then
if [ "$JOBS" -lt 1 ]; then
echo "JOBS must be greater than or equal to 1"
exit 1
fi
@@ -20,15 +20,15 @@ if ! [[ $INITIAL_MEMORY =~ ^[0-9]+([kmgt]b)?$ ]]; then
echo "INITIAL_MEMORY valid suffixes are kb, mb, gb, tb, none (bytes)"
exit 1
fi
if [ $EMSDK = emsdk ] && [ ! -d $EMSDK ]; then
if [ "$EMSDK" = emsdk ] && [ ! -d "$EMSDK" ]; then
echo "Installing emsdk + Emscripten..."
git clone --depth=1 https://github.com/emscripten-core/emsdk.git ../emsdk &&
cd ../emsdk &&
./emsdk install 3.1.65 &&
./emsdk activate 3.1.65
fi
. $(realpath $EMSDK)/emsdk_env.sh &&
export PATH=:$PATH:$(realpath $EMSDK)/upstream/bin
. $(realpath "$EMSDK")/emsdk_env.sh &&
export PATH=:$PATH:$(realpath "$EMSDK")/upstream/bin
KALDI=$(realpath kaldi)
VOSK=$(realpath vosk)
@@ -38,10 +38,10 @@ CLAPACK_WASM=$(realpath clapack-wasm)
cd src &&
MODE=1 && # 0: Ultra debug info, 1: Optimized release, else custom
echo "Mode = $MODE" &&
if [ $MODE = 0 ]; then
em++ Util.cc CommonModel.cc Recognizer.cc Bindings.cc -O0 -Wno-pthreads-mem-growth -DEMSCRIPTEN_HAS_UNBOUND_TYPE_NAMES=0 -fno-rtti -DMAX_THREADS=$MAX_THREADS -Wall -Werror -Wno-pthreads-mem-growth -sWASMFS -sWASM_BIGINT -sSINGLE_FILE -sMODULARIZE -sEMBIND_STD_STRING_IS_UTF8 -sPTHREAD_POOL_DELAY_LOAD -sRUNTIME_DEBUG -sALLOW_MEMORY_GROWTH -sSTACK_OVERFLOW_CHECK=2 -sTEXTDECODER=2 -sPTHREAD_POOL_SIZE_STRICT=2 -sASSERTIONS=2 -sINITIAL_MEMORY=$INITIAL_MEMORY -sPTHREAD_POOL_SIZE=$MAX_THREADS -sDISABLE_EXCEPTION_CATCHING=0 -sEXIT_RUNTIME=0 -sINVOKE_RUN=0 -sPOLYFILL=0 -sALLOW_BLOCKING_ON_MAIN_THREAD=0 -sEXPORTED_FUNCTIONS=_malloc -sEXPORT_NAME=loadVosklet -sMALLOC=emmalloc -sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack -sENVIRONMENT=web,worker -I. -I$VOSK/src -L$KALDI/src -l:online2/kaldi-online2.a -l:decoder/kaldi-decoder.a -l:ivector/kaldi-ivector.a -l:gmm/kaldi-gmm.a -l:tree/kaldi-tree.a -l:feat/kaldi-feat.a -l:cudamatrix/kaldi-cudamatrix.a -l:lat/kaldi-lat.a -l:lm/kaldi-lm.a -l:rnnlm/kaldi-rnnlm.a -l:hmm/kaldi-hmm.a -l:nnet3/kaldi-nnet3.a -l:transform/kaldi-transform.a -l:matrix/kaldi-matrix.a -l:fstext/kaldi-fstext.a -l:util/kaldi-util.a -l:base/kaldi-base.a -L$OPENFST/lib -l:libfst.a -l:libfstngram.a -L$CLAPACK_WASM -l:CBLAS/lib/cblas.a -l:CLAPACK-3.2.1/lapack.a -l:CLAPACK-3.2.1/libcblaswr.a -l:f2c_BLAS-3.8.0/blas.a -l:libf2c/libf2c.a -L$VOSK/src -l:vosk.a -lembind -pthread -flto -fsanitize=undefined -fsanitize=address -fsanitize=leak -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals -g3 --pre-js Wrapper.js -o ../test.js
elif [ $MODE = 1 ]; then
em++ Util.cc CommonModel.cc Recognizer.cc Bindings.cc -O3 -Wno-pthreads-mem-growth -DEMSCRIPTEN_HAS_UNBOUND_TYPE_NAMES=0 -fno-rtti -DMAX_THREADS=$MAX_THREADS -sWASMFS -sWASM_BIGINT -sSINGLE_FILE -sMODULARIZE -sEMBIND_STD_STRING_IS_UTF8 -sPTHREAD_POOL_DELAY_LOAD -sTEXTDECODER=2 -sPTHREAD_POOL_SIZE_STRICT=2 -sINITIAL_MEMORY=$INITIAL_MEMORY -sALLOW_MEMORY_GROWTH -sPTHREAD_POOL_SIZE=$MAX_THREADS -sPOLYFILL=0 -sEXIT_RUNTIME=0 -sINVOKE_RUN=0 -sSUPPORT_LONGJMP=0 -sALLOW_BLOCKING_ON_MAIN_THREAD=0 -sEXPORTED_FUNCTIONS=_malloc -sEXPORT_NAME=loadVosklet -sMALLOC=emmalloc -sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack -sENVIRONMENT=web,worker -I. -I$VOSK/src -L$KALDI/src -l:online2/kaldi-online2.a -l:decoder/kaldi-decoder.a -l:ivector/kaldi-ivector.a -l:gmm/kaldi-gmm.a -l:tree/kaldi-tree.a -l:feat/kaldi-feat.a -l:cudamatrix/kaldi-cudamatrix.a -l:lat/kaldi-lat.a -l:lm/kaldi-lm.a -l:rnnlm/kaldi-rnnlm.a -l:hmm/kaldi-hmm.a -l:nnet3/kaldi-nnet3.a -l:transform/kaldi-transform.a -l:matrix/kaldi-matrix.a -l:fstext/kaldi-fstext.a -l:util/kaldi-util.a -l:base/kaldi-base.a -L$OPENFST/lib -l:libfst.a -l:libfstngram.a -L$CLAPACK_WASM -l:CBLAS/lib/cblas.a -l:CLAPACK-3.2.1/lapack.a -l:CLAPACK-3.2.1/libcblaswr.a -l:f2c_BLAS-3.8.0/blas.a -l:libf2c/libf2c.a -L$VOSK/src -l:vosk.a -lembind -pthread -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals --pre-js Wrapper.js -o ../test.js
if [ "$MODE" = 0 ]; then
em++ Util.cc CommonModel.cc Recognizer.cc Bindings.cc -O0 -Wno-pthreads-mem-growth -DEMSCRIPTEN_HAS_UNBOUND_TYPE_NAMES=0 -fno-rtti -DMAX_THREADS="$MAX_THREADS" -Wall -Werror -Wno-pthreads-mem-growth -sWASMFS -sWASM_BIGINT -sMODULARIZE -sEMBIND_STD_STRING_IS_UTF8 -sPTHREAD_POOL_DELAY_LOAD -sRUNTIME_DEBUG -sALLOW_MEMORY_GROWTH -sSTACK_OVERFLOW_CHECK=2 -sTEXTDECODER=2 -sPTHREAD_POOL_SIZE_STRICT=2 -sASSERTIONS=2 -sINITIAL_MEMORY="$INITIAL_MEMORY" -sPTHREAD_POOL_SIZE="$MAX_THREADS" -sDISABLE_EXCEPTION_CATCHING=0 -sEXIT_RUNTIME=0 -sINVOKE_RUN=0 -sPOLYFILL=0 -sALLOW_BLOCKING_ON_MAIN_THREAD=0 -sEXPORTED_FUNCTIONS=_malloc -sEXPORT_NAME=loadVosklet -sMALLOC=emmalloc -sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack -sENVIRONMENT=web,worker -I. -I"$VOSK"/src -L"$KALDI"/src -l:online2/kaldi-online2.a -l:decoder/kaldi-decoder.a -l:ivector/kaldi-ivector.a -l:gmm/kaldi-gmm.a -l:tree/kaldi-tree.a -l:feat/kaldi-feat.a -l:cudamatrix/kaldi-cudamatrix.a -l:lat/kaldi-lat.a -l:lm/kaldi-lm.a -l:rnnlm/kaldi-rnnlm.a -l:hmm/kaldi-hmm.a -l:nnet3/kaldi-nnet3.a -l:transform/kaldi-transform.a -l:matrix/kaldi-matrix.a -l:fstext/kaldi-fstext.a -l:util/kaldi-util.a -l:base/kaldi-base.a -L"$OPENFST"/lib -l:libfst.a -l:libfstngram.a -L"$CLAPACK_WASM" -l:CBLAS/lib/cblas.a -l:CLAPACK-3.2.1/lapack.a -l:CLAPACK-3.2.1/libcblaswr.a -l:f2c_BLAS-3.8.0/blas.a -l:libf2c/libf2c.a -L"$VOSK"/src -l:vosk.a -lembind -pthread -flto -fsanitize=undefined -fsanitize=address -fsanitize=leak -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals -g3 --pre-js Wrapper.js -o ../test.js
elif [ "$MODE" = 1 ]; then
em++ Util.cc CommonModel.cc Recognizer.cc Bindings.cc -O3 -Wno-pthreads-mem-growth -DEMSCRIPTEN_HAS_UNBOUND_TYPE_NAMES=0 -fno-rtti -DMAX_THREADS="$MAX_THREADS" -sWASMFS -sWASM_BIGINT -sMODULARIZE -sEMBIND_STD_STRING_IS_UTF8 -sPTHREAD_POOL_DELAY_LOAD -sTEXTDECODER=2 -sPTHREAD_POOL_SIZE_STRICT=2 -sINITIAL_MEMORY="$INITIAL_MEMORY" -sALLOW_MEMORY_GROWTH -sPTHREAD_POOL_SIZE="$MAX_THREADS" -sPOLYFILL=0 -sEXIT_RUNTIME=0 -sINVOKE_RUN=0 -sSUPPORT_LONGJMP=0 -sALLOW_BLOCKING_ON_MAIN_THREAD=0 -sEXPORTED_FUNCTIONS=_malloc -sEXPORT_NAME=loadVosklet -sMALLOC=emmalloc -sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack -sENVIRONMENT=web,worker -I. -I"$VOSK"/src -L"$KALDI"/src -l:online2/kaldi-online2.a -l:decoder/kaldi-decoder.a -l:ivector/kaldi-ivector.a -l:gmm/kaldi-gmm.a -l:tree/kaldi-tree.a -l:feat/kaldi-feat.a -l:cudamatrix/kaldi-cudamatrix.a -l:lat/kaldi-lat.a -l:lm/kaldi-lm.a -l:rnnlm/kaldi-rnnlm.a -l:hmm/kaldi-hmm.a -l:nnet3/kaldi-nnet3.a -l:transform/kaldi-transform.a -l:matrix/kaldi-matrix.a -l:fstext/kaldi-fstext.a -l:util/kaldi-util.a -l:base/kaldi-base.a -L"$OPENFST"/lib -l:libfst.a -l:libfstngram.a -L"$CLAPACK_WASM" -l:CBLAS/lib/cblas.a -l:CLAPACK-3.2.1/lapack.a -l:CLAPACK-3.2.1/libcblaswr.a -l:f2c_BLAS-3.8.0/blas.a -l:libf2c/libf2c.a -L"$VOSK"/src -l:vosk.a -lembind -pthread -flto -msimd128 -mreference-types -mnontrapping-fptoint -mextended-const -msign-ext -mmutable-globals --pre-js Wrapper.js -o ../test.js
else
:
fi
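
Both scripts validate `INITIAL_MEMORY` against `^[0-9]+([kmgt]b)?$`, that is, a decimal number with an optional lowercase `kb`, `mb`, `gb`, or `tb` suffix. The sketch below applies the same pattern in JavaScript and converts the value to bytes. `parseMemorySize` is a hypothetical helper, and treating the suffixes as powers of 1024 is an assumption, not something the scripts themselves state.

```javascript
// Assumed multipliers: powers of 1024 (not confirmed by the build scripts).
const UNIT = { "": 1, kb: 1024, mb: 1024 ** 2, gb: 1024 ** 3, tb: 1024 ** 4 };

// Hypothetical helper: returns the size in bytes, or null when the text
// does not match the pattern the build scripts accept.
function parseMemorySize(text) {
  const match = /^([0-9]+)([kmgt]b)?$/.exec(text);
  if (match === null) return null;
  return Number(match[1]) * UNIT[match[2] ?? ""];
}
```

Under that assumption `parseMemorySize("64mb")` gives `67108864`, while inputs such as `"64MB"` or `"1.5gb"` are rejected, just as the scripts would exit with an error for them.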