# #6 WebAssembly and C++: Bridging native code and asynchronous JavaScript

This post is part of a [WebAssembly series](/tags/wasmcpp) focused on WASM and C++. The goal is to gain a thorough understanding of how WebAssembly works, how to use it as a compilation target for C++ code and hopefully have fun along the way. So, stick with me for this exciting journey.

## Recap

In the previous post, I compiled the Lua interpreter to WASM using emscripten and successfully ran it using node. There was a problem running the same code in the browser, as blocking IO is not possible there. Today I'm gonna try to address this issue and run the Lua interpreter in the browser. There are links to compiled WASM demos in this post, so you can test the code yourself.

## Problem definition

The Lua interpreter is a REPL running in a tight `while` loop which is blocked on a call to `fgets` most of the time - synchronously waiting for input. Looking at the source code, there's a `pushline` function:

```C
static int pushline (lua_State *L, int firstline) {
  char buffer[LUA_MAXINPUT];
  char *b = buffer;
  size_t l;
  const char *prmt = get_prompt(L, firstline);
  int readstatus = lua_readline(L, b, prmt);
  if (readstatus == 0)
    return 0;  /* no input (prompt will be popped by caller) */
  ...
  return 1;
}
```

This function is called in `loadline` and the latter is called in a tight `while` loop:

```C
static void doREPL (lua_State *L) {
  int status;
  const char *oldprogname = progname;
  progname = NULL;  /* no 'progname' on errors in interactive mode */
  lua_initreadline(L);
  while ((status = loadline(L)) != -1) {
    if (status == LUA_OK)
      status = docall(L, 0, LUA_MULTRET);
    if (status == LUA_OK) l_print(L);
    else report(L, status);
  }
  lua_settop(L, 0);  /* clear stack */
  lua_writeline();
  progname = oldprogname;
}
```

`lua_readline` is a macro which results in a call to `fgets` or `readline` - depending on the platform.

**To port Lua to `WASM` and be able to run it in the browser, this synchronous wait has to be replaced with either polling or an asynchronous approach.**

## Limitations

When working with WASM and JavaScript, the most fundamental principle is that **we can't block JavaScript's thread**. WASM functions called on the main thread can't run busy loops without freezing the page. Cooperative scheduling has to be implemented in the WASM module, which means that when you want to wait for something, control has to be relinquished back to JavaScript.

## Experiments

Let's put Lua aside for a moment and focus on the mechanics and interoperability between WASM and JavaScript to better formulate the approach. I'm gonna start with a toy project containing a model of Lua's REPL. Once I'm able to run it in the browser, the approach to porting Lua (or, in fact, anything else) will be obvious.

```C++
#include <cstddef>
#include <cstdio>

int main(int argc, const char *argv[]) {
  printf("I'm an echo\n");

  constexpr std::size_t MAXBUF = 1024;
  char buffer[MAXBUF];

  while (true) {
    fgets(buffer, MAXBUF, stdin);
    printf("Echo: %s\n", buffer);
  }

  return 0;
}
```

The above is a simplified model of a REPL. There's a blocking wait on `fgets` for a new line of input, followed by the "processing", which in this case just prints the input back.

I'm gonna define a simple `Makefile` to help build this code:

```Makefile
.PHONY: clean

all: repl.js

repl.js: repl.cpp
	$(CXX) $(CXXFLAGS) -o $@ $< -sWASM=1

clean:
	rm -fv repl.js repl.wasm
```

With the `Makefile` in hand, it's possible to compile `repl.cpp` to WASM with just

```console
$ emmake make
```

Let's create a simple `index.html` file to load the module.

```html

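<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>REPL</title>
    <style>
      textarea { width: 100%; height: 10em; }
    </style>
  </head>
  <body>
    <!-- the ids have to match the getElementById() calls in the scripts -->
    <label for="output">Output:</label>
    <textarea id="output" readonly></textarea>
    <label for="input">Input:</label>
    <textarea id="input"></textarea>
    <script src="repl.js"></script>
  </body>
</html>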

```

Don't mind the rudimentary styling, it's just to make the text fields bigger.

![browser](/wasm/wasm_cpp_06/app_model.png)

Pointing the browser to this document (remember to serve it through an HTTP server) greets us with an input prompt - that's how emscripten implements `stdin` in the browser by default, so that's what `fgets` ends up calling. After hitting cancel, the page becomes unresponsive and it's visible in the JavaScript console that we're just spinning indefinitely in the `while` loop.

![prompt](/wasm/wasm_cpp_06/input_prompt.png)

![browser busy](/wasm/wasm_cpp_06/busy_loop.png)

emscripten provides a way to break busy loops like this. It's called [`emscripten_set_main_loop`](https://emscripten.org/docs/api_reference/emscripten.h.html#c.emscripten_set_main_loop) and it schedules a callback from the browser's event loop: through the [`requestAnimationFrame`](https://developer.mozilla.org/en-US/docs/Web/API/window/requestAnimationFrame) API when `fps` is 0, or through a timer when a positive `fps` is given, as below. Let's use that in the WASM module.

```C++
#include <cstddef>
#include <cstdio>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

void doRepl() {
  constexpr std::size_t MAXBUF = 1024;
  char buffer[MAXBUF];
  fgets(buffer, MAXBUF, stdin);
  printf("Echo: %s\n", buffer);
}

int main(int argc, const char *argv[]) {
  printf("I'm an echo\n");

#ifdef __EMSCRIPTEN__
  int fps = 30;
  int simulate_infinite_loop = 1;
  emscripten_set_main_loop(doRepl, fps, simulate_infinite_loop);
#else
  while (true) {
    doRepl();
  }
#endif

  return 0;
}
```

No more busy loops for the WASM target! After reloading the page, the browser no longer hangs. Now comes the implementation of `fgets`.

### Polling

First, let's add a callback to the input `textarea`.

```JavaScript
const input_element = document.getElementById('input');

input_element.addEventListener('keypress', (e) => {
  if (e.key === 'Enter') {
    e.preventDefault();
    Module.pending_input.push('\n'.charCodeAt(0));
    input_element.value = '';
  } else {
    Module.pending_input.push(e.key.charCodeAt(0));
  }
});
```

I'm gonna save the above as `index.js`. I'm referring to `Module` here.
This is a global which I'm gonna create in `module.js`:

```JavaScript
var Module = {
  pending_input : [],
};
```

This extends the environment for the WASM module, since the JavaScript wrapper that emscripten generates picks up `Module` if it already exists:

```JavaScript
...
var Module = typeof Module != 'undefined' ? Module : {};
...
```

The last thing is to include all these scripts in the HTML file. `module.js` has to be loaded before `repl.js` so that the generated wrapper sees the predefined `Module`.

```HTML
...
<script src="module.js"></script>
<script src="repl.js"></script>
<script src="index.js"></script>
...
```

A quick test reveals that the character collection works, as visible below.

![character collection](/wasm/wasm_cpp_06/pending_input.png)

Now, it's time for the `fgets` implementation in C++.

```C++
#ifdef __EMSCRIPTEN__
EM_JS(int, js_getchar, (), {
  if (Module.pending_input.length == 0) {
    return 0;
  }
  return Module.pending_input.shift();
});

char *em_fgets(char *buf, std::size_t size, FILE *stream) {
  char *out = buf;  // write cursor; buf keeps pointing at the start
  while (true) {
    if (int c = js_getchar()) {
      if (c == '\n') {
        *out = '\0';
        return buf;
      } else {
        if (size == 1) {
          *out = '\0';
          return buf;
        }
        *out++ = c;
        size--;
      }
      continue;
    }
    // nothing to read yet, give control back to JavaScript
    emscripten_sleep(100);
  }
  return NULL;
}
#endif
```

First, a cool feature that emscripten offers: the `EM_JS` macro generates an `extern` symbol in C++ and automatically adds the JavaScript function to the WASM environment. I'm using it to create the other end of the input FIFO - `js_getchar` drains the `pending_input` array if there are any characters available.

`em_fgets` is just a simple loop that glues the characters together to assemble the string. The important bit is the call to `emscripten_sleep` - this yields control back to JavaScript so, in fact, the loop is not a tight busy loop but is suspended with `emscripten_sleep` whenever there's no input. This is cool, as to C++ it looks like a synchronous call while it's actually a form of coroutine. Note that `emscripten_sleep` only works when the module is linked with `-sASYNCIFY`.

Small modifications to `doRepl` are required as well:

```C++
void doRepl() {
  ...
#ifdef __EMSCRIPTEN__
  p = em_fgets(buffer, MAXBUF, stdin);
#else
  p = fgets(buffer, MAXBUF, stdin);
#endif
  ...
}
```

There's one more thing.
Instead of printing to the JavaScript console, it would be nice to append output to the `textarea` that I specifically created for that purpose. To do that, I'm gonna implement another override in the `Module` object.

```JavaScript
var Module = {
  ...
  print : (text) => {
    const output_element = document.getElementById('output');
    output_element.textContent += text + '\n';
  },
};
```

It's time to test the code. After recompiling and reloading everything, it's visible that the REPL is working. WASM is polling for input every 100ms.

![repl with polling](/wasm/wasm_cpp_06/polling.png)

### async

Polling is a viable option but it's not preferred. It introduces input lag (up to the polling interval) and wakes the module up even when there's nothing to read. It's better to use a fully asynchronous approach instead.

I'm gonna reimplement `fgets` one more time but now, it's gonna be an asynchronous JavaScript function. This is possible thanks to `ASYNCIFY`. Within `repl.cpp` I'm gonna define `em_fgets` the following way.

```C++
EM_ASYNC_JS(char *, em_fgets, (const char* buf, size_t bufsize), {
  return await new Promise((resolve, reject) => {
    if (Module.pending_lines.length > 0) {
      resolve(Module.pending_lines.shift());
    } else {
      Module.pending_fgets.push(resolve);
    }
  }).then((s) => {
    // convert JS string to WASM string
    let l = s.length + 1;
    if (l >= bufsize) {
      // truncate
      l = bufsize - 1;
    }
    Module.stringToUTF8(s.slice(0, l), buf, l);
    return buf;
  });
});
```

`em_fgets` will be blocked, waiting for a Promise. This promise is only gonna be resolved once a full line of text has been collected from the input. This line might be available straight away in `Module.pending_lines` or we might have to wait for it. In case of the latter, the function that resolves the promise is pushed to the `Module.pending_fgets` array.

Additionally, I have a continuation on the string value. The JavaScript string has to be copied to WASM memory.
This is something I've already discussed in [part #3 of this series]({{< relref "/posts/wasm_cpp_03.md" >}}); thankfully, emscripten provides a function to perform that conversion for us.

You might've noticed that I've dropped the `FILE*` parameter from the function signature. That's just to simplify the code, as for the purpose of this use case it's completely superfluous.

To make `em_fgets` work, the input collection code has to be modified as well:

```JavaScript
const input_element = document.getElementById('input');

input_element.addEventListener('keypress', function(e) {
  const isEnter = e.key === 'Enter';
  if (isEnter) {
    e.preventDefault();
    Module.pending_lines.push(Module.pending_chars.join(''));
    Module.pending_chars = [];
    input_element.value = '';
  } else {
    Module.pending_chars.push(e.key);
  }

  if (Module.pending_fgets.length > 0 && Module.pending_lines.length > 0) {
    let resolver = Module.pending_fgets.shift();
    resolver(Module.pending_lines.shift());
  }
});
```

This code just collects the characters in `pending_chars`. Once it sees Enter, it flushes the `pending_chars` array as a string to `pending_lines`. If there's data in `pending_lines` and there is at least one Promise resolver in `pending_fgets`, the resolver will be called with the collected input.

Of course, the additional arrays (`pending_chars`, `pending_lines`, `pending_fgets`) have to be added to the global `Module` definition.

There's a small change to the Makefile required as well. `ASYNCIFY` has to be enabled and `stringToUTF8` has to be explicitly exported so that it's available on the `Module` object:

```Makefile
.PHONY: clean

all: repl.js

repl.js: repl.cpp
	$(CXX) $(CXXFLAGS) -sWASM=1 -sASYNCIFY -sEXPORTED_RUNTIME_METHODS=stringToUTF8 -o $@ $<
```

That's it! The example code discussed here can be found [here](https://gitlab.com/twdev_projects/wasm_async_experiments). Additionally, [live demo is available as well](https://wasm-async-experiments-twdev-projects-08da1525ad78ab970e0709b2c.gitlab.io/).

## Back to Lua

Right!
To make things a bit more convenient I'm gonna switch to Lua's git repo. I'm gonna work on the `tw/wasm` branch. The plan is to replace `readline` with code implemented in JavaScript - as previously with `fgets`.

Changes in `makefile` are minimal. I've removed the hardcoded `gcc` compiler and basically just added the required emscripten flags - nothing more than that.

```Makefile
--- a/makefile
+++ b/makefile
@@ -72,13 +72,12 @@ LOCAL = $(TESTS) $(CWARNS)
 # enable Linux goodies
-MYLIBS= -ldl -lreadline
+MYLIBS= -ldl -sASYNCIFY -sEXPORTED_RUNTIME_METHODS=stringToUTF8
-CC= gcc
-CFLAGS= -Wall -O2 $(MYCFLAGS) -fno-stack-protector -fno-common -march=native
-AR= ar rc
-RANLIB= ranlib
+CFLAGS= -Wall -O2 $(MYCFLAGS) -fno-stack-protector -fno-common
+AR= emar rc
+RANLIB= emranlib
 RM= rm -f
@@ -96,7 +95,7 @@ AUX_O= lauxlib.o
 LIB_O= lbaselib.o ldblib.o liolib.o lmathlib.o loslib.o ltablib.o lstrlib.o \
  lutf8lib.o loadlib.o lcorolib.o linit.o
-LUA_T= lua
+LUA_T= lua.js
```

Changes in `lua.c` are limited as well:

```C++
+#ifdef __EMSCRIPTEN__
+#include <emscripten.h>
+
+EM_ASYNC_JS(char *, em_fgets, (const char* buf, size_t bufsize), {
+  return await new Promise((resolve, reject) => {
+    if (Module.pending_lines.length > 0) {
+      resolve(Module.pending_lines.shift());
+    } else {
+      Module.pending_fgets.push(resolve);
+    }
+  }).then((s) => {
+    // convert JS string to WASM string
+    let l = s.length + 1;
+    if (l >= bufsize) {
+      // truncate
+      l = bufsize - 1;
+    }
+    Module.stringToUTF8(s.slice(0, l), buf, l);
+    return buf;
+  });
+});
+
+static char* readline(const char* prompt) {
+  char* buf = malloc(LUA_MAXINPUT);
+  em_fgets(buf, LUA_MAXINPUT);
+  return buf;
+}
+
+#define lua_initreadline(L) ((void)L)
+#define lua_readline(L,b,p) ((void)L, ((b)=readline(p)) != NULL)
+#define lua_saveline(L,line) ((void)L)
+#define lua_freeline(L,b) ((void)L, free(b))
+
+#else
 #include <readline/readline.h>
 #include <readline/history.h>
+
 #define lua_initreadline(L) ((void)L, rl_readline_name="lua")
 #define lua_readline(L,b,p) ((void)L, ((b)=readline(p)) != NULL)
 #define lua_saveline(L,line) ((void)L, add_history(line))
 #define lua_freeline(L,b) ((void)L, free(b))
+#endif
+
```

`em_fgets` is a one-to-one copy of the function I've already implemented in the toy project. With all of that in place, it's possible to just run `emmake make`.

That's it! The supporting files that I've used in the toy project can be used without any additional changes to run the application.

![lua repl](/wasm/wasm_cpp_06/lua_repl.png)

A fork of the Lua repo is available [here](https://gitlab.com/twdev_projects/lua_wasm/-/tree/tw/wasm?ref_type=heads). Working [demo is available here](https://lua-wasm-twdev-projects-db592518aeeb33359efee32b83647c311c2aa92.gitlab.io/).

## Integration with xterm.js

These simple HTML text fields are cool as a starter but I really wanted to integrate the REPL with [xterm.js](https://xtermjs.org/). Long story short, I had to slightly customise the build configuration to force emscripten to spew out ES6 compatible modules. The repository is available [here](https://gitlab.com/twdev_projects/luaterm/).

Below is a working integration for you to enjoy.

{{< jswasm.inline >}}
{{< /jswasm.inline >}}
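
The terminal wiring itself isn't shown in this post, so here's a minimal sketch of how the line hand-off from the toy project could be fed from a terminal instead of a `textarea`. It's not taken from the luaterm repository: `createLineQueue`, `feed` and `readLine` are hypothetical names made up for this illustration, and the only thing assumed about xterm.js is that its `onData` callback delivers typed input as strings, with Enter arriving as `'\r'` and backspace as `'\x7f'`.

```javascript
// Hypothetical helper: the same idea as pending_chars / pending_lines /
// pending_fgets from the toy project, wrapped in a single object.
function createLineQueue() {
  let chars = [];      // characters of the line currently being typed
  const lines = [];    // complete lines that nobody has asked for yet
  const waiters = [];  // resolve() callbacks of readers waiting for a line

  return {
    // Feed raw terminal data, e.g. from xterm.js's onData callback.
    feed(data) {
      for (const ch of data) {
        if (ch === '\r' || ch === '\n') {
          lines.push(chars.join(''));  // Enter completes the line
          chars = [];
        } else if (ch === '\x7f') {
          chars.pop();                 // backspace drops the last character
        } else {
          chars.push(ch);
        }
      }
      // Hand completed lines to waiting readers, oldest first.
      while (waiters.length > 0 && lines.length > 0) {
        waiters.shift()(lines.shift());
      }
    },

    // Returns a Promise that resolves with the next complete line.
    readLine() {
      return new Promise((resolve) => {
        if (lines.length > 0) {
          resolve(lines.shift());
        } else {
          waiters.push(resolve);
        }
      });
    },
  };
}
```

With a real terminal this would be hooked up roughly as `term.onData((data) => queue.feed(data))`, with the JavaScript side of `em_fgets` awaiting `queue.readLine()`. Echoing the typed characters back with `term.write` is left out to keep the sketch short.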