#6 WebAssembly and C++: Bridging native code and asynchronous JavaScript
This post is part of a WebAssembly series focused on WASM and C++. The goal is to gain a thorough understanding of how WebAssembly works, how to use it as a compilation target for C++ code and hopefully have fun along the way. So, stick with me for this exciting journey.
Recap
In the previous post, I compiled the Lua interpreter to WASM using emscripten and successfully ran it using node. Running the same code in the browser was a problem, because blocking IO is not possible there. Today I’m gonna try to address this issue and run the Lua interpreter in the browser.
There are links to compiled WASM demos in this post, so you can test the code yourself.
Problem definition
The Lua interpreter is a REPL running in a tight while loop, which is blocked on a call to fgets most of the time, synchronously waiting for input.
Looking at the source code, there’s a pushline function:
|
|
This function is called in loadline, and the latter is called in a tight while loop:
|
|
lua_readline is a macro which results in a call to fgets or readline, depending on the platform.
To port Lua to WASM and be able to run it in the browser, this synchronous wait has to be replaced with either polling or an asynchronous approach.
Limitations
When working with WASM and JavaScript, the most fundamental principle is that we can’t block JavaScript’s thread. A WASM function that spins in a busy loop on the main thread freezes the page, because the browser’s event loop never gets to run. Cooperative scheduling has to be implemented in the WASM module, which means that when you want to wait for something, control has to be relinquished back to JavaScript.
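To make the idea of relinquishing control concrete, here is a minimal JavaScript sketch that uses a generator as a stand-in for the WASM side. The names replModel and echoed are made up for illustration, and this is not how emscripten works internally. It is only a model of a function that pauses instead of blocking.

```javascript
// A model of a REPL that pauses at `yield` instead of blocking on input.
// `echoed` is a hypothetical sink standing in for stdout.
function* replModel(echoed) {
  while (true) {
    const line = yield;          // hand control back to the caller until input arrives
    if (line === null) return;   // null stands in for end of input
    echoed.push(line);           // the "processing": echo the line back
  }
}

const echoed = [];
const repl = replModel(echoed);
repl.next();                     // run up to the first yield; the REPL is now waiting
repl.next("print(1)\n");         // input arrived: resume, process, pause again
repl.next("print(2)\n");
const finished = repl.next(null).done;
```

Between the calls to next, the caller is free to do anything else, which is exactly what the browser needs.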
Experiments
Let’s put Lua aside for a moment and focus on the mechanics and interoperability between WASM and JavaScript to better formulate the approach.
I’m gonna start with a toy project containing a model of Lua’s REPL. Once I’m able to run it in the browser, the approach to porting Lua (or, in fact, anything else) will be obvious.
|
|
The above is a simplified model of a REPL. There’s a blocking wait on fgets for a new line of input, followed by the “processing”, which in this case just prints the input back. I’m gonna define a simple Makefile to help build this code:
|
|
With the Makefile in hand, it’s possible to compile repl.cpp to WASM with just:
|
|
Let’s create a simple index.html file to load the module.
|
|
Don’t mind the rudimentary styling, it’s just to make the text fields bigger.
Pointing the browser to this document (remember to use an HTTP server) greets us with an input prompt. That’s the default implementation emscripten provides for fgets. After hitting cancel, the page becomes unresponsive and it’s visible in the JavaScript console that we’re just spinning indefinitely in the while loop.
emscripten provides a way to break busy loops like this: emscripten_set_main_loop. It wraps the requestAnimationFrame API (or a timer, when a frame rate is given) and calls a function once per iteration instead of looping. Let’s use that in the WASM module.
|
|
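Seen from the JavaScript side, the difference between a busy loop and a frame-driven loop can be modelled roughly as below. This is only a sketch of the idea, not emscripten’s implementation. runMainLoop and its schedule parameter are hypothetical names, and the scheduler is injectable so the snippet also runs outside a browser.

```javascript
// Instead of `while (true) step();`, run one step and ask the host to call us back.
function runMainLoop(step, schedule) {
  function tick() {
    const keepGoing = step();        // one iteration of the former loop body
    if (keepGoing) schedule(tick);   // return to the event loop between iterations
  }
  schedule(tick);
}

// In a browser the scheduler would typically be requestAnimationFrame;
// the fallback is for environments that do not have it.
const defaultSchedule =
  typeof requestAnimationFrame === "function"
    ? (fn) => requestAnimationFrame(fn)
    : (fn) => setTimeout(fn, 0);

// Usage with a manual scheduler, to make the hand-off visible:
const queue = [];
let iterations = 0;
runMainLoop(() => ++iterations < 3, (fn) => queue.push(fn));
while (queue.length > 0) queue.shift()();   // plays the role of the browser
```

In a page, runMainLoop(step, defaultSchedule) would give the browser a chance to handle events between every two calls to step.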
No more busy loops for the WASM target! After reloading the page, the browser no longer hangs. Now comes the implementation of fgets.
Polling
First, let’s add a callback to the input textarea.
|
|
I’m gonna save the above as index.js. I’m referring to Module here. This is a global which I’m gonna create in module.js:
|
|
This extends the environment for the WASM module, since the JavaScript wrapper that emscripten generates picks up Module if it already exists:
|
|
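As an illustration, the pieces could fit together roughly like this. The property name pending_input comes from this post, while the collectInput helper is an assumption made for the sketch, and the last line only approximates what the generated wrapper does.

```javascript
// module.js (sketch): define the global before the emscripten-generated script loads.
var Module = {
  pending_input: [],   // FIFO of characters waiting to be read by WASM
};

// index.js (sketch): append freshly typed text to the FIFO, one character at a time.
function collectInput(module, text) {
  for (const ch of text) {
    module.pending_input.push(ch);
  }
}

// In the page this would be called from an event listener on the input textarea.
// Here it is called directly so the snippet runs anywhere.
collectInput(Module, "print(1)\n");

// The generated wrapper begins with a line along these lines,
// which is why the object defined above survives:
var Module = typeof Module !== "undefined" ? Module : {};
```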
The last thing is to include all these scripts in the HTML file.
|
|
A quick test reveals that the character collection works as visible below.
Now, it’s time for the fgets implementation in C++.
|
|
First, a cool feature that emscripten offers: the EM_JS macro generates an extern symbol in C++ and automatically adds the JavaScript function to the WASM environment. I’m using it to create the other end of the input FIFO: js_getchar drains the pending_input array if there are any characters available.
em_fgets is just a simple loop that glues the characters together to assemble the string. The important bit is the call to emscripten_sleep. It yields control back to JavaScript so, in fact, the loop is not a tight loop but is broken up on every iteration by a call to emscripten_sleep. This is cool because to C++ it looks like a synchronous call, while it’s actually a form of coroutine. Small modifications to doRepl are required as well:
|
|
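The JavaScript half of this polling scheme can be sketched as below. js_getchar and pending_input are named in this post; everything else (the -1 sentinel, the pollLine helper and its partial buffer) is an assumption for illustration, and the loop that the post implements in C++ is modelled here in JavaScript.

```javascript
// Body of a js_getchar-like function: pop one character, or -1 if none is pending.
function jsGetchar(module) {
  if (module.pending_input.length === 0) return -1;
  return module.pending_input.shift().codePointAt(0);
}

// One polling pass: move whatever is available into `partial` and return a
// complete line once a newline shows up, otherwise null.
function pollLine(module, partial) {
  let c;
  while ((c = jsGetchar(module)) !== -1) {
    partial.push(String.fromCodePoint(c));
    if (c === 10) return partial.splice(0).join("");   // 10 is '\n'
  }
  return null;   // nothing more to read; the caller sleeps and polls again
}

const mod = { pending_input: ["h", "i"] };
const partial = [];
const first = pollLine(mod, partial);    // null: no newline yet
mod.pending_input.push("\n");
const second = pollLine(mod, partial);   // "hi\n"
```

The partial buffer keeps characters between passes, which mirrors how a polling loop has to remember what it has read so far while it sleeps.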
There’s one more thing. Instead of printing to the JavaScript console, it would be nice to append output to the textarea that I specifically created for that purpose. To do that, I’m gonna implement another override in the Module object.
|
|
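A minimal sketch of such an override, assuming the output element exposes a value property the way a textarea does. Module.print is the hook emscripten calls for each line written to stdout; makePrint and the fake element are hypothetical.

```javascript
// Build a print function that appends each line of program output to an element.
function makePrint(outputElement) {
  return function print(text) {
    outputElement.value += text + "\n";   // lines arrive without the trailing newline
  };
}

// In the page, outputElement would be the output textarea. A plain object
// stands in for it here so the snippet runs outside a browser.
const fakeTextarea = { value: "" };
var Module = {
  pending_input: [],
  print: makePrint(fakeTextarea),
};

Module.print("hello");
Module.print("world");
```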
It’s time to test the code. After recompiling and reloading everything, it’s visible that the REPL is working. WASM is polling for input every 100ms.
async
Polling is a viable option but it’s not preferred. With a 100ms interval, input can sit unnoticed for up to 100ms, and the module wakes up ten times a second even when nothing has been typed. It’s better to use a fully asynchronous approach instead.
I’m gonna reimplement fgets one more time but now, it’s gonna be an asynchronous JavaScript function. This is possible thanks to ASYNCIFY.
Within repl.cpp I’m gonna define em_fgets the following way.
|
|
em_fgets will be blocked, waiting for a Promise. This promise is only gonna be resolved once a full line of text is collected from input. This line might be available straight away in Module.pending_lines or we might have to wait for it. In case of the latter, the function that resolves the promise is pushed to the Module.pending_fgets array. Additionally, I have a continuation on the string value. The JavaScript string has to be copied to WASM memory. This is something I’ve already discussed in part #3 of this series; thankfully, emscripten provides a function to perform that conversion for us.
You might’ve noticed that I’ve dropped the FILE* parameter from the function signature. That’s just to simplify the code, as for the purpose of this use case it’s completely superfluous.
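The promise-based hand-off described above can be sketched in JavaScript as follows. pending_lines and pending_fgets are the arrays named in this post. The nextLine function is a hypothetical stand-in for the JavaScript body of em_fgets, and the copy into WASM memory is only indicated in a comment.

```javascript
// Return a promise for the next full line of input.
function nextLine(module) {
  return new Promise((resolve) => {
    if (module.pending_lines.length > 0) {
      resolve(module.pending_lines.shift());   // a line is already waiting
    } else {
      module.pending_fgets.push(resolve);      // park the resolver until input arrives
    }
  });
}

// Inside the WASM glue, a continuation would then copy the string into the
// caller's buffer, e.g. with emscripten's stringToUTF8(line, bufPtr, bufSize).

const ready = { pending_lines: ["print(1)\n"], pending_fgets: [] };
nextLine(ready);      // resolves straight away and consumes the queued line

const waiting = { pending_lines: [], pending_fgets: [] };
nextLine(waiting);    // stays pending; its resolver is now in pending_fgets
```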
To make em_fgets work, the input collection code has to be modified as well:
|
|
This code just collects the characters in pending_chars. Once it sees a newline it flushes the pending_chars array as a string to pending_lines. If there’s data in pending_lines and there is at least one Promise resolver in pending_fgets, it will be called with the collected input. Of course, the additional arrays (pending_lines, pending_fgets) have to be added to the global Module definition.
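A sketch of that collection logic, using the array names from this post. The onInput function name and the way text reaches it are assumptions, and the resolver pushed in the usage lines stands in for a promise created by em_fgets.

```javascript
function onInput(module, text) {
  for (const ch of text) {
    module.pending_chars.push(ch);
    if (ch === "\n") {
      // A full line is available: move it from pending_chars to pending_lines.
      module.pending_lines.push(module.pending_chars.join(""));
      module.pending_chars.length = 0;
    }
  }
  // Hand complete lines to waiting readers, oldest first.
  while (module.pending_lines.length > 0 && module.pending_fgets.length > 0) {
    const resolve = module.pending_fgets.shift();
    resolve(module.pending_lines.shift());
  }
}

const state = { pending_chars: [], pending_lines: [], pending_fgets: [] };
let delivered = null;
state.pending_fgets.push((line) => { delivered = line; });   // a waiting reader
onInput(state, "pri");          // no newline yet, nothing is delivered
onInput(state, "nt(1)\nx");     // the newline completes the line and wakes the reader
```

Characters typed after the newline stay in pending_chars until the next newline arrives.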
There’s a small change to the Makefile required as well. stringToUTF8 has to be explicitly exported so that it’s included in the generated JavaScript and can be called from the em_fgets code:
|
|
That’s it!
The example code discussed above can be found here.
A live demo is available as well.
Back to Lua
Right! To make things a bit more convenient I’m gonna switch to Lua’s git repo. I’m gonna work on the tw/wasm branch. The plan is to replace readline with code implemented in JavaScript, as previously with fgets.
Changes in the makefile are minimal. I’ve removed the compiler being hardcoded to gcc and basically just added the required emscripten defines, nothing more than that.
|
|
Changes in lua.c are limited as well:
|
|
This is a one-to-one copy of the function I’ve already implemented in the toy project.
With all of that in place, it’s possible to just run emmake make.
That’s it! The supporting files that I’ve used in the toy project can be used without any additional changes to run the application.
A fork of the Lua repo is available here.
A working demo is available here.
Integration with xterm.js
These simple HTML text fields are cool as a starter but I really wanted to integrate the REPL with xterm.js. Long story short, I had to slightly customise the build configuration to make emscripten emit ES6-compatible modules. The repository is available here.
Below is a working integration for you to enjoy.