# #6 WebAssembly and C++: Bridging native code and asynchronous JavaScript


This post is part of a [WebAssembly series](/tags/wasmcpp) focused on WASM and
C++. The goal is to gain a thorough understanding of how WebAssembly works, how
to use it as a compilation target for C++ code and hopefully have fun along the
way. So, stick with me for this exciting journey.

## Recap

In the previous post, I compiled the Lua interpreter to WASM using
emscripten and successfully ran it with node.  Running the same code
in the browser was a problem, though, as blocking IO is not
possible there. Today I'm gonna try to address this issue and run the Lua
interpreter in the browser.

There are links to compiled WASM demos in this post, so you can test the code
yourself.

## Problem definition

The Lua interpreter is a REPL running in a tight `while` loop, which is
blocked on a call to `fgets` most of the time - synchronously
waiting for input.

Looking at the source code, there's a `pushline` function:

```C
static int pushline (lua_State *L, int firstline) {
  char buffer[LUA_MAXINPUT];
  char *b = buffer;
  size_t l;
  const char *prmt = get_prompt(L, firstline);
  int readstatus = lua_readline(L, b, prmt);
  if (readstatus == 0)
    return 0;  /* no input (prompt will be popped by caller) */
  ...
  return 1;
}
```

This function is called in `loadline` and the latter is called in
a tight while loop:

```C
static void doREPL (lua_State *L) {
  int status;
  const char *oldprogname = progname;
  progname = NULL;  /* no 'progname' on errors in interactive mode */
  lua_initreadline(L);
  while ((status = loadline(L)) != -1) {
    if (status == LUA_OK)
      status = docall(L, 0, LUA_MULTRET);
    if (status == LUA_OK) l_print(L);
    else report(L, status);
  }
  lua_settop(L, 0);  /* clear stack */
  lua_writeline();
  progname = oldprogname;
}
```

`lua_readline` is a macro which results in a call to `fgets` or
`readline` - depending on the platform.

**To port Lua to `WASM` and be able to run it in the browser, this
synchronous wait has to be replaced with either polling or an
asynchronous approach.**

## Limitations

When working with WASM and JavaScript, the most fundamental principle is that
**we can't block JavaScript's thread**.  WASM functions can't run busy loops.
Cooperative scheduling has to be implemented in the WASM module, which means that
when you want to wait for something, the control has to be relinquished back to
JavaScript.
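
To make that limitation concrete, here's a minimal sketch in plain JavaScript (no WASM involved; the 50 ms busy wait is just a stand-in for a native function that never hands control back). A timer that's due immediately still can't fire until the synchronous code finishes:

```javascript
// Record the order in which things happen.
const events = [];

// Queue a callback; it can only run once the current synchronous code returns.
setTimeout(() => events.push('timer fired'), 0);

// Stand-in for a WASM function spinning in a loop for roughly 50 ms.
const start = Date.now();
while (Date.now() - start < 50) {
  // busy wait: nothing else can run on this thread
}
events.push('busy loop done');

// The timer was due long ago, yet only 'busy loop done' is recorded so far.
console.log(events);
```

The timer callback only runs after this script returns to the event loop, which is exactly why a REPL blocked inside `fgets` freezes the page.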

## Experiments

Let's put Lua aside for a moment and focus on the mechanics
and interoperability between WASM and JavaScript to better
formulate the approach.


I'm gonna start with a toy project containing a model of Lua's REPL.  Once I'm
able to run it in the browser, the approach to porting Lua (or, in fact,
anything else) will be obvious.

```C++
#include <cstdio>
#include <cstring>

int main(int argc, const char *argv[]) {
  printf("I'm an echo\n");

  constexpr std::size_t MAXBUF = 1024;
  char buffer[MAXBUF];

  while (true) {
    fgets(buffer, MAXBUF, stdin);
    printf("Echo: %s\n", buffer);
  }

  return 0;
}
```

The above is a simplified model of a REPL.  There's a blocking wait on `fgets`
for a new line of input, followed by the "processing", which in this case just
prints the input back. I'm gonna define a simple `Makefile` to help build this code:

```Makefile
.PHONY: clean

all: repl.js

repl.js: repl.cpp
	$(CXX) $(CXXFLAGS) -o $@ $< -sWASM=1

clean:
	rm -fv repl.js repl.wasm
```

With the `Makefile` in hand, it's possible to compile `repl.cpp` to WASM with just

```console
$ emmake make
```

Let's create a simple `index.html` file to load the module.

```html
<!doctype html>
<html>
    <head></head>
    <body>
        <h1>REPL</h1>
        <div id="repl">
            <p>Output:</p>
            <textarea 
                id="output" 
                readonly 
                style="width: 80%; height: 10em"></textarea>

            <p>Input:</p>
            <textarea 
                id="input" 
                placeholder="Enter your input here" 
                style="width: 80%; height: 10em"></textarea>
        </div>
        <script src="repl.js"></script>
    </body>
</html>
```

Don't mind the rudimentary styling, it's just to make the text fields bigger.

![browser](images/app_model.png)

Pointing the browser to this document (remember to serve it over HTTP) greets us
with an input prompt - that's the default implementation emscripten provides
for `fgets`.  After hitting cancel, the page becomes unresponsive and the
JavaScript console shows that we're just spinning indefinitely in the
`while` loop.

![prompt](images/input_prompt.png)

![browser busy](images/busy_loop.png)

emscripten provides [`emscripten_set_main_loop`](https://emscripten.org/docs/api_reference/emscripten.h.html#c.emscripten_set_main_loop)
as a way to break busy loops like this.  It calls a function repeatedly through
the browser's event loop, using
[`requestAnimationFrame`](https://developer.mozilla.org/en-US/docs/Web/API/window/requestAnimationFrame)
when `fps` is zero or negative and a timer otherwise.
Let's use that in the WASM module.

```C++
#include <cstdio>
#include <cstring>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

void doRepl() {
  constexpr std::size_t MAXBUF = 1024;
  char buffer[MAXBUF];

  fgets(buffer, MAXBUF, stdin);
  printf("Echo: %s\n", buffer);
}

int main(int argc, const char *argv[]) {
  printf("I'm an echo\n");

#ifdef __EMSCRIPTEN__
  int fps = 30;
  int simulate_infinite_loop = 1;
  emscripten_set_main_loop(doRepl, fps, simulate_infinite_loop);
#else
  while (true) {
    doRepl();
  }
#endif

  return 0;
}
```
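
Conceptually, a main-loop driver runs one iteration and then hands control back to the event loop before scheduling the next one. The sketch below is a rough model of that idea only; `setMainLoop` and its injectable `schedule` parameter are hypothetical names for illustration, not emscripten's actual implementation:

```javascript
// Hypothetical stand-in for a main-loop driver; NOT emscripten's real code.
// `schedule` defaults to setTimeout and is injectable to ease experimenting.
function setMainLoop(iteration, fps, schedule = setTimeout) {
  const delayMs = 1000 / fps;
  let running = true;

  function tick() {
    if (!running) return;
    iteration();                          // run exactly one iteration...
    if (running) schedule(tick, delayMs); // ...then yield until the next slot
  }

  schedule(tick, 0);
  return () => { running = false; };      // call this to stop the loop
}

// Usage: run three "frames" at roughly 30 fps, then stop.
let frames = 0;
const stopDemo = setMainLoop(() => {
  frames += 1;
  if (frames === 3) stopDemo();
}, 30);
```

Between two calls to `tick` the browser is free to handle input, repaint and run other callbacks.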

No more busy loops for WASM target!  After reloading the page, the browser no
longer hangs.  Now comes the implementation of `fgets`.

### Polling

First, let's add a callback to the input `textarea`.

```JavaScript
const input_element = document.getElementById('input');

input_element.addEventListener('keypress', (e) => {
    if (e.key === 'Enter') {
        e.preventDefault();
        Module.pending_input.push('\n'.charCodeAt(0));
        input_element.value = '';
    } else {
        Module.pending_input.push(e.key.charCodeAt(0));
    }
});
```

I'm gonna save the above as `index.js`.  I'm referring to `Module` here.  This
is a global which I'm gonna create in `module.js`:

```JavaScript
var Module = {
  pending_input : [],
};
```

This extends the environment for the WASM module, since the JavaScript wrapper that
emscripten generates picks up `Module` if it already exists:

```JavaScript
...
var Module = typeof Module != 'undefined' ? Module : {};
...
```

The last thing is to include all these scripts in the HTML file.

```HTML
...
            <textarea 
                id="input" 
                placeholder="Enter your input here" 
                style="width: 80%; height: 10em"></textarea>
        </div>
        <script src="module.js"></script>
        <script src="index.js"></script>
        <script src="repl.js"></script>
...
```

A quick test shows that character collection works, as visible below.

![character collection](images/pending_input.png)

Now, it's time for `fgets` implementation in C++.

```C++
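#ifdef __EMSCRIPTEN__

// Pop one character code from the JavaScript-side FIFO; 0 means "no input yet".
EM_JS(int, js_getchar, (), {
    if (Module.pending_input.length == 0) {
      return 0;
    }
    return Module.pending_input.shift();
});

char *em_fgets(char *buf, std::size_t size, FILE *stream) {
  char *cursor = buf;  // write position; buf keeps pointing at the start
  while (true) {
    if (int c = js_getchar()) {
      // Finish on a newline or when only the terminator still fits.
      if (c == '\n' || size == 1) {
        *cursor = '\0';
        return buf;  // like fgets, return the start of the buffer
      }
      *cursor++ = static_cast<char>(c);
      size--;
      continue;
    }
    // Nothing queued: yield to the browser's event loop before polling again.
    emscripten_sleep(100);
  }
}

#endif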
```

First, a cool feature that emscripten offers: the `EM_JS` macro generates an
`extern` symbol in C++ and automatically adds the JavaScript function to the
WASM environment.  I'm using it to create the other end of the input FIFO -
`js_getchar` drains the `pending_input` array if there are any characters available.

`em_fgets` is just a simple loop that glues the characters together to assemble
the string.  The important bit is the call to `emscripten_sleep`: whenever
there's no input, it yields control back to JavaScript, so the loop isn't a
tight busy loop.  This is cool because to C++ it looks like a synchronous call
while it's actually a form of coroutine.  Note that `emscripten_sleep` relies on
`ASYNCIFY`, so the module has to be linked with `-sASYNCIFY`.  Small
modifications to `doRepl` are required as well:

```C++
void doRepl() {
  ...

#ifdef __EMSCRIPTEN__
  p = em_fgets(buffer, MAXBUF, stdin);
#else
  p = fgets(buffer, MAXBUF, stdin);
#endif
  ...
}
```
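
To see why the `emscripten_sleep` loop behaves like a coroutine, here's roughly the same polling logic written directly in JavaScript with `async`/`await`. This is only a model: `pollLine` and `sleep` are hypothetical helpers, and the plain array stands in for `Module.pending_input`.

```javascript
// Resolve after `ms` milliseconds; awaiting it yields to the event loop.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Assemble one line from a FIFO of character codes, polling while it's empty.
async function pollLine(pending, intervalMs = 100) {
  let line = '';
  while (true) {
    if (pending.length > 0) {
      const ch = String.fromCharCode(pending.shift());
      if (ch === '\n') return line; // a full line has been collected
      line += ch;
      continue;
    }
    await sleep(intervalMs);        // nothing queued: give control back
  }
}

// Usage: the reader starts first, the "typing" shows up a bit later.
const typed = [];
pollLine(typed, 10).then((line) => console.log('got:', line));
setTimeout(() => {
  for (const ch of 'hi\n') typed.push(ch.charCodeAt(0));
}, 25);
```

The `await` plays the role of `emscripten_sleep`: the function looks sequential, but it's suspended and resumed by the event loop.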

There's one more thing.  Instead of printing to JavaScript console, it would be
nice to append output to the `textarea` that I specifically created for that
purpose.  To do that, I'm gonna implement another override in the `Module`
object.

```JavaScript
var Module = {
  ...
  print : (text) => {
    const output_element = document.getElementById('output');
    output_element.textContent += text + '\n';
  },
};
```

It's time to test the code.  After recompiling and reloading everything, it's
visible that the REPL is working.  WASM is polling for input every 100ms.

![repl with polling](images/polling.png)

### Async

Polling is a viable option but it's not preferred.  It's suboptimal, introduces
input lag and unnecessary overhead.  It's better to use a fully asynchronous
approach instead.

I'm gonna reimplement `fgets` one more time but now, it's gonna be an
asynchronous JavaScript function.  This is possible thanks to `ASYNCIFY`.
Within `repl.cpp` I'm gonna define `em_fgets` the following way.

```C++
EM_ASYNC_JS(char *, em_fgets, (const char* buf, size_t bufsize), {
  return await new Promise((resolve, reject) => {
      if (Module.pending_lines.length > 0) {
        resolve(Module.pending_lines.shift());
      } else {
        Module.pending_fgets.push(resolve);
      }
  }).then((s) => {
      // convert JS string to WASM string
      let l = s.length + 1;
      if (l >= bufsize) {
        // truncate
        l = bufsize - 1;
      }
      Module.stringToUTF8(s.slice(0, l), buf, l);
      return buf;
  });
});
```

`em_fgets` will be blocked, waiting for a Promise.  This promise is only gonna
be resolved once a full line of text has been collected from the input.  This line might
be available straight away in `Module.pending_lines` or we might have to wait for it.
In case of the latter, the function that resolves the promise is pushed to the
`Module.pending_fgets` array.  Additionally, I have a continuation on the
string value: the JavaScript string has to be copied to WASM memory.  This is
something I've already discussed in [part #3 of this series]({{< relref "posts/wasm_cpp_03" >}});
thankfully, emscripten provides a function to perform that conversion for us.
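
One detail worth keeping in mind for that copy: a JavaScript string's `length` counts UTF-16 code units, not UTF-8 bytes, so the two differ for non-ASCII input. Here's a rough, self-contained model of a bounded UTF-8 copy. `writeUtf8` is a hypothetical helper made up for illustration and the `Uint8Array` stands in for WASM linear memory; it is not emscripten's `stringToUTF8` implementation.

```javascript
// Hypothetical helper: copy `str` as NUL-terminated UTF-8 into `heap` at `ptr`,
// writing at most `maxBytes` bytes in total. Returns the number of bytes
// written, not counting the terminator.
function writeUtf8(str, heap, ptr, maxBytes) {
  if (maxBytes <= 0) return 0;
  const bytes = new TextEncoder().encode(str);
  // Always leave one byte for the terminating NUL.
  let n = Math.min(bytes.length, maxBytes - 1);
  // If the cut lands inside a multi-byte sequence, back up to its start.
  while (n > 0 && n < bytes.length && (bytes[n] & 0xc0) === 0x80) n--;
  heap.set(bytes.subarray(0, n), ptr);
  heap[ptr + n] = 0;
  return n;
}

// Usage: 'héllo' has 5 UTF-16 code units but needs 6 UTF-8 bytes.
const memory = new Uint8Array(16);
const written = writeUtf8('héllo', memory, 4, 8);
console.log('héllo'.length, written);
```

Sizing the destination by byte count rather than by `length` avoids silently cutting off lines that contain non-ASCII characters.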

You might've noticed that I've dropped the `FILE*` parameter from the function
signature.  That's just to simplify the code as for the purpose of this use
case it's completely superfluous.

To make `em_fgets` work, the input collection code has to be modified as well:

```JavaScript
const input_element = document.getElementById('input');

input_element.addEventListener('keypress', function(e) {
  const isEnter = e.key === 'Enter';

  if (isEnter) {
    e.preventDefault();
    Module.pending_lines.push(Module.pending_chars.join(''));
    Module.pending_chars = [];
    input_element.value = '';
  } else {
    Module.pending_chars.push(e.key);
  }

  if (Module.pending_fgets.length > 0 && Module.pending_lines.length > 0) {
    let resolver = Module.pending_fgets.shift();
    resolver(Module.pending_lines.shift());
  }
});
```

This code just collects the characters in `pending_chars`.  Once it sees a
newline it flushes the `pending_chars` array as a string to `pending_lines`.
If there's data in `pending_lines` and there is at least one Promise resolver in
`pending_fgets`, it will be called with the collected input.  Of course, the
additional arrays (`pending_lines`, `pending_fgets`) have to be added to the
global `Module` definition.
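
The hand-off between the key handler and `em_fgets` can also be looked at in isolation. Below is a small model of it with no DOM and no WASM; the `LineQueue` class and its member names are hypothetical and only mirror the roles of `pending_lines` and `pending_fgets`. It's a slight variation on the code above: a line goes straight to a waiting reader when there is one, instead of passing through the buffer first.

```javascript
// Hypothetical model of the producer/consumer hand-off described above.
class LineQueue {
  constructor() {
    this.pendingLines = [];   // complete lines nobody has asked for yet
    this.pendingReaders = []; // resolve functions of readers waiting for a line
  }

  // Consumer side: the role em_fgets plays.
  readLine() {
    return new Promise((resolve) => {
      if (this.pendingLines.length > 0) {
        resolve(this.pendingLines.shift());
      } else {
        this.pendingReaders.push(resolve);
      }
    });
  }

  // Producer side: what the key handler does when Enter is pressed.
  pushLine(line) {
    const reader = this.pendingReaders.shift();
    if (reader) {
      reader(line);
    } else {
      this.pendingLines.push(line);
    }
  }
}

const queue = new LineQueue();
const received = [];
queue.readLine().then((line) => received.push(line)); // reader waits
queue.pushLine('print(1)');                           // wakes the waiting reader
queue.pushLine('print(2)');                           // no reader: buffered
queue.readLine().then((line) => received.push(line)); // resolves straight away
```

Whichever side arrives first simply parks itself in its array, so no polling interval is involved at all.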

There's a small change to the Makefile required as well.  `ASYNCIFY` has to be
enabled and `stringToUTF8` has to be explicitly exported so that it's reachable
as `Module.stringToUTF8`:

```Makefile
.PHONY: clean

all: repl.js

repl.js: repl.cpp
	$(CXX) $(CXXFLAGS) -sWASM=1 -sASYNCIFY -sEXPORTED_RUNTIME_METHODS=stringToUTF8 -o $@ $<
```

That's it!

The example code discussed above can be found [here](https://gitlab.com/twdev_projects/wasm_async_experiments).

A [live demo is available as well](https://wasm-async-experiments-twdev-projects-08da1525ad78ab970e0709b2c.gitlab.io/).

## Back to Lua

Right!  To make things a bit more convenient I'm gonna switch to Lua's git repo.

I'm gonna work on the `tw/wasm` branch.  The plan is to replace `readline` with
code implemented in JavaScript - as previously with `fgets`.

Changes in the `makefile` are minimal.  I've removed the hardcoded `gcc`
compiler and basically just added the required emscripten flags - nothing more
than that.

```Makefile
--- a/makefile
+++ b/makefile
@@ -72,13 +72,12 @@ LOCAL = $(TESTS) $(CWARNS)
 # enable Linux goodies
-MYLIBS= -ldl -lreadline
+MYLIBS= -ldl -sASYNCIFY -sEXPORTED_RUNTIME_METHODS=stringToUTF8
 
 
-CC= gcc
-CFLAGS= -Wall -O2 $(MYCFLAGS) -fno-stack-protector -fno-common -march=native
-AR= ar rc
-RANLIB= ranlib
+CFLAGS= -Wall -O2 $(MYCFLAGS) -fno-stack-protector -fno-common
+AR= emar rc
+RANLIB= emranlib
 RM= rm -f
 
 
@@ -96,7 +95,7 @@ AUX_O=        lauxlib.o
 LIB_O= lbaselib.o ldblib.o liolib.o lmathlib.o loslib.o ltablib.o lstrlib.o \
        lutf8lib.o loadlib.o lcorolib.o linit.o
 
-LUA_T= lua
+LUA_T= lua.js
```

Changes in `lua.c` are limited as well:

```C++
+#ifdef __EMSCRIPTEN__
+#include <emscripten.h>
+
+EM_ASYNC_JS(char *, em_fgets, (const char* buf, size_t bufsize), {
+  return await new Promise((resolve, reject) => {
+      if (Module.pending_lines.length > 0) {
+        resolve(Module.pending_lines.shift());
+      } else {
+        Module.pending_fgets.push(resolve);
+      }
+  }).then((s) => {
+      // convert JS string to WASM string
+      let l = s.length + 1;
+      if (l >= bufsize) {
+        // truncate
+        l = bufsize - 1;
+      }
+      Module.stringToUTF8(s.slice(0, l), buf, l);
+      return buf;
+  });
+});
+
+static char* readline(const char* prompt) {
+    char* buf = malloc(LUA_MAXINPUT);
+    em_fgets(buf, LUA_MAXINPUT);
+    return buf;
+}
+
+#define lua_initreadline(L) ((void)L)
+#define lua_readline(L,b,p)  ((void)L, ((b)=readline(p)) != NULL)
+#define lua_saveline(L,line) ((void)L)
+#define lua_freeline(L,b) ((void)L, free(b))
+
+#else
 #include <readline/readline.h>
 #include <readline/history.h>
+
 #define lua_initreadline(L)    ((void)L, rl_readline_name="lua")
 #define lua_readline(L,b,p)    ((void)L, ((b)=readline(p)) != NULL)
 #define lua_saveline(L,line)   ((void)L, add_history(line))
 #define lua_freeline(L,b)      ((void)L, free(b))
 
+#endif
+
```

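`em_fgets` is a one-to-one copy of the function I've already implemented in the toy
project.  The `readline` replacement is a thin wrapper around it that allocates
the buffer which `lua_freeline` later frees.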

With all of that in place, it's possible to just run

    emmake make

That's it!  The supporting files that I've used in the toy project can be used
without any additional changes to run the application.

![lua repl](images/lua_repl.png)

My fork of the Lua repo is available [here](https://gitlab.com/twdev_projects/lua_wasm/-/tree/tw/wasm?ref_type=heads).

Working [demo is available here](https://lua-wasm-twdev-projects-db592518aeeb33359efee32b83647c311c2aa92.gitlab.io/).

## Integration with xterm.js

These simple HTML text fields are cool as a starter but I really wanted to
integrate the REPL with [xterm.js](https://xtermjs.org/).  Long story short, I
had to slightly customise the build configuration to force emscripten to spew
out ES6 compatible modules.  The repository is available [here](https://gitlab.com/twdev_projects/luaterm/).

Below is a working integration for you to enjoy.

{{< jswasm.inline >}}
<div id="xterm-container"></div>
<script defer="defer" src="wasm/index.bundle.js"></script>
{{< /jswasm.inline >}}

