
#5 WebAssembly and C++: Porting Lua to WebAssembly

This post is part of a WebAssembly series focused on WASM and C++. The goal is to gain a thorough understanding of how WebAssembly works, how to use it as a compilation target for C++ code and hopefully have fun along the way. So, stick with me for this exciting journey.

Recap

In the previous instalment of this series, we learned that WASI is an interface that exposes system APIs to WASM modules. WASI is implemented by all relevant WASM runtimes; I also provided an example of what needs to be done to implement WASI polyfills by hand, and showed how to use a more mature and complete implementation of WASI polyfills in the browser.

Today, I’m gonna experiment with porting a real piece of software to WebAssembly. I’ve chosen the Lua project. The main reason is that it’s nice and small, which makes it a perfect candidate to tinker with.

Getting lua

Just clone the repo. The version doesn’t really matter. Since I’m lazy, I’m gonna use my OS’s (Arch) package manager (pacman) to get the sources:

$ pkgctl repo clone --protocol https lua
$ cd lua
$ makepkg -o
$ tree -L 3
.
├── liblua.so.patch
├── LICENSE
├── lua-5.4.6.tar.gz
├── lua.pc
├── paths.patch
├── PKGBUILD
└── src
    ├── liblua.so.patch -> /home/tomasz/lua_wasm/lua/liblua.so.patch
    ├── LICENSE -> /home/tomasz/lua_wasm/lua/LICENSE
    ├── lua++-5.4.6
    │   ├── doc
    │   ├── lua++.pc
    │   ├── Makefile
    │   ├── README
    │   └── src
    ├── lua-5.4.6
    │   ├── doc
    │   ├── lua.pc
    │   ├── Makefile
    │   ├── README
    │   └── src
    ├── lua-5.4.6.tar.gz -> /home/tomasz/lua_wasm/lua/lua-5.4.6.tar.gz
    ├── lua.pc -> /home/tomasz/lua_wasm/lua/lua.pc
    └── paths.patch -> /home/tomasz/lua_wasm/lua/paths.patch

I’m gonna copy the unpacked lua-5.4.6 directory, initialise a git repo inside it and start messing about.

$ cp -r src/lua-5.4.6 ..                   
$ cd ../lua-5.4.6/
$ git init .                               
Initialized empty Git repository in /home/tomasz/lua_wasm/lua-5.4.6/.git/
$ git add *
$ git commit -m "vanila"

Building with WASI-SDK

Let’s first try to build it with wasi-sdk, the very same one I installed in part #4 of this series. For the sake of convenience, back then I used a wasmenv.sh script containing the environment.

export WASI_VERSION=20
export WASI_VERSION_FULL=${WASI_VERSION}.0
export WASI_SDK_PATH=/opt/wasi-sdk-${WASI_VERSION_FULL}
export PATH="${WASI_SDK_PATH}/bin:$PATH"
export CC="${WASI_SDK_PATH}/bin/clang --sysroot=${WASI_SDK_PATH}/share/wasi-sysroot"
export CXX="${WASI_SDK_PATH}/bin/clang++ --sysroot=${WASI_SDK_PATH}/share/wasi-sysroot"

Since then, I’ve incorporated a new tool into my workflow: direnv. So, this time around, I’m gonna save that environment into a .envrc file and simply run:

$ direnv allow

This will automatically import the environment whenever entering the directory containing .envrc.

Lua’s build system

Lua’s build system is extremely simple: the whole project is built with just two Makefiles. The top-level Makefile just descends into src, where the actual build rules are. This is another reason why I chose Lua to begin with, so that I’m not distracted by the project’s technicalities. To build with the WASI SDK, I’m just gonna remove the CC definition from the Makefile, since I’m providing it via the environment. Additionally, I’m gonna specify the target in MYCFLAGS. Here’s what I’ve changed:

--- a/src/Makefile
+++ b/src/Makefile
@@ -6,7 +6,6 @@
 # Your platform. See PLATS for possible values.
 PLAT= guess
 
-CC= gcc -std=gnu99
 CFLAGS= -O2 -Wall -Wextra -DLUA_COMPAT_5_3 $(SYSCFLAGS) $(MYCFLAGS)
 LDFLAGS= $(SYSLDFLAGS) $(MYLDFLAGS)
 LIBS= -lm $(SYSLIBS) $(MYLIBS)
@@ -20,7 +19,7 @@ SYSCFLAGS=
 SYSLDFLAGS=
 SYSLIBS=
 
-MYCFLAGS=
+MYCFLAGS=--target=wasm32-wasi


 MYLDFLAGS=
 MYLIBS=
 MYOBJS=

Here’s our first roadblock. It seems that WASI does not support POSIX signals:

/opt/wasi-sdk-20.0/share/wasi-sysroot/include/signal.h:2:2: error: "wasm lacks
signal support; to enable minimal signal emulation, compile with
-D_WASI_EMULATED_SIGNAL and link with -lwasi-emulated-signal"

Let’s follow the advice, enable the emulation and see how far this will take us.

-MYCFLAGS=
-MYLDFLAGS=
+MYCFLAGS=--target=wasm32-wasi -D_WASI_EMULATED_SIGNAL
+MYLDFLAGS=-lwasi-emulated-signal

That allowed for some further progress; however, there’s another roadblock which seems to be more difficult to deal with.

/opt/wasi-sdk-20.0/bin/clang --sysroot=/opt/wasi-sdk-20.0/share/wasi-sysroot
-O2 -Wall -Wextra -DLUA_COMPAT_5_3 -DLUA_USE_LINUX --target=wasm32-wasi
-D_WASI_EMULATED_SIGNAL   -c -o ldo.o ldo.c ldo.c:13:10: fatal error:
'setjmp.h' file not found
#include <setjmp.h>

Lua implements error handling with setjmp/longjmp when compiled with a C compiler. The WASI-SDK I’m using doesn’t support setjmp/longjmp and doesn’t provide the required header. This could be worked around by compiling with a C++ compiler, since error handling is then done with exceptions. Let’s try that.

Now the build almost completes, but it fails during the linking stage.

clang-16: warning: argument unused during compilation: '-shared' [-Wunused-command-line-argument]
/opt/wasi-sdk-20.0/bin/clang++ --sysroot=/opt/wasi-sdk-20.0/share/wasi-sysroot
-o luac  -lwasi-emulated-signal luac.o liblua.a -lm -Wl,-E -ldl 
wasm-ld: error: unknown argument: -soname
wasm-ld: error: cannot open liblua.so.5.4: No such file or directory
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
make[3]: *** [Makefile:64: liblua.so] Error 1
make[3]: *** Waiting for unfinished jobs....
wasm-ld: error: liblua.a(ldo.o): undefined symbol: __cxa_allocate_exception
wasm-ld: error: liblua.a(ldo.o): undefined symbol: __cxa_throw
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
make[3]: *** [Makefile:73: luac] Error 1
wasm-ld: error: liblua.a(ldo.o): undefined symbol: __cxa_allocate_exception
wasm-ld: error: liblua.a(ldo.o): undefined symbol: __cxa_throw
wasm-ld: error: liblua.a(ltablib.o): undefined symbol: clock
wasm-ld: error: liblua.a(liolib.o): undefined symbol: tmpfile
wasm-ld: error: liblua.a(ltablib.o): undefined symbol: clock
wasm-ld: error: liblua.a(loslib.o): undefined symbol: system
wasm-ld: error: liblua.a(loslib.o): undefined symbol: tmpnam

There’s a host of interesting warnings as well.

/opt/wasi-sdk-20.0/share/wasi-sysroot/include/stdio.h:152:37: note: 'tmpnam' has been explicitly marked deprecated here
char *tmpnam(char *) __attribute__((__deprecated__("tmpnam is not defined on WASI")));

loslib.c:184:34: warning: 'clock' is deprecated: WASI lacks process-associated
clocks; to enable emulation of the `clock` function using the wall clock,
which isn't sensitive to whether the program is running or suspended, compile
with -D_WASI_EMULATED_PROCESS_CLOCKS and link with
-lwasi-emulated-process-clocks [-Wdeprecated-declarations]  

Some of these problems can be dealt with or worked around; however, the linker errors reveal a lack of exception support, and this is a showstopper. The original WASM specification has no exception handling. There’s a proposal that adds exceptions to WASM, but the wasi-sdk 20 toolchain used here doesn’t ship the C++ runtime support for it, hence the undefined __cxa_allocate_exception and __cxa_throw symbols. With this toolchain, we’ve reached a dead end.

Enter, Emscripten

Emscripten is a comprehensive solution providing both the toolchain to port C/C++ code to WASM and ports of the most commonly used libraries. It builds on top of WASI to extend the support for libc and the standard C++ library, providing a more complete and robust experience. Additionally, it automatically generates JS polyfills wherever required, provides support for dynamic linking (by introducing side modules) and does much more. It implements an embedded FS layer as well.

Let’s purge all that’s been done so far and start again with emscripten.

emscripten provides wrappers for build systems to make the integration seamless. In the case of a simple make-based build system, it’s just emmake. Modifications to the Makefile are minimal:

diff --git a/src/Makefile b/src/Makefile
index 6fac473..dd9cc35 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -6,13 +6,12 @@
 # Your platform. See PLATS for possible values.
 PLAT= guess
 
-CC= gcc -std=gnu99
 CFLAGS= -O2 -Wall -Wextra -DLUA_COMPAT_5_3 $(SYSCFLAGS) $(MYCFLAGS)
 LDFLAGS= $(SYSLDFLAGS) $(MYLDFLAGS)
 LIBS= -lm $(SYSLIBS) $(MYLIBS)
 
-AR= ar rcu
-RANLIB= ranlib
+AR= emar rcu
+RANLIB= emranlib
 RM= rm -f
 UNAME= uname
 
@@ -21,7 +20,7 @@ SYSLDFLAGS=
 SYSLIBS=
 
 MYCFLAGS=
-MYLDFLAGS=
+MYLDFLAGS=-sNODERAWFS
 MYLIBS=
 MYOBJS=

I’ve basically just replaced the ar and ranlib tools with their emscripten equivalents. Additionally, I’ve enabled direct access to the host filesystem through node.js (NODERAWFS), which means that emscripten will provide IO routines that map directly to node’s fs module.
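To give a rough feel for what "mapping to node’s fs module" means, here’s a minimal sketch of reading a file through node’s synchronous open/read/close calls. This is not emscripten’s actual code; readHostFile, the temporary directory name and hello.lua are all made up for the illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read a whole host file using only open/read/close style
// calls, roughly the shape of the calls a C program makes through stdio.
function readHostFile(hostPath) {
  const fd = fs.openSync(hostPath, 'r');
  try {
    const chunks = [];
    const buffer = Buffer.alloc(4096);
    let bytesRead;
    // position = null means "continue from the current file position"
    while ((bytesRead = fs.readSync(fd, buffer, 0, buffer.length, null)) > 0) {
      chunks.push(Buffer.from(buffer.subarray(0, bytesRead))); // copy, buffer is reused
    }
    return Buffer.concat(chunks).toString('utf8');
  } finally {
    fs.closeSync(fd);
  }
}

// Usage: write a small Lua script to a temporary directory and read it back.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lua-wasm-'));
const scriptPath = path.join(dir, 'hello.lua');
fs.writeFileSync(scriptPath, 'print("hello from lua")\n');
const scriptText = readHostFile(scriptPath);
fs.rmSync(dir, { recursive: true, force: true });
console.log(scriptText);
```

The point is only that every call is synchronous and goes straight to the host, which is exactly what blocking C IO expects.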

Building is trivial.

emmake make

Surprisingly, it succeeded on the first attempt. Aside from the WASM module, the resulting target file is a JavaScript file.

$ file src/lua
src/lua: JavaScript source, ASCII text, with very long lines (338)

$ file src/lua.wasm
src/lua.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)

It’s worth taking a closer look at the imports.

$ wasm2wat src/lua.wasm | grep import
  (import "env" "abort" (func (;0;) (type 12)))
  (import "env" "invoke_vii" (func (;1;) (type 5)))
  (import "env" "strftime" (func (;2;) (type 3)))
  (import "env" "system" (func (;3;) (type 0)))
  (import "env" "exit" (func (;4;) (type 6)))
  (import "wasi_snapshot_preview1" "fd_close" (func (;5;) (type 0)))
  (import "env" "emscripten_memcpy_js" (func (;6;) (type 5)))
  (import "env" "emscripten_date_now" (func (;7;) (type 22)))
  (import "env" "_emscripten_get_now_is_monotonic" (func (;8;) (type 7)))
  (import "env" "emscripten_get_now" (func (;9;) (type 22)))
  (import "env" "emscripten_get_now_res" (func (;10;) (type 22)))
  (import "env" "__syscall_openat" (func (;11;) (type 3)))
  ...

Aside from the imports from the wasi_snapshot_preview1 object (which are WASI specific), there’s a noticeable number of imports from the env object. These are all emscripten polyfills.

The JavaScript file that emscripten produced contains the code for all of them.
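One import in that list, invoke_vii, is worth a short detour, because it hints at how this build got past the setjmp/longjmp roadblock: by default emscripten routes calls that might longjmp through small JavaScript wrappers that use try/catch. Below is a heavily simplified sketch of the idea. It is not the generated code; functionTable, stackPointer, threw and LongjmpSignal are stand-ins I made up for state that really lives in the WASM module and the glue code.

```javascript
// Hypothetical stand-ins for state that really lives inside the WASM module.
const functionTable = [];   // stands in for the module's function table
let stackPointer = 0;       // stands in for the module's stack pointer
let threw = 0;              // flag the compiled code inspects after the call

// Hypothetical marker for "a longjmp is in flight".
class LongjmpSignal extends Error {}

// "vii": returns void, takes two integer arguments (plus the table index).
function invoke_vii(index, a1, a2) {
  const savedStack = stackPointer;               // remember where the stack was
  try {
    functionTable[index](a1, a2);                // call the target through the table
  } catch (e) {
    stackPointer = savedStack;                   // drop the frames the call left behind
    if (!(e instanceof LongjmpSignal)) throw e;  // unrelated errors keep propagating
    threw = 1;                                   // tell the caller that a jump happened
  }
}

// Usage: a table entry that grows the stack and then "longjmps".
functionTable[0] = (a, b) => {
  stackPointer += a + b;
  throw new LongjmpSignal();
};
invoke_vii(0, 8, 8);
console.log(threw, stackPointer); // 1 0
```

So the non-local jump is carried by a JavaScript exception, and the compiled code only needs to check a flag after the call returns.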

Let’s have a look at one. Picking strftime arbitrarily: the WASM module wants to import it from the env object.

Inside lua (which is a JavaScript file) there’s an implementation of the function (which, to be honest, is quite complex):

var _strftime = (s, maxsize, format, tm) => {
  ...
};

This is mapped in the wasmImports object:

var wasmImports = {
  ...
  /** @export */
  strftime: _strftime,
  ...
};

This is used as the import object when instantiating the WASM module:

function createWasm() {
  // prepare imports
  var info = {
    'env': wasmImports,
    'wasi_snapshot_preview1': wasmImports,
  };

  ...

  instantiateAsync(wasmBinary, wasmBinaryFile, info, receiveInstantiationResult);
  return {}; // no exports yet; we'll fill them in later
}


function instantiateAsync(binary, binaryFile, imports, callback) {
    ...
    return fetch(binaryFile, { credentials: 'same-origin' }).then((response) => {
      ...
      var result = WebAssembly.instantiateStreaming(response, imports);
      ...
      return result;
      ...
    });
    ...
}

For brevity, I’ve removed all non-essential code.
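The same mechanism can be tried in isolation. Here’s a minimal sketch with a tiny hand-assembled WASM module that imports a single function from env and calls it. The module bytes, the twice import and the run export are all made up for this example; only the shape of the import object mirrors the generated code above. I’m using the synchronous WebAssembly.Module/WebAssembly.Instance constructors to keep it short, which is fine for a module this small.

```javascript
// A hand-assembled WASM module, written out byte by byte (hypothetical example).
// It imports env.twice (i32 -> i32) and exports run (i32 -> i32), which just
// forwards its argument to the import.
const moduleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,        // magic "\0asm", version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,        // type section: (i32) -> i32
  0x02, 0x0d, 0x01,                                      // import section, one entry
  0x03, 0x65, 0x6e, 0x76,                                //   module name "env"
  0x05, 0x74, 0x77, 0x69, 0x63, 0x65,                    //   field name "twice"
  0x00, 0x00,                                            //   kind: function, type 0
  0x03, 0x02, 0x01, 0x00,                                // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,  // export "run" = function 1
  0x0a, 0x08, 0x01, 0x06, 0x00,                          // code section, one body, no locals
  0x20, 0x00, 0x10, 0x00, 0x0b,                          //   local.get 0; call 0; end
]);

// Same layout as in the generated glue: a flat object of functions,
// exposed to the module under the "env" namespace.
const wasmImports = { twice: (x) => 2 * x };
const info = { env: wasmImports };

const wasmModule = new WebAssembly.Module(moduleBytes);
const instance = new WebAssembly.Instance(wasmModule, info);
console.log(instance.exports.run(21)); // 42
```

If twice is missing from the import object, instantiation fails, which is why the generated wrapper has to carry an implementation for every single import the module declares.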

Additionally, emscripten produces a universal JavaScript wrapper which means that the same file can be used both in the browser and with node.js.
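A wrapper like that has to figure out at runtime where it’s running. Here’s a rough sketch of that kind of detection, assuming the usual feature checks (a process object with a node version, an importScripts function, a window object). detectEnvironment and the strings it returns are my own names, and the real wrapper’s checks and their order may differ.

```javascript
// Hypothetical helper: classify the host by probing a global-like object.
// Passing the object in (instead of reading globals directly) keeps it testable.
function detectEnvironment(g) {
  const isNode =
    typeof g.process === 'object' && g.process !== null &&
    typeof g.process.versions === 'object' && g.process.versions !== null &&
    typeof g.process.versions.node === 'string';
  if (isNode) return 'node';                                // fs module, process.stdin, ...
  if (typeof g.importScripts === 'function') return 'worker'; // web worker
  if (typeof g.window === 'object' && g.window !== null) return 'web'; // regular page
  return 'shell';                                           // anything else
}

console.log(detectEnvironment(globalThis));
```

Based on the result, the wrapper can pick how to load the .wasm file (reading it from disk versus fetching it) and how to wire up input and output.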

Sure enough, trying out with node.js, it runs without any problems at all:

$ node src/lua
Lua 5.4.6  Copyright (C) 1994-2023 Lua.org, PUC-Rio
> print("hello from lua") 
hello from lua

Running in the browser?

As I said, the generated JavaScript wrapper is directly usable in the browser. An HTML wrapper is still required, though. emscripten can generate it for us as well: just add the .html extension to the target in the Makefile.

diff --git a/src/Makefile b/src/Makefile
index 6fac473..45e7c31 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -6,13 +6,12 @@
 
...
 
-LUA_T= lua
+LUA_T= lua.html
 LUA_O= lua.o

emscripten even provides a convenient way to run HTML apps, so after rebuilding, simply use emrun:

emrun src/lua.html

Straight away, it seems like there’s a bit of a problem though.

[Screenshot: /wasm/wasm_cpp_05/screenshot.png]

Yep, there’s no interactive terminal, so it tries to obtain input using JS prompt popup windows. This is bad.

How to fix that? Well, since this post is already getting quite long, I’m gonna leave this one as a cliffhanger and continue in the next one. Long story short, it’s all about blocking IO (or rather the lack of it).