#3 WebAssembly and C++: Passing strings between C++ and JavaScript
This post is part of a WebAssembly series focused on WASM and C++. The goal is to gain a thorough understanding of how WebAssembly works, how to use it as a compilation target for C++ code and hopefully have fun along the way. So, stick with me for this exciting journey.
Wherever mentioned, working WASM examples will be embedded directly on the page. If your browser supports it, you should be able to see them running.
Interoperability
So far, the data types I’ve passed between the WASM module and JavaScript have been extremely simple. In fact, I could count them on the fingers of one hand. To be specific, the types used were:
unsigned
float
unsigned*
That’s it! If your application does most of its work on the WASM side and the API it exposes is simple, this might be sufficient, but in reality it rarely will be. We need to learn how to exchange strings and structured data.
Memory
It’s worth remembering that we’re dealing with two distinct memory systems here. WASM has its own memory area, separate from JavaScript’s. You can’t use JavaScript’s memory directly in WASM. Similarly, WASM memory is not directly useful in JavaScript: it is exposed only as a raw buffer of bytes.
The same principle applies to memory management. WASM memory must be managed exclusively by the WASM module and, by the same token, the WASM module cannot manage JavaScript’s memory in any shape or form.
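To make the separation a bit more concrete, here is a minimal sketch of what WASM memory looks like from the JavaScript side. It assumes a standalone WebAssembly.Memory as a stand-in for a module’s exported memory, and the offset 16 is an arbitrary value picked for illustration.

```javascript
// Stand-in for a module's exported memory: a single 64 KiB page.
const memory = new WebAssembly.Memory({ initial: 1 });

// JavaScript sees WASM memory only as a raw ArrayBuffer of bytes.
const bytes = new Uint8Array(memory.buffer);

// A WASM "pointer" is just a byte offset into that buffer.
const ptr = 16; // arbitrary offset chosen for illustration
bytes[ptr] = 42;

console.log(memory.buffer.byteLength); // size of one page in bytes
console.log(bytes[ptr]);               // the byte stored at the "pointer"
```

Nothing in the buffer carries type information, which is why any value more complex than a number has to be decoded explicitly.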
Passing strings between C++ and JavaScript
Decoding WASM strings
The first step is to create a very simple WASM module that exports a function, str_ret, returning a static string. If I call str_ret from within JavaScript, it’s just gonna return a pointer into the WASM module’s memory.
Using this pointer and a handle to the WASM module’s memory (wasm.instance.exports.memory), the string has to be recreated on the JavaScript side. To do that, the string’s length has to be known as well; this is the reason for the str_len function, which I also implemented in C++ (I can’t use strlen since I’m still operating in standalone mode, without the C/C++ standard library).
I need an instance of TextDecoder to perform the conversion. Its decode method needs a buffer; the easiest way is to provide a DataView that covers exactly the string’s bytes in the module’s memory.
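The conversion can be sketched roughly as follows. This is not the original listing: decodeString is a hypothetical helper name, and a standalone WebAssembly.Memory with the bytes of "hello" placed at an arbitrary offset stands in for the compiled module.

```javascript
// Rebuild a JavaScript string from `len` bytes starting at `ptr` in WASM memory.
function decodeString(memory, ptr, len) {
  // The DataView is a window over exactly the string's bytes.
  const view = new DataView(memory.buffer, ptr, len);
  return new TextDecoder('utf-8').decode(view);
}

// Stand-in for the module: put the ASCII bytes of "hello" at offset 16.
const memory = new WebAssembly.Memory({ initial: 1 });
const ptr = 16; // arbitrary offset chosen for illustration
new Uint8Array(memory.buffer).set([0x68, 0x65, 0x6c, 0x6c, 0x6f], ptr);

console.log(decodeString(memory, ptr, 5));
```

With the real module, memory would be wasm.instance.exports.memory, the pointer would come from str_ret and the length from str_len.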
Encoding JavaScript strings
Passing strings from JavaScript back to WASM works in much the same way. For the purpose of this example, I’ll implement a simple function in C++ that counts the digits within a string.
I’ll need an instance of TextEncoder to encode the JavaScript string into an array of UTF-8 bytes.
But… there’s a problem. I can’t just write data wherever I want in WASM memory. If I had access to malloc and some sort of heap management facilities, this would be simple, since I could just ask the WASM module to allocate memory for me to use. In standalone mode, it’s not that easy.
The workaround, good enough for the sake of this contrived example, is to explicitly allocate more memory for the WASM module.
This is described on the Memory.grow page.
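As a rough sketch of that workaround, the snippet below grows a memory by one page and computes where the new page starts. It assumes a standalone WebAssembly.Memory in place of the module’s exported memory; PAGE_SIZE and scratchPtr are names made up for this illustration.

```javascript
const PAGE_SIZE = 64 * 1024; // WebAssembly pages are 64 KiB

// Stand-in for the module's exported memory, starting at one page.
const memory = new WebAssembly.Memory({ initial: 1 });

// grow() adds pages and returns the previous size, in pages.
const previousPages = memory.grow(1);

// First byte of the newly added page. The module was built without
// knowing about this region, so it can serve as scratch space.
const scratchPtr = previousPages * PAGE_SIZE;

console.log(previousPages, scratchPtr, memory.buffer.byteLength);
```

One thing to watch out for: growing detaches the ArrayBuffer obtained earlier, so memory.buffer has to be read again after every call to grow.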
I hope it’s becoming quite apparent that, in the long run, this approach won’t scale and is applicable only to a narrow set of use cases. Despite that, let’s continue.
I’ve got the memory; it’s time to write data to it.
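Writing the string could look something like the sketch below. It is not the original listing: writeString is a hypothetical helper, and a standalone WebAssembly.Memory grown by one page stands in for the module’s memory.

```javascript
// Copy `text` into WASM memory at `ptr` as UTF-8; returns the byte length.
function writeString(memory, ptr, text) {
  const encoded = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  new Uint8Array(memory.buffer).set(encoded, ptr);
  return encoded.length;
}

// Stand-in for the module's memory, grown by one page to get scratch space.
const memory = new WebAssembly.Memory({ initial: 1 });
const ptr = memory.grow(1) * 64 * 1024; // start of the newly added page

const len = writeString(memory, ptr, 'abc123');

// The pointer and the byte length are what the C++ side would receive.
console.log(ptr, len);
```

With the real module, ptr and len would then be passed to the exported digit-counting function. Note that len is the number of encoded bytes, which differs from the JavaScript string length for non-ASCII text.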
Structured data
What about structured data, like classes or structs, passed between C++ and JS?
In short, the same principles apply as for strings. Whatever is returned from WASM is an opaque handle to JavaScript and has to be converted to JavaScript objects somehow. Given a C++ function makePair that returns a Pair, invoking makePair from JavaScript will return a … pointer. Yep, it doesn’t matter if you’re returning by value or explicitly by pointer. makePair returns a pointer to a fragment of WASM memory representing a Pair. JavaScript knows nothing about this data structure. There’s no way to handle it directly or to assume its internal layout. To convert it to a JavaScript object, we’d need functions in C++ that allow access to the data, since Pair itself, in JavaScript, is just an opaque handle.
Such accessor functions can later be used from JavaScript to rebuild the object.
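Here is a minimal sketch of the accessor pattern on the JavaScript side. Since the compiled module isn’t available here, exportsStub is a plain JavaScript stand-in for its exports; the accessor names pairFirst and pairSecond, the helper pairToObject and the two-int32 layout are all assumptions made for this illustration.

```javascript
// Stand-in for the compiled module's exports. In the real setup these
// would be C++ functions; names and layout here are hypothetical.
const memory = new WebAssembly.Memory({ initial: 1 });
const exportsStub = {
  makePair(first, second) {
    const ptr = 16; // arbitrary address chosen for illustration
    const view = new DataView(memory.buffer);
    view.setInt32(ptr, first, true); // WASM memory is little-endian
    view.setInt32(ptr + 4, second, true);
    return ptr; // JavaScript only ever sees this number
  },
  pairFirst: (ptr) => new DataView(memory.buffer).getInt32(ptr, true),
  pairSecond: (ptr) => new DataView(memory.buffer).getInt32(ptr + 4, true),
};

// Convert the opaque handle into a plain object using only the accessors.
function pairToObject(exports, handle) {
  return {
    first: exports.pairFirst(handle),
    second: exports.pairSecond(handle),
  };
}

const handle = exportsStub.makePair(3, 7);
const pair = pairToObject(exportsStub, handle);
console.log(handle, pair);
```

The point is that pairToObject never looks at the memory itself: it treats the handle as opaque and leaves the knowledge of the layout on the module side.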
Code examples
You can find the example code discussed here in a GitHub repository created for the purpose of this post.
Conclusion
Passing structured data between JS and C++ requires serialisation: something like Protocol Buffers, JSON or MessagePack. To get that working, the facilities that the C++ standard library provides are really a must. Therefore, in future instalments of this series, I’m gonna focus on the details of how to use it and how to build an integration layer between the two environments.