Sharing structs between JavaScript and native code

24 Sep 2018

Structuring Data Using Structs

When writing native modules in C/C++, it is often useful to structure data using structs. A struct is a low-level data schema available in C/C++. It acts as an encoding/decoding scheme, similar to Protocol Buffers and many other data encoding schemes, for reading and writing the properties specified in the struct declaration.

Plain Text

#include <stdio.h>
#include <stdint.h>

// declare a struct with two properties
struct test {
  uint32_t an_unsigned_num;
  int32_t a_signed_num;
};

int main () {
  // instantiate a new struct
  struct test an_instance;

  // write some properties
  an_instance.a_signed_num = -42;
  an_instance.an_unsigned_num = 42;

  // read them out again
  printf("a_signed_num=%i, an_unsigned_num=%u\n",
    an_instance.a_signed_num,
    an_instance.an_unsigned_num
  );

  return 0;
}

Save the example as program.c, then compile and run it using gcc:

Plain Text

gcc program.c -o program
./program

Behind the scenes, structs are just syntactic sugar for calculating the byte offset of each property specified in the struct declaration. Reading/writing a property value is sugar for reading/writing the value in a buffer at that exact offset. In fact, the struct itself is just a buffer that has enough space to hold all the properties. Since we know the type of each property (e.g. int32_t) and each type has a fixed size, these offsets aren’t too complicated to calculate.

Looking at the struct from the example above, let’s try calculating the offsets and the total byte size needed to store all the values.

Plain Text

struct test {
  uint32_t an_unsigned_num;
  int32_t a_signed_num;
};

The first property, an_unsigned_num, has offset 0 and the size of its type, uint32_t, is 4 bytes. This means the offset of the second property is 4, and since the size of int32_t is also 4, the total struct size is 8.

We can verify this using another C program:

Plain Text

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

struct test {
  uint32_t an_unsigned_num;
  int32_t a_signed_num;
};

int main () {
  // We can use offsetof and sizeof to verify our above calculation.
  // Both produce a size_t, which is printed with %zu
  printf("offset of an_unsigned_num=%zu\n",
    offsetof(struct test, an_unsigned_num));
  printf("offset of a_signed_num=%zu\n",
    offsetof(struct test, a_signed_num));
  printf("total size of struct=%zu\n",
    sizeof(struct test));
  return 0;
}

Compiling and running this program should print offsets of 0 and 4 and a total size of 8, matching our calculation above.
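To make the "a struct is just a buffer" idea concrete, here is a minimal sketch that does by hand in Node.js what the compiler does for struct test. The constant and variable names are made up for illustration, and the sketch assumes a little-endian layout, which is what most desktop and server CPUs use.

```javascript
// Manual version of `struct test` on top of a Node.js Buffer.
// Offsets and size come from the calculation above; little-endian is assumed.
const OFFSET_AN_UNSIGNED_NUM = 0
const OFFSET_A_SIGNED_NUM = 4
const SIZEOF_TEST = 8

// "instantiate" the struct: a zero-filled buffer with room for both properties
const instance = Buffer.alloc(SIZEOF_TEST)

// write some properties
instance.writeInt32LE(-42, OFFSET_A_SIGNED_NUM)
instance.writeUInt32LE(42, OFFSET_AN_UNSIGNED_NUM)

// read them out again
const aSignedNum = instance.readInt32LE(OFFSET_A_SIGNED_NUM)
const anUnsignedNum = instance.readUInt32LE(OFFSET_AN_UNSIGNED_NUM)
console.log(`a_signed_num=${aSignedNum}, an_unsigned_num=${anUnsignedNum}`)
// prints "a_signed_num=-42, an_unsigned_num=42"
```

Every property access in the C program corresponds to one read or write at a fixed offset here.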

Type alignment

If the types do not all have the same size, it’s a little more complicated than that. If we use a bigger number type such as int64_t, which is 8 bytes, something interesting happens to our offsets.

Plain Text

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

struct test {
  uint32_t an_unsigned_num;
  // Use an int64_t instead of int32_t, which is 8 bytes instead of 4
  int64_t a_signed_num;
};

int main () {
  // We can use offsetof and sizeof to check the new layout.
  // Both produce a size_t, which is printed with %zu
  printf("offset of an_unsigned_num=%zu\n",
    offsetof(struct test, an_unsigned_num));
  printf("offset of a_signed_num=%zu\n",
    offsetof(struct test, a_signed_num));
  printf("total size of struct=%zu\n",
    sizeof(struct test));
  return 0;
}

The output of the program now changes to

Plain Text

offset of an_unsigned_num=0
offset of a_signed_num=8
total size of struct=16

Notice how the offset of the property we changed to int64_t is now 8 instead of 4, even though we didn’t change the size of the previous property? That is because of something called type alignment. Your computer reads and writes memory in fixed-size chunks behind the scenes. To avoid storing half of a number in one chunk and the other half in the next, it prefers storing numbers that have a byte size of 8 at offsets that are divisible by 8. Similarly, it prefers to store numbers of byte size 4 at offsets that are divisible by 4. This number is called the type alignment. In our example, the compiler leaves 4 unused padding bytes after an_unsigned_num so that a_signed_num starts at offset 8.

The total struct size is also always divisible by the largest type alignment in the struct. That’s why the total size of the struct in the above example is 16: it is divisible by 8 and large enough to contain all the properties.
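The alignment rule can be sketched in a few lines of JavaScript. This is only an illustration: the names TYPE_SIZES, alignUp and computeLayout are made up for this sketch, and it assumes that each integer type's alignment equals its size, which holds on common 64-bit platforms but is ultimately decided by the compiler and target.

```javascript
// Size in bytes of some fixed-width integer types.
// Assumption: each type's alignment equals its size.
const TYPE_SIZES = {
  int8_t: 1, uint8_t: 1,
  int16_t: 2, uint16_t: 2,
  int32_t: 4, uint32_t: 4,
  int64_t: 8, uint64_t: 8
}

// Round `offset` up to the next multiple of `alignment`
function alignUp (offset, alignment) {
  return Math.ceil(offset / alignment) * alignment
}

// `fields` is an array of [type, name] pairs in declaration order
function computeLayout (fields) {
  const offsets = {}
  let offset = 0
  let maxAlignment = 1

  for (const [type, name] of fields) {
    const size = TYPE_SIZES[type]
    if (size === undefined) throw new Error(`Unknown type: ${type}`)
    offset = alignUp(offset, size) // skip padding bytes if needed
    offsets[name] = offset
    offset += size
    maxAlignment = Math.max(maxAlignment, size)
  }

  // pad the end so the total size is divisible by the largest alignment
  return { offsets, size: alignUp(offset, maxAlignment) }
}

const layoutA = computeLayout([
  ['uint32_t', 'an_unsigned_num'],
  ['int32_t', 'a_signed_num']
])
const layoutB = computeLayout([
  ['uint32_t', 'an_unsigned_num'],
  ['int64_t', 'a_signed_num']
])

console.log(layoutA) // offsets 0 and 4, size 8
console.log(layoutB) // offsets 0 and 8, size 16
```

The two layouts match the output of the C programs above for both versions of struct test.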

Using structs from JavaScript

So how do structs relate to JavaScript? If we’re writing a native module, we can use them to store properties in Node.js buffers. This is very useful as it allows us to allocate memory in Node.js but still use it in C/C++ programs, without the need for malloc and other C/C++ ways of organising memory. If we allocate in JavaScript, we can also take advantage of the garbage collector running there, which makes memory leaks much easier to avoid.

Let’s look at an example of this, using N-API and the napi-macros module:

Plain Text

#include <node_api.h>
// we are using the napi-macros npm module to make the native add-on easy to write
// do a npm install napi-macros to install this module
#include <napi-macros.h>

struct test {
  uint32_t an_unsigned_num;
  int32_t a_signed_num;
};

// This defines a function we can call from JavaScript
NAPI_METHOD(do_stuff) {
  // This method has 1 argument
  NAPI_ARGV(1)
  // This takes the first argument and interprets it as struct test
  NAPI_ARGV_BUFFER_CAST(struct test *, s, 0)

  // Let's mutate the struct
  s->an_unsigned_num++;
  s->a_signed_num--;

  // And return one of the properties to js
  NAPI_RETURN_INT32(s->a_signed_num)
}

NAPI_INIT() {
  // Export the function from the binding
  NAPI_EXPORT_FUNCTION(do_stuff)
  // Also export the size of the struct so we can allocate it in JavaScript
  NAPI_EXPORT_SIZEOF_STRUCT(test)
}

Try saving the above file as example.c. Then to compile we need to have a gyp file as well

Plain Text

{
  "targets": [{
    "target_name": "example",
    "include_dirs": [
      "<!(node -e \"require('napi-macros')\")"
    ],
    "sources": [
      "example.c"
    ]
  }]
}

Save the above file as binding.gyp.

To compile the add-on, install napi-macros and run node-gyp

Plain Text

npm install napi-macros
node-gyp rebuild

We can now load this add-on from JavaScript and allocate the struct there, pass it to our add-on and mutate it in C.

Plain Text

// Load the add-on
const addon = require('./build/Release/example.node')

// Allocate the struct. We exported the size from the add-on
// Buffer.alloc zeros the buffer, which is a good idea for struct defaults
const buf = Buffer.alloc(addon.sizeof_test)

// Call the do_stuff function and pass our struct buffer to C
let signedNumber = addon.do_stuff(buf)

console.log('struct.a_signed_num is', signedNumber)
// Call do_stuff again. It should return -2 now since we mutated the struct in C twice
signedNumber = addon.do_stuff(buf)

console.log('struct.a_signed_num is', signedNumber)

We can even log out the buffer using console.log and see that it updates every time we call addon.do_stuff.

Plain Text

// Will print the buffer that is representing the raw memory of the struct
console.log(buf)
addon.do_stuff(buf)
console.log(buf)
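If you don’t have the add-on compiled yet, the following sketch shows what to expect from those logs. The function doStuffInJs is a hypothetical pure JavaScript stand-in for the C do_stuff function, written only for illustration, and it assumes a little-endian machine and the 8 byte layout calculated earlier.

```javascript
// Hypothetical JavaScript stand-in for the C do_stuff function.
// Assumes little-endian byte order, an_unsigned_num at offset 0
// and a_signed_num at offset 4.
function doStuffInJs (buf) {
  // s->an_unsigned_num++ (">>> 0" keeps the value in the uint32 range)
  buf.writeUInt32LE((buf.readUInt32LE(0) + 1) >>> 0, 0)
  // s->a_signed_num-- ("| 0" keeps the value in the int32 range)
  const signedNum = (buf.readInt32LE(4) - 1) | 0
  buf.writeInt32LE(signedNum, 4)
  return signedNum
}

const sketchBuf = Buffer.alloc(8)

doStuffInJs(sketchBuf)
console.log(sketchBuf) // <Buffer 01 00 00 00 ff ff ff ff>

doStuffInJs(sketchBuf)
console.log(sketchBuf) // <Buffer 02 00 00 00 fe ff ff ff>
```

The first four bytes hold the unsigned counter, and the last four hold the signed number in two’s complement form, which is why -1 shows up as ff ff ff ff.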

Node.js: npm install shared-structs

So we can allocate buffers in JavaScript and pass them to a C add-on and mutate them there. What if we could read/write the values from JavaScript as well? There is an npm module for that. It’s called shared-structs and it implements the offset/size calculations described above in the post.

It’s pretty straightforward to use. Simply pass the struct declaration to the JavaScript constructor and it’ll generate a JavaScript object exposing the same properties as the struct.

Plain Text

const sharedStructs = require('shared-structs')

// parse the struct and generate a js object constructor
const structs = sharedStructs(`
  struct test {
    uint32_t an_unsigned_num;
    int32_t a_signed_num;
  };
`)

// allocate a new struct
const test = structs.test()

// use it as a normal js object
test.an_unsigned_num = -1
console.log(test.an_unsigned_num)

The shared-structs module uses TypedArrays behind the scenes to encode/decode the values. TypedArrays, such as a Uint32Array, store values in the platform’s native byte order, the same way native code does, which is useful when implementing structs in JavaScript.
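As a rough illustration of the idea, here is a hand-written sketch of a struct object backed by TypedArray views. This is not the shared-structs implementation. The helper makeTestStruct is hypothetical, and the offsets are hard-coded for struct test rather than parsed from a declaration.

```javascript
// Hypothetical helper, for illustration only (not how shared-structs is written).
function makeTestStruct () {
  // 8 bytes of raw memory, as calculated for `struct test`
  const memory = new ArrayBuffer(8)

  // One typed array view per property, each at the property's byte offset
  const unsignedView = new Uint32Array(memory, 0, 1)
  const signedView = new Int32Array(memory, 4, 1)

  return {
    // Buffer.from(arrayBuffer) shares the memory instead of copying it
    rawBuffer: Buffer.from(memory),
    get an_unsigned_num () { return unsignedView[0] },
    set an_unsigned_num (value) { unsignedView[0] = value },
    get a_signed_num () { return signedView[0] },
    set a_signed_num (value) { signedView[0] = value }
  }
}

const sketch = makeTestStruct()

// Writing -1 to a Uint32Array wraps around, like assigning -1 to a uint32_t in C
sketch.an_unsigned_num = -1
sketch.a_signed_num = -42

console.log(sketch.an_unsigned_num) // 4294967295
console.log(sketch.a_signed_num) // -42
console.log(sketch.rawBuffer)
```

Because the views and the buffer share the same memory, any write through a property is immediately visible in rawBuffer, and the other way around.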

To print out the underlying buffer that the struct stores its data in, access test.rawBuffer:

Plain Text

console.log(test.rawBuffer)

In fact we can pass this buffer to C, to manipulate it there. Let’s update our previous example to use structs in JavaScript.

Plain Text

const addon = require('./build/Release/example.node')
const sharedStructs = require('shared-structs')

// parse the struct and generate a js object constructor
const structs = sharedStructs(`
  struct test {
    uint32_t an_unsigned_num;
    int32_t a_signed_num;
  };
`)

// allocate a new struct
const test = structs.test()

// Call the addon, but pass test.rawBuffer as the struct
addon.do_stuff(test.rawBuffer)

// Now we can read out the values in JavaScript as well!
console.log(test.an_unsigned_num)
console.log(test.a_signed_num)

We can also write the values in JavaScript before passing it to C.

Plain Text

// Write a value to the struct and call the add-on
test.a_signed_num = -100
addon.do_stuff(test.rawBuffer)

// Will print -101 since the addon decrements this property
console.log(test.a_signed_num)

Values that are impossible to represent in JavaScript, such as pointers, are simply skipped in the JavaScript struct but are still stored in the buffer.

An interesting side effect of parsing the structs in JavaScript is that it can reduce the number of calls you need to make to your native add-on. Every call to a native add-on is much more expensive than calling a normal JavaScript function, since it has to cross the boundary between the JavaScript VM and the native code, which requires certain checks. Reading the struct in JavaScript happens entirely inside the JavaScript VM and can therefore be much faster if you have to read/write properties a lot. This kind of memory mapping is actually used by Node.js core itself to speed up interactions between C++ and JavaScript.

The shared-structs module even works with nested structs, arrays, defines, constant expressions and more!

I hope you find it as useful as I have, in making it easier to write native add-ons in Node.js.

