Tuning Node.js app performance with Autocannon and 0x
By Ron Litzenberger

Featuring two open source tools developed by Matteo Collina and David Mark Clements of nearForm.

Improving performance, whether of a database or a web app, requires good tools. They should be easy to use and help you identify the issue quickly. In this post, I am going to introduce two tools developed at nearForm that do just that to help improve your web app. Then we will use them to troubleshoot a RESTful API.

The Tools

autocannon

Autocannon is an HTTP/1.1 benchmarking tool developed by our very own Matteo Collina. It is written in Node.js and works on Linux, Mac, and Windows. It supports both HTTP and HTTPS.


To install globally run:

npm i -g autocannon

0x

0x (pronounced “zero ex”) was written by our very own David Mark Clements. It profiles a Node process and generates an interactive flame graph in a single command. It works on Linux and Mac.


To install globally:

npm i -g 0x

Node Version

In this post, we are going to target Node version 6. Using Node Version Manager, run the commands below:

nvm install v6
nvm use v6


Setup App

We are going to use our Slow Rest API to demo both autocannon and 0x. From our previous post, Building real-time dashboard, we know there is a bottleneck based on the Kibana dashboard below, and in this post, we’ll dive into what we can find.

[Kibana dashboard screenshot]

To setup:

git clone https://github.com/nearform/slow-rest-api.git
cd slow-rest-api
node index

In another terminal: npm test


Generating a flame graph

We need to make sure we’re using either Linux or Mac, and that we have superuser access, since 0x uses dtrace or perf. To start profiling, run:

0x index

(0x may ask us for sudo access; this is necessary since the underlying tracing tool, dtrace on Mac or perf on Linux, requires root access.)

In another terminal, as soon as possible, we run:

npm test

We can now hit CTRL-C in the first terminal where we ran 0x. The 0x process will then generate the flame graph in the form of an HTML file and provide the link to that file. When we open this link in a browser, we should see a flame graph similar to the following:

[Flame graph screenshot]

Reading Flame Graphs:

Each box in the graph represents a function, and the different colors represent the amount of time the function is on CPU. A dark red box is a hot function: more time was observed with it on top of the stack, which means it is blocking the event loop. We can then follow the child functions below it to see the code path that was taken.

Troubleshoot a performance issue:


When we run npm test, we are using autocannon to add load to the server. The script looks like so:

autocannon -d 20 -c 100 http://localhost:3000/a &
autocannon -d 20 -c 100 http://localhost:3000/b &
autocannon -d 20 -c 100 http://localhost:3000/c

From our demo above, we see the requests per second are at 252. Let’s see if we can improve the Req/Sec.

[autocannon output screenshot]

Now let’s take a look at the flame graph. The reddest box at the top points to etag.js line 23, which is reading through a cache object using the Array forEach method:

[Flame graph screenshot]

We can use the flame graph to follow the code path. In the graph above, we see the route http://localhost:3000/a from our autocannon run is the slowest. So from here we can troubleshoot that route and follow it to determine how to fix the bottleneck.

Since the forEach is a hot point in our flame graph, let’s rewrite it using a for loop and see if that improves performance.

Let’s rewrite the loop as follows:

[Code screenshot: loop rewritten with a for loop]
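The screenshot of the rewritten loop is not reproduced here, so a minimal sketch of the idea follows. The cache shape (an array of key/value entries) and the function name findEtag are assumptions for illustration, not the exact slow-rest-api code:

```javascript
// Hypothetical sketch: the etag cache lookup rewritten with a plain
// for loop instead of Array.prototype.forEach. A for loop avoids the
// per-element callback invocation that forEach incurs.
function findEtag (cache, key) {
  for (let i = 0; i < cache.length; i++) {
    if (cache[i].key === key) {
      return cache[i].value
    }
  }
  return null
}

const cache = [{ key: '/a', value: '"etag-a"' }]
console.log(findEtag(cache, '/a')) // '"etag-a"'
```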

After running autocannon on the change, we see our requests per second have improved to 386 on average, with throughput at 507 KB per second, which is better but still not great.

[autocannon output screenshot]

And this is our flame graph:

[Flame graph screenshot]

While this is better, we see that the request to the /a route is still “hot”, causing a bottleneck in the app.

Following the code path, we also see that Buffer.from and Buffer.toString are red. Let’s try removing the cache and see what happens.
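To see why those Buffer calls matter, here is a minimal illustration (assumed for this post, not the exact slow-rest-api code) of the round-trip the cache performs on every request:

```javascript
// Each request pays for two copies: Buffer.from allocates a new
// Buffer and copies the string into it, and toString copies the
// bytes back out into a fresh string. Done synchronously per request
// on a large payload, this work sits on the event loop.
const payload = JSON.stringify({ hello: 'world' })

const buf = Buffer.from(payload) // allocation + copy
const str = buf.toString()       // another copy back to a string

console.log(str === payload) // true
```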

Now when we run autocannon, we get a big improvement: requests per second are at 1867 on average, compared to 254 in the initial run. Our throughput is 2.59 MB per second on average.

[autocannon output screenshot]

And our flame graph looks like so:

[Flame graph screenshot]

By removing the etagCache check and generating the etag each time, we have shown a clear performance win. The reason is that the synchronous forEach (and later the for loop), along with Buffer.from and Buffer.toString() in our cache object, all showed up as red in our flame graph. We were also reading in a large JSON file each time we checked the cache. It was an unnecessary optimization that did not take into account how large the cache could become later on, usually in production, causing a bottleneck. Typically, implementing a cache gives your application better performance, but since this one was poorly implemented, it slowed performance down, and the fix was to remove it. It is also worth noting that when adding a cache is desirable, it’s better to go higher up the stack and use something like Nginx, which specializes in caching.

We have shown how to use autocannon to easily add load to your app, and 0x to visualize the stack and identify potential bottlenecks in the event loop. These easy-to-use tools can help you measure your app’s performance and visually find bottlenecks.

nearForm Open Source Repos:

autocannon: https://github.com/mcollina/autocannon
0x: https://github.com/davidmarkclements/0x
Slow Rest API: https://github.com/nearform/slow-rest-api
