Increasing the Performance of Node.js Applications
Guest post by Stefan Thies from Sematext. Thanks Stefan.
During the Node Interactive event in Amsterdam we had the pleasure of speaking with Matteo Collina from NearForm about the blazingly fast logger “Pino”.
In this post we’ll share our insights into Pino and the additional steps required to have a reliable and fast log management solution for Node.js applications.
Let’s first summarize Matteo’s talk: The tool named “Pino” was developed by NearForm to increase the performance of Node.js applications by making logging as fast as possible.
The idea is simple: the faster the logger the less CPU it uses, the more CPU time the app will have to serve e.g., HTTP requests. NearForm tuned Pino to be the fastest logger for Node.js, thus increasing throughput and reducing the response latency of their customers’ apps, which of course saves money for cloud computing resources!
Why is Pino so fast?
- Optimized JS code, e.g. for serialisation and string operations
- Pino streams logs in line-delimited JSON format to any given output stream (default: standard output stream)
- Minimalistic functionality in the Pino core
- Pino has no functionality for log rotation or compression
- Pino itself does not include transport layers or format templates. Transports are separate modules, running in a separate process. For example:pino-elasticsearch reads from standard input and inserts Pino-formatted logs into Elasticsearch, e.g.:
node app.js | pino-elasticsearch -i log_index -H elasticsearch-server -p 9200
What makes Pino so easy to use?
- The Pino API and output format are compatible with Bunyan (popular logging framework). With very little code modification Bunyan users can switch to Pino!
- Separate tools for transports connected via Unix pipes:
- Plugins for the most popular web frameworks are available, logging request details including response times:
Awesome, speeding up logging for Node.js web frameworks! But what needs to be done to use this in production?
The minimalistic approach of Pino requires some additional steps to be done. Running additional Node.js tasks in separate processes helps utilize the CPU resources much better. For example, Pino log events could be streamed to Elasticsearch with pino-elasticsearch using 2 separate processes.
If you don’t want to deal with maintaining your own Elasticsearch you can use Logsene- Hosted Elasticsearch for Logs and simply point pino-elasticsearch to it; just replace the IndexName with the token for your Logsene app.
This means there is no slowdown in the Node.js web server app bound to one CPU core (assuming no worker processes used), while the elasticsearch transport might be processed on the second core.
Pino-elasticsearch is designed for simplicity and plays well withunix pipes. Our tests did show that pino-elasticsearch uses nearly constant memory (<100 MB) and low CPU <5% during our tests – a low overhead to stream logs in Elasticsearch!
However, in production we might face more complex situations, e.g.:
- Network connections to Elasticsearch might become slow or they might break
- Elasticsearch might reject requests when it gets overloaded
To avoid dropped logs in either situation it is safer to persist logs locally first, in case of shipping failure, and track the status of log forwarding for later transmission.
Writing logs to standard output is “state-of-the-art” today because it fits the logging model used by Docker, Upstart and Systemd well.
That’s why the best practice is to combine apps using Pino with existing tools:
- Upstart and Systemd to collect standard output and error stream and write them to files
- Log rotate or journald to handle log file rotation
- Log shippers to forward logs securely and reliably to central log storage and indexing service where search and analysis takes place.
Note that if your application is deployed in a Docker container there are other ways of handling logs, for example by Docker logging drivers or agents like Sematext Docker Agent – probably the easiest way to deal with this issue! But as not everybody moved to Docker in production yet let’s have a look at typical Linux deployments.
We checked several logging setups of a restify server and compared logging with Bunyan and Pino on a server with 4 CPU cores and 8 GB RAM. Then we added a log shipper to forward logs to Elasticsearch/Logsene to see the impact on the performance of the restify server.
The chart below shows that using restify-pino-logger instead of restify-bunyan-logger increased the number of HTTP requests per second by more than 27%.
Adding a reliable log shipper in parallel, which used the second CPU core, slowed down the restify server by only 3%.
This means a performance gain of 24% for the restify server itself (vs. bunyan logger) and having a reliable log shipper setup in parallel.
Note: The move from Bunyan to Pino is really easy, as Pino API is compatible with Bunyan.
We used the restify servers from https://github.com/pinojs/restify-pino-logger/tree/master/benchmarks for our tests.
EXAMPLE: LINUX DEPLOYMENT
Let’s walk through a complete Linux deployment of a simple Node.js service. One that uses the Pino Logger (API), upstart, logrotate and logagent to ship logs into Elasticsearch/Kibana.
Upstart service file for the Node.js server application:
We assume Node.js app logs with Pino to standard output/error stream, e.g.
Upstart collects logs in
/var/log/upstart/serviceName.log – a typical upstart config file looks like this:
The service needs to be enabled:
Logrotate is a Linux service typically run once a day to rename, compress and delete old log files, while keeping the last N log files.
By default logrotate is configured to rotate files weekly:
If you like to rotate logs daily (or by size) for the new service, then you have to create a new file in
/etc/logrotate.d to define service specific rules:
Log shipper setup for Elasticsearch
Log shippers can stream our
/var/log/upstart/my-api-service.log log file to Elasticsearch. Most of the popular log shippers can deal with rotating log files and queue logs in local buffers when Elasticsearch is not available.
Let’s use @sematext/logagent so we don’t need to install Ruby or Java on our Node.js server and because Logagent can deal with JSON logs in Pino/Bunyan format; also, because we want to make sure we don’t lose any logs if the destination is temporarily unreachable (Logagent will buffer logs locally and try to ship them later). Let’s also skip the Elasticsearch setup and just configure it to ship logs to Logsene, which is compatible with the Elasticsearch API:
Done! Logagent runs as a Linux service, watches log files produced by Pino, and ships them reliably to Logsene, with local buffering and all.
If you prefer to use a different Elasticsearch server (e.g. your own) you must configure the Elasticsearch URL and index name in /etc/sematext/logagent.conf, e.g.:
Then restart the service with
sudo service logagent restart. You might want to check logagents status messages in
As soon as logs arrive in ELK, we can use Kibana (if you are using Logsene, you can use Kibana that is integrated in it) to analyze Express/Hapi/Restify logs created by Pino:
Node.js Apps can easily benefit from Pino. Switching to Pino for logging will save precious CPU cycles typically spent on inefficient logging. The proper production deployment needs to include additional tasks like log rotation, and log forwarding to a centralized Log Management System to analyze the logs. Using the right tools and settings, a fast & reliable setup can be achieved quickly and easily.
More information on Pino