Increasing the Performance of Node.js Applications
Guest post by Stefan Thies from Sematext. Thanks Stefan.
During the Node Interactive event in Amsterdam we had the pleasure of speaking with Matteo Collina from NearForm about the blazingly fast logger “Pino”.
In this post we’ll share our insights into Pino and the additional steps required to have a reliable and fast log management solution for Node.js applications.
Let’s first summarize Matteo’s talk: The tool named “Pino” was developed by NearForm to increase the performance of Node.js applications by making logging as fast as possible.
The idea is simple: the faster the logger the less CPU it uses, the more CPU time the app will have to serve e.g., HTTP requests. NearForm tuned Pino to be the fastest logger for Node.js, thus increasing throughput and reducing the response latency of their customers’ apps, which of course saves money for cloud computing resources!
Optimized JS code, e.g. for serialisation and string operations
Pino streams logs in line-delimited JSON format to any given output stream (default: standard output stream)
Minimalistic functionality in the Pino core
Pino has no functionality for log rotation or compression
Pino itself does not include transport layers or format templates. Transports are separate modules, running in a separate process. For example:pino-elasticsearch reads from standard input and inserts Pino-formatted logs into Elasticsearch, e.g.:node app.js | pino-elasticsearch -i log_index -H elasticsearch-server -p 9200
What makes Pino so easy to use?
The Pino API and output format are compatible with Bunyan (popular logging framework). With very little code modification Bunyan users can switch to Pino! Code Example:
var pino = require('pino')()
pino.error('this is at error level')
pino.info('the answer is %d', 42)
Separate tools for transports connected via Unix pipes:
Awesome, speeding up logging for Node.js web frameworks! But what needs to be done to use this in production?
The minimalistic approach of Pino requires some additional steps to be done. Running additional Node.js tasks in separate processes helps utilize the CPU resources much better. For example, Pino log events could be streamed to Elasticsearch with pino-elasticsearch using 2 separate processes.
If you don’t want to deal with maintaining your own Elasticsearch you can use Logsene- Hosted Elasticsearch for Logs and simply point pino-elasticsearch to it; just replace the IndexName with the token for your Logsene app.
This means there is no slowdown in the Node.js web server app bound to one CPU core (assuming no worker processes used), while the elasticsearch transport might be processed on the second core.
Pino-elasticsearch is designed for simplicity and plays well withunix pipes. Our tests did show that pino-elasticsearch uses nearly constant memory (<100 MB) and low CPU <5% during our tests – a low overhead to stream logs in Elasticsearch!
However, in production we might face more complex situations, e.g.:
Network connections to Elasticsearch might become slow or they might break
Elasticsearch might reject requests when it gets overloaded
To avoid dropped logs in either situation it is safer to persist logs locally first, in case of shipping failure, and track the status of log forwarding for later transmission.
Writing logs to standard output is “state-of-the-art” today because it fits the logging model used by Docker, Upstart and Systemd well.
That’s why the best practice is to combine apps using Pino with existing tools:
Upstart and Systemd to collect standard output and error stream and write them to files
Log rotate or journald to handle log file rotation
Log shippers to forward logs securely and reliably to central log storage and indexing service where search and analysis takes place.
Note that if your application is deployed in a Docker container there are other ways of handling logs, for example by Docker logging drivers or agents like Sematext Docker Agent – probably the easiest way to deal with this issue! But as not everybody moved to Docker in production yet let’s have a look at typical Linux deployments.
We checked several logging setups of a restify server and compared logging with Bunyan and Pino on a server with 4 CPU cores and 8 GB RAM. Then we added a log shipper to forward logs to Elasticsearch/Logsene to see the impact on the performance of the restify server.
Log shippers can stream our /var/log/upstart/my-api-service.log log file to Elasticsearch. Most of the popular log shippers can deal with rotating log files and queue logs in local buffers when Elasticsearch is not available.
Let’s use @sematext/logagent so we don’t need to install Ruby or Java on our Node.js server and because Logagent can deal with JSON logs in Pino/Bunyan format; also, because we want to make sure we don’t lose any logs if the destination is temporarily unreachable (Logagent will buffer logs locally and try to ship them later). Let’s also skip the Elasticsearch setup and just configure it to ship logs to Logsene, which is compatible with the Elasticsearch API:
sudo npm i -g @sematext/logagent
# setup with the index name / Logsene Token and a glob pattern to watch files
sudo logagent-setup LOGSENE_TOKEN ‘/var/log/**/*.log’
Done! Logagent runs as a Linux service, watches log files produced by Pino, and ships them reliably to Logsene, with local buffering and all.
If you prefer to use a different Elasticsearch server (e.g. your own) you must configure the Elasticsearch URL and index name in /etc/sematext/logagent.conf, e.g.:
Then restart the service with sudo service logagent restart. You might want to check logagents status messages in /var/log/upstart/logagent.log :)
As soon as logs arrive in ELK, we can use Kibana (if you are using Logsene, you can use Kibana that is integrated in it) to analyze Express/Hapi/Restify logs created by Pino:
Node.js Apps can easily benefit from Pino. Switching to Pino for logging will save precious CPU cycles typically spent on inefficient logging. The proper production deployment needs to include additional tasks like log rotation, and log forwarding to a centralized Log Management System to analyze the logs. Using the right tools and settings, a fast & reliable setup can be achieved quickly and easily.