Sematext guest post: Pino – The Fastest Node.js Logger for Production

By: Guest Author

Guest post by Stefan Thies from Sematext. Thanks Stefan.

During the Node Interactive event in Amsterdam we had the pleasure of speaking with Matteo Collina from nearForm about the blazingly fast logger “Pino”. In this post we’ll share our insights into Pino and the additional steps required to have a reliable and fast log management solution for Node.js applications.

Let’s first summarize Matteo’s talk: The tool named “Pino” was developed by nearForm to increase the performance of Node.js applications by making logging as fast as possible. The idea is simple: the faster the logger the less CPU it uses, the more CPU time the app will have to serve e.g., HTTP requests. nearForm tuned Pino to be the fastest logger for Node.js, thus increasing throughput and reducing the response latency of their customers’ apps, which of course saves money for cloud computing resources!

Performance of Node.js logging frameworks.  Source: The cost of Logging

Performance of Node.js logging frameworks.
Source: The cost of Logging

Why is Pino so fast?

    • Optimized JS code, e.g. for serialisation and string operations
    • Pino streams logs in line-delimited JSON format to any given output stream (default: standard output stream)
    • Minimalistic functionality in the Pino core
      • Pino has no functionality for log rotation or compression
      • Pino itself does not include transport layers or format templates. Transports are separate modules, running in a separate process. For example:pino-elasticsearch reads from standard input and inserts Pino-formatted logs into Elasticsearch, e.g.:
        node app.js | pino-elasticsearch -i log_index -H elasticsearch-server -p 9200

What makes Pino so easy to use?

    • The Pino API and output format are compatible with Bunyan (popular logging framework). With very little code modification Bunyan users can switch to Pino! Code Example:
      var pino = require('pino')()
      pino.info('hello world')
      pino.error('this is at error level')
      pino.info('the answer is %d', 42)
    • Separate tools for transports connected via Unix pipes:
    • Plugins for the most popular web frameworks are available, logging request details including response times:

Awesome, speeding up logging for Node.js web frameworks! But what needs to be done to use this in production?
The minimalistic approach of Pino requires some additional steps to be done. Running additional Node.js tasks in separate processes helps utilize the CPU resources much better. For example, Pino log events could be streamed to Elasticsearch with pino-elasticsearch using 2 separate processes. If you don’t want to deal with maintaining your own Elasticsearch you can use Logsene- Hosted Elasticsearch for Logs and simply point pino-elasticsearch to it; just replace the IndexName with the token for your Logsene app.

node my-api-service.js | pino-elasticsearch --index IndexName -H https://logsene-receiver.sematext.com -p 443 -b 1000

This means there is no slowdown in the Node.js web server app bound to one CPU core (assuming no worker processed used), while the elasticsearch transport might be processed on the second core.

Pino-elasticsearch is designed for simplicity and plays well withunix pipes. Our tests did show thatpino-elasticsearch uses nearly constant memory (<100 MB) and low CPU <5% during our tests – a low overhead to stream logs in Elasticsearch!

However, in production we might face more complex situations, e.g.:
1. Network connections to Elasticsearch might become slow or they might break
2. Elasticsearch might reject requests when it gets overloaded

To avoid dropped logs in either situation it is safer to, in case of shipping failure, persist logs locally first and track the status of log forwarding for later transmission.

Writing logs to standard output is the “state-of-the-art” today because it fits well the logging model used by Docker, Upstart and Systemd. However, a far more reliable alternative is to, in case of shipping failure, persist logs locally first and track the status of log forwarding for later transmission to log management services. That’s why the best practice is to combine apps using Pino with existing tools:
1. Upstart and Systemd to collect standard output and error stream and write them to files
2. Log rotate or journald to handle log file rotation
3. Log shippers to forward logs securely and reliably to central log storage and indexing service where search and analysis takes place.

Note that if your application is deployed in a Docker container there are other ways of handling logs, for example by Docker logging drivers or agents like Sematext Docker Agent – probably the easiest way to deal with this issue! But as not everybody moved to Docker in production yet let’s have a look at typical Linux deployments.

We checked several logging setups of a restify server and compared logging with Bunyan and Pino on a server with 4 CPU cores and 8 GB RAM. Then we added a log shipper to forward logs to Elasticsearch/Logsene to see the impact on the performance of the restify server. The chart below shows that using restify-pino-logger instead of restify-bunyan-logger increased the number of HTTP requests per second by more than 27%. Adding a reliable log shipper in parallel, which used the second CPU core, slowed down the restify server by only 3%. This means a performance gain of 24% for the restify server itself (vs. bunyan logger) and having a reliable log shipper setup in parallel.

HTTP requests

Note: The move from Bunyan to Pino is really easy, as Pino API is compatible with Bunyan. We used the restify servers from https://github.com/pinojs/restify-pino-logger/tree/master/benchmarks for our tests.

 

Example: Linux deployment

Let’s walk through a complete Linux deployment of a simple Node.js service, one that uses the Pino Logger (API), upstart, logrotate and logagent to ship logs into Elasticsearch/Kibana.

Upstart service file for the Node.js server application:
We assume Node.js app logs with Pino to standard output/error stream, e.g. my-api-service/app.js:

var restify = require('restify')
var server = restify.createServer({name: 'app'})
server.use(require('restify-pino-logger')())
server.get('/', function (req, res) {
  res.send('hello world')
})

Upstart collects logs in /var/log/upstart/serviceName.log – a typical upstart config file looks like this:

#/etc/init/my-api-service.conf
start on filesystem and net-device-up IFACE!=lo
stop on runlevel[016]
respawn
respawn limit 5 60
kill timeout 30
setuid nobody
setgid nogroup
limit nofile 32768 32768
# Token for SPM Node.js Monitoring - https://sematext.com/spm/integrations/nodejs-monitoring
env SPM_TOKEN=XXXX 
env NODE_MONITORING="-r /usr/lib/node_modules/spm-agent-nodejs"
env NODE_ENV=production
env WEB_CONCURRENCY=2
env PORT=80
env NODE_OPTIONS="--max-old-space-size=250"
env APP_JS_FILE=/usr/lib/node_modules/my-api-service/app.js
chdir /tmp
exec /usr/bin/node $NODE_MONITORING $NODE_OPTIONS $APP_JS_FILE

The service needs to be enabled:
sudo initctl reload-configuration
sudo start my-api-service

Logrotate
Logrotate is a Linux service typically run once a day to rename, compress and delete old log files, while keeping the last N log files.

By default logrotate is configured to rotate files weekly:

> cat /etc/logrotate.conf 
# see "man logrotate" for details
# rotate log files weekly
weekly
# ...
# packages drop log rotation information into this directory
include /etc/logrotate.d

If you like to rotate logs daily (or by size) for the new service, then you have to create a new file in /etc/logrotate.d to define service specific rules:

/var/log/upstart/my-api-service {
     daily
    missingok
    rotate 8
    compress
    delaycompress
    notifempty
}

Log shipper setup for Elasticsearch
Log shippers can stream our /var/log/upstart/my-api-service.log log file to Elasticsearch. Most of the popular log shippers can deal with rotating log files and queue logs in local buffers when Elasticsearch is not available. Let’s use @sematext/logagent so we don’t need to install Ruby or Java on our Node.js server and because Logagent can deal with JSON logs in Pino/Bunyan format, plus because we want to make sure we don’t lose any logs if the destination is temporarily not reachable (Logagent will buffer logs locally and try to ship them later). Let’s also skip the Elasticsearch setup and just configure it to ship logs to Logsene, which is compatible with the Elasticsearch API:

sudo npm i -g @sematext/logagent
# setup with the index name / Logsene Token and a glob pattern to watch files
sudo logagent-setup LOGSENE_TOKEN ‘/var/log/**/*.log’

Done! Logagent runs as a Linux service, watches log files produced by Pino, and ships them reliably to Logsene, with local buffering and all.

If you prefer to use a different Elasticsearch server (e.g. your own) you must configure the Elasticsearch URL and index name in /etc/sematext/logagent.conf, e.g.:

output:
  elasticsearch:
    url: http://elasticsearch-server:9200
    index: my_api_service_logs

Then restart the service with sudo service logagent restart. You might want to check logagents status messages in /var/log/upstart/logagent.log :)

As soon as logs arrive in ELK, we can use Kibana (if you are using Logsene, you can use Kibana that is integrated in it) to analyze Express/Hapi/Restify logs created by Pino:

Wrap up …

Node.js Apps can easily benefit from Pino. Switching to Pino for logging will save precious CPU cycles typically spent on inefficient logging. The proper production deployment needs to include additional tasks like log rotation, and log forwarding to a centralized Log Management System to analyze the logs. Using the right tools and settings, a fast & reliable setup can be achieved quickly and easily.