Editor’s note: In 2016 NearForm Chief Software Architect, Matteo Collina, wrote a popular post about the Cost of Logging which introduced the newly released Pino logger to the world. 5 years, 238 releases, 8.6k stars in github and millions of downloads later, we felt it was time to update some of the information from that original post. Below you will find updated information and links to Pino resources.

The Cost of Logging in Modern Applications

Nowadays, all applications use logs to record everything that happens in a request/execution, however, choosing the right log library is essential if you are looking for high performance. In some cases, choosing an unoptimized library can reduce the application throughput by 80%!

This blog post is an updated version of “The cost of logging – 2016”.

Essentially, it is necessary to establish a baseline for any performance optimisation. It will serve as a comparison for the rest of the  benchmarks you will see in this post.

All the snippets were grouped in this repository.

Measuring Throughput

Consider a simple express application:

Copy to Clipboard

Baseline measurement without logging

In order to measure this, we will use autocannon, an HTTP/1.1 benchmarking tool inspired by wrk and built in Node.js:

Copy to Clipboard
autocannon benchmark showing 676k requests in 60.02s and 161MB of throughput

Logging with Bunyan

Let’s see what happens when we add logging. In all of the following examples, we will be redirecting the output from stdout to /dev/null:

Copy to Clipboard

We would expect this to have similar throughput to the first one. However, we got some surprising results:

Copy to Clipboard
logging benchmark with bunyan reading 235k requests in 60.02s, 55.9 MB read

Running a server using bunyan for logging will reduce throughput by almost 70%.

Logging with Winston

Winston yields slightly better results:

Copy to Clipboard
Copy to Clipboard
logging with winston benchmark that reads 361k requests in 60.02s, 85.9 MB read

Running a server using Winston (version 2) for logging will slow you down by almost 50%.

Introducing Pino

Our logger is Pino, which means pine in Italian. Matteo Collina, the author of Pino got the inspiration for the name from a pine tree in front of his house.

Pino is the fastest logger for Node.js (full benchmarks found here and here). While you can use Pino directly, you will likely use it with a Web framework, such as express, hapi or fastify. Pino is (99%) API-compliant with Bunyan and Bole.

The Anatomy of Pino

Pino is a fast logger built on several principles:

1) Low overhead. Using minimum resources for logging is very important. Log messages tend to get added over time and this can lead to a throttling effect on applications.

2) It only logs in JSON.

3) It needs to be as fast as possible.

4) It supports child loggers, to allow transactions to have their own metadata.

5) Drop-in compatible API with bunyan, bole, winston wherever possible.

Instantiating Pino is easy:

Copy to Clipboard

Pino was originally created at NearForm (see the old blog post for further details) and it was designed to be a fast JSON logger.

Using Pino with Express

Let’s consider the following example using express-pino-logger:

Copy to Clipboard

We can run the same benchmark as before:

Copy to Clipboard
logging benchmark with pino and express reading 492k requests in 60.02s, 117 MB read

Using express-pino-logger increases the throughput of express-winston by almost 40%.

Asynchronous Logging

Most of the logging libraries available define synchronous logging as the standard mode. The messages are directly written to the output stream as they are generated with a blocking operation.

Asynchronous logging works by buffering the log messages and writing them in larger chunks.

This can improve your application performance by executing the I/O operations in a different context. Asynchronous logging has consistently lower latency than synchronous loggers.

In the Pino.js Benchmarks you can see the comparison between both approaches:

The following values show the time spent to call each function 100.000 times.

Pino Average: 246.336ms

PinoAsync Average: 129.507ms

Asynchronous and Synchronous Logging Benchmarks with Fastify

To bring this information to reality, we can use Fastify, which comes bundled with Pino by default, as a benchmark example between asynchronous and synchronous logging on HTTP servers.

Copy to Clipboard

Synchronous logging:

synchronous logging benchmark with fastify and pino reading 1766k requests in 60.02s, 309 MB read

Asynchronous logging:

asynchronous logging benchmark with fastify and pino reading 2110k requests in 60.02s, 369 MB read

We get around 20% more throughput when using Asynchronous logging.

A Real-world Logging Scenario

Nonetheless, a raw benchmark would not show the importance of the async approach. For that reason, let’s move it to a real-world scenario:

Let’s assume that your application is serving a huge quantity of requests, and each one of those requests is logged at least twice (start request and end request).

Limitations of the Synchronous Approach

In the synchronous approach, the logging waits for the write to complete before executing the next operation. This could potentially compromise the performance of your application.

example of blocked application when running pino in synchronous mode

This is the most common issue when dealing with synchronous operations, but, let’s look at one more complicated issue.

The EAGAIN Error

Usually, those logs are redirected to a file descriptor and in that situation, it is not uncommon that the memory buffer reserved for the file descriptor becomes full.

What usually happens in that situation?

The EAGAIN error is emitted by the operating system and it means the resource (file descriptor) is not available now and prompts users to try again later.

Example memory allocation of file descriptor in the memory heap when there is available memory for writingExample when EAGAIN is thrown by system operational which blocks the I/O operation on synchronous logging

When dealing with synchronous communication, it will lead to a blocking operation and obviously a bottleneck.

Flow illustration when EAGAIN is thrown to a sync log. It makes the application wait for X milliseconds until the buffer memory is free

Avoiding Bottlenecks with pino async

In this situation, the pino async can avoid major headaches, because it will not block the I/O when an error occurs, instead, it will keep the log in the buffer until the file descriptor is ready to write.

Flow illustration when EAGAIN is thrown to an async log the application doesn’t need to wait until the buffer memory is free, instead, it stores the information to a memory buffer

It is important to understand that if the file descriptor takes too much time to free, the buffer could grow the application memory until an out-of-memory is thrown.

For that reason, Pino allows you to manage the buffer queue through settings in `sonic-boom` (retryEAGAIN) and you can limit the queue to avoid out-of-memory through the maxLength option.

Hence, in a real-world comparison, asynchronous logging might be essential to your architecture.

Besides the above explanation, I strongly recommend checking the “Welcome to Pino@7.0.0” blog post. It mainly explains the `pino.transport()` feature (necessary to work with asynchronous logging).

The Pino family

Pino has a set of companion tools, each of which has been meticulously implemented to have a minimal impact on performance:

  • sonic-boom: Extremely fast utf8 only stream implementation
  • thread-stream: A streaming way to send data to a Node.js Worker Thread.
  • fast-redact: Pino’s redaction functionality is built on top of fast-redact.
  • fastify: The Fastify web framework comes bundled with Pino by default.
  • safe-stable-stringify: Safe, deterministic and fast serialization alternative to JSON.stringify.

The Pino Ecosystem

For further information check the Pino Ecosystem.

Pino is scalable

Pino is the fastest logger in town (see the benchmarks), and it uses a full set of performance optimisations to achieve this goal, from avoiding JSON.stringify, to rolling our own printf style formatting module.

These techniques reduce the overhead of logging by 40%.

In extreme mode, we also delay flushing to disk, to reduce it even more, by 60%.

We have adopted Pino in some projects, and it has increased the throughput (req/s) of those applications by a factor of 20-100%. Stay tuned for more blog posts on the techniques that Pino uses to be so fast, as each of those has its own story.

Editor’s note: Matteo Collina also contributed code review, examples and guidance on this post.

Don’t miss a beat

Get all the latest NearForm news,
from technology to design.
View all posts  |  Technology  |  Business  |  Culture  |  Opinion  |  Design
Follow us for more information on this and other topics.