Solve the N+1 Problem Using DataLoader with Mercurius GraphQL
If you are using Fastify with Mercurius as the GraphQL adapter, you are probably looking for a solution to the N+1 problem. This article will show you how to solve it and speed up your GraphQL application.
If you are not using Fastify yet, you can read a Quick Start guide before reading this article.
What is the N+1 problem?
I could not find a short TL;DR (Too Long; Didn’t Read) explanation of the N+1 problem to recommend before continuing. So, I will explain it with a quick code example that we will fix later in this article.
Let’s see the N+1 problem in action.
First of all, we need an application up & running. Create a gql-schema.js file that will contain a simple GQL Schema string:
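The exact listing is not reproduced here, so the following is only a minimal sketch of what gql-schema.js could contain. The type and field names (Developer, Project, builtProjects, developers) are assumptions chosen to match the resolver discussed later in this article, not necessarily the original schema.

```javascript
// gql-schema.js (hypothetical sketch: type and field names are assumptions)
const schema = `
  type Developer {
    id: Int
    name: String
    builtProjects: [Project]
  }

  type Project {
    id: Int
    name: String
  }

  type Query {
    developers: [Developer]
  }
`

// In the real file you would export the string so that app.js can load it,
// for example with: module.exports = schema
console.log(schema.trim().split('\n').length, 'schema lines')
```

The important part is the builtProjects field: every Developer returned by the root query needs its own list of projects, which is where the extra queries come from.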
Let’s connect the previous schema to a new app.js file, where we will implement a Fastify + Mercurius application.
We will use an in-memory database to store the mock data. You can find the SQL data used for this article in the source code on GitHub.
Great, we are ready to start our application by running the node app.js command. Thanks to the graphiql: true option, we can open the GraphiQL interface at http://localhost:3001/graphiql.
From the GraphiQL interface, we can run the following query by hitting the Play button:
So far, so good! You should see the server’s output on the right side of the GraphiQL interface. However, if we look at the server’s logs, we can see that the server has executed 4 SQL queries:
As you can see, the queries are not optimized because we ran a query to fetch the projects for each developer instead of fetching all the projects in a single query.
Now you have seen the N+1 problem in action:
- 1: we run a root query to fetch the first data list
- +N: we run a query for each item of the previous list to fetch the related data
So, if we had 100 developers, we would run 101 queries instead of 2! Now that we have seen the problem, let’s solve it.
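To make the arithmetic concrete, here is a small self-contained sketch that counts simulated queries. It does not use Fastify, Mercurius, or a real database: the in-memory arrays and the select* helper names are assumptions made up for this illustration, and each helper call stands for one SQL statement.

```javascript
// In-memory stand-ins for the database tables (hypothetical data).
const developers = [
  { id: 1, name: 'Alice' },
  { id: 2, name: 'Bob' },
  { id: 3, name: 'Carol' }
]
const projects = [
  { id: 10, name: 'api', developerId: 1 },
  { id: 11, name: 'cli', developerId: 1 },
  { id: 12, name: 'docs', developerId: 3 }
]

let queryCount = 0

// Each function below simulates exactly one SQL statement.
function selectDevelopers () {
  queryCount++
  return developers
}
function selectProjectsByDeveloper (developerId) {
  queryCount++
  return projects.filter(p => p.developerId === developerId)
}
function selectProjectsByDevelopers (developerIds) {
  queryCount++
  return projects.filter(p => developerIds.includes(p.developerId))
}

// Naive approach: 1 root query + N child queries.
queryCount = 0
const naive = selectDevelopers().map(dev => ({
  ...dev,
  builtProjects: selectProjectsByDeveloper(dev.id)
}))
const naiveQueries = queryCount

// Batched approach: 1 root query + 1 query for all the children.
queryCount = 0
const devs = selectDevelopers()
const allProjects = selectProjectsByDevelopers(devs.map(d => d.id))
const batched = devs.map(dev => ({
  ...dev,
  builtProjects: allProjects.filter(p => p.developerId === dev.id)
}))
const batchedQueries = queryCount

console.log({ naiveQueries, batchedQueries })
```

With three developers the naive version issues four simulated queries, while the batched version issues two, and the two approaches return the same data.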
How to solve the N+1 problem?
The most common way to solve the N+1 problem is to use DataLoaders. The DataLoader allows you to batch and cache the results of your queries and reuse them when necessary.
Mercurius offers you two ways to use DataLoader:
- Loader: it is a built-in DataLoader-like solution that is quick to set up and use.
- DataLoader: it is the standard solution to the N+1 problem.
In this article, we are going to see both solutions and compare them.
Mercurius Loader in action
The loader feature is a built-in DataLoader-like solution that is quick to set up and use. A loader takes the place of the matching entry in Mercurius’ resolvers option.
Let’s see how to use it by optimizing the previous app.js example:
As you can see, we have replaced the resolvers.Developer.builtProjects function with the loaders one. The difference is that a loader receives an array of queries (one for each parent object returned by the parent query) instead of a single parent object. Mercurius will batch the queries and call the loader function only once.
In this new loader function, you can run a single query to fetch all the data you need and then you must return a positionally-matched array of results.
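As a rough illustration of that contract, here is a minimal sketch of a loader for Developer.builtProjects. The findProjectsByDeveloperIds helper and its rows are hypothetical stand-ins for a single SQL query, so the block runs without a database; the { obj } shape of each query entry follows the Mercurius loaders documentation.

```javascript
// Hypothetical data-access helper standing in for ONE SQL query such as
// "SELECT * FROM projects WHERE developerId IN (...)".
async function findProjectsByDeveloperIds (developerIds) {
  const rows = [
    { id: 10, name: 'api', developerId: 1 },
    { id: 11, name: 'cli', developerId: 1 },
    { id: 12, name: 'docs', developerId: 3 }
  ]
  return rows.filter(row => developerIds.includes(row.developerId))
}

const loaders = {
  Developer: {
    // `queries` holds one { obj, params } entry per parent Developer.
    async builtProjects (queries) {
      const developerIds = queries.map(({ obj }) => obj.id)
      const rows = await findProjectsByDeveloperIds(developerIds)

      // Group the rows by developer id.
      const byDeveloper = new Map()
      for (const row of rows) {
        if (!byDeveloper.has(row.developerId)) {
          byDeveloper.set(row.developerId, [])
        }
        byDeveloper.get(row.developerId).push(row)
      }

      // Return one result per query, in the same order as the input.
      return developerIds.map(id => byDeveloper.get(id) || [])
    }
  }
}
```

An object like this would be passed through the loaders option when registering Mercurius, for example app.register(mercurius, { schema, resolvers, loaders }). Note the fallback to an empty array: a developer without projects still needs an entry at its position, otherwise the results would shift and be assigned to the wrong parent.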
Pros:
- It is quick to set up and use.
- It is not necessary to pollute the context.
- It is managed by Mercurius.
- Clear separation of concerns between the resolvers and the loaders.
Cons:
- It is not possible to reuse the loader’s cache in other resolvers.
DataLoader in action
DataLoader is the standard solution to the N+1 problem. It was originally created by Facebook. Let’s see how we can integrate it into our application.
First, you should restore the app.js file by removing the loaders configuration. Second, we need to install the dataloader package:
Finally, we must instantiate a DataLoader for each request, so we need to extend the context object:
In this new example, we have added the new projectsDataLoader object to the Mercurius context. This object is an instance of the DataLoader class that we have imported from the dataloader package.
The DataLoader class accepts a batchLoader function that will be called only once for each batch of queries. It supports different ways to accumulate the queries:
- Frame of execution: it is the default behavior. It accumulates the queries until the next tick. It is the same approach used by Mercurius Loader.
- Time frame: it accumulates the queries until a configured time window has elapsed.
The batchLoader function receives an array of keys as a single argument, and it must return an array of results positionally matching the input array. As you can see, it is the same approach used by Mercurius Loader.
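To show the mechanics without installing anything, the sketch below uses a deliberately simplified stand-in class, TinyLoader, in place of the real DataLoader. It is an assumption made for illustration only and is not the dataloader package: it covers just the "collect keys during the current tick, call the batch function once, match results by position" behavior. The buildContext and loadProjectsByDeveloperIds names and the sample rows are also hypothetical.

```javascript
// Simplified stand-in for the DataLoader class (NOT the real package).
class TinyLoader {
  constructor (batchFn) {
    this.batchFn = batchFn
    this.queue = [] // pending { key, resolve, reject } entries
    this.cache = new Map() // key -> promise, so repeated keys are reused
  }

  load (key) {
    if (this.cache.has(key)) return this.cache.get(key)
    const promise = new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject })
      // Schedule a single dispatch when the first key of a batch arrives.
      if (this.queue.length === 1) process.nextTick(() => this.dispatch())
    })
    this.cache.set(key, promise)
    return promise
  }

  async dispatch () {
    const batch = this.queue
    this.queue = []
    try {
      const results = await this.batchFn(batch.map(item => item.key))
      // Results must line up with the keys, index by index.
      batch.forEach((item, index) => item.resolve(results[index]))
    } catch (err) {
      batch.forEach(item => item.reject(err))
    }
  }
}

let batchCalls = 0

// Batch function: receives every key at once and answers positionally.
async function loadProjectsByDeveloperIds (developerIds) {
  batchCalls++
  const rows = [
    { id: 10, name: 'api', developerId: 1 },
    { id: 11, name: 'cli', developerId: 1 },
    { id: 12, name: 'docs', developerId: 3 }
  ]
  return developerIds.map(id => rows.filter(row => row.developerId === id))
}

// A fresh loader per request keeps cached data from leaking across requests.
function buildContext () {
  return { projectsDataLoader: new TinyLoader(loadProjectsByDeveloperIds) }
}

const resolvers = {
  Developer: {
    builtProjects: (parent, args, context) =>
      context.projectsDataLoader.load(parent.id)
  }
}

// Simulate one request resolving builtProjects for three developers.
const context = buildContext()
const pending = Promise.all(
  [{ id: 1 }, { id: 2 }, { id: 3 }].map(dev =>
    resolvers.Developer.builtProjects(dev, {}, context))
)
pending.then(lists => console.log(lists.map(l => l.length), { batchCalls }))
```

In a real application you would replace TinyLoader with the DataLoader class from the dataloader package and build the context through the Mercurius context option. The resolver stays the same: it only calls load with the parent id and lets the loader decide when to run the batch.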
Pros:
- It is the de facto standard solution.
- Flexibility: it is possible to reuse the loader cache in other resolvers.
Cons:
- It requires more code to set up and configure: you need to create your own context to access the database.
- The resolvers must be aware of the DataLoader instance and use it.
You have now learned how to use DataLoaders with Mercurius by exploring two different solutions to the N+1 problem. You may think that mixing the resolvers and the loaders could be a good idea. It is certainly doable, but you must turn off one of the two caches to avoid inconsistencies, and it could be a bit confusing to manage.
If you have found this helpful, you may want to read other articles about Mercurius.
Now jump into the source code on GitHub and start to play with the GraphQL implemented in Fastify.
If you enjoyed this article, comment, share, and follow me on Twitter!