Terraform management at scale: Terragrunt or Terraspace?
Infrastructure as code has become a standard in the IT industry over the past several years, especially within highly dynamic cloud environments. Among different tools (either general purpose or proprietary ones dedicated to working with a given public cloud provider) Terraform is one of the most widely used.
Thanks to its gentle learning curve, simple but powerful syntax and extensibility (knowledge of Go and of the target API is enough to develop a custom provider when the broad set of ready-to-go ones falls short), it is a common choice for more and more teams.
The rich ecosystem of supporting tools like infracost, among others, allows developers to maintain a high velocity without sacrificing the highest code quality standards.
When Terraform Meets a Wall
Unfortunately Terraform by itself is not a silver bullet and has some flaws too. Some of them are more painful than others, but adding them all up makes the development and maintenance of a pure Terraform project harder.
The most important issue is management of complex architectures built from multiple stacks (business-oriented units of deployment constructed from low-level and reusable modules) and deployed across multiple environments. Unfortunately trying to achieve that with pure Terraform ends up with code duplication within the project source code.
Also, there is no convenient way to execute commands against multiple stacks at once which leads us to cumbersome deployment scripts traversing the project directory tree in the strictly defined order.
Another flaw is Terraform’s backend management. Keeping the state locally is good for proofs of concept, but enterprise-grade projects with multiple collaborating team members make a remote backend with a locking mechanism a must-have.
Unfortunately, Terraform itself does not offer a convenient way to create the backing resources (e.g. on AWS, an S3 bucket for state storage and a DynamoDB table for locks) automatically during project init. A developer has to create them either manually or with another infrastructure-as-code stack managed by shell scripts or other tools.
Last but not least, in complex infrastructure projects involving multiple stacks, dependency management between them becomes painful. Apply order must be managed manually and values must be passed either via a terraform_remote_state data source or through some kind of parameter store (the producer stack stores its outputs there and makes them available for further reads from consumer stacks).
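As a rough illustration of the first option, a consumer stack could read a producer stack’s outputs as sketched below. This is plain Terraform and assumes an S3 backend; the bucket name, state key, region and the vpc_id output are hypothetical placeholders, not values taken from the project discussed here.

```hcl
# Consumer stack: reads outputs published by a separately applied "network" stack.
# Bucket, key, region and the vpc_id output are hypothetical.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "network/terraform.tfstate"
    region = "eu-west-1"
  }
}

locals {
  # Resolves only after the producer stack has been applied and its state exists,
  # which is why the apply order has to be controlled by hand.
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```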
The lack of CLI commands for working with multiple stacks at once requires a developer to traverse the project source code and run the commands within particular directories in a particular order, which makes deployment parallelization a challenging task.
The Ones Offering a Helping Hand
Now that we’ve defined some of the flaws, let’s take a look at potential remedies.
A quick Google search gives us two potential answers to our challenges: one of them is Terragrunt, the other is Terraspace.
Terragrunt is a fairly mature tool, first released in 2016 and backed by Gruntwork. It is just a wrapper around the Terraform binary that focuses only on solving the problems described in the previous section while keeping the experience close to pure Terraform. It does not force anything on the developer but offers a set of recommendations and features that make day-to-day life a lot easier.
On the other hand, Terraspace is a new kid on the block. Development began in 2020 and it is backed by the BoltOps company. Terraspace describes itself as a “Terraform framework” which is far more than just a wrapper.
This definition fits nicely. Besides a convenient way to work with complex infrastructure projects, Terraspace offers a strictly defined project structure and a richer set of extensions thanks to its tight integration with the Ruby language.
Let’s get ready to rumble
Having two options on the table, let’s define a couple of aspects we would like to compare and take an in-depth look at each.
The following table presents the selected properties and characteristics with a brief description of what both Terragrunt and Terraspace offer within those categories.
| Aspect | Terragrunt | Terraspace |
|---|---|---|
| Automated project creation (directories and backing resources) | | |
| Project directories structure | Not enforced, recommendations available | Enforced by the tool itself |
| Multiple environments handling | Multiple directories for different environments (each stack defined separately for a given environment) | Controlled by setting up the relevant environment variable and dedicated variables files; stack definitions not duplicated for different environments |
| Local/global variables handling | Variables defined on different project levels and imported when needed | Variables defined on either global or stack level, resolved using a layering mechanism without explicit imports |
| Working with multiple stacks and handling dependencies | Dedicated blocks for dependency definitions and mock possibilities | Ruby expressions defining dependencies (e.g. outputs usage) with mocking possibilities |
| External/3rd party modules handling | module source syntax | module source syntax; additionally (and completely optionally) one can define a Terrafile file in order to make module use consistent across stacks |
| Testing | Not built-in, Terratest is a recommendation for writing test cases | Integrated testing capabilities based on Ruby’s RSpec |
| Extensions and hooks | Available before/after Terraform commands | Multiple hooks on different levels and custom extensions based on Ruby code |
| Debugging of generated Terraform code | Possible by verifying | Possible by verifying |
Deep Dive in Detail
Let’s focus on the details and take a deeper look under the hood of each of the aforementioned aspects in the table above.
Automated Project Creation (directories and backing resources)
Both of the tools offer a convenient way to initialize state management backing resources.
Terragrunt is configured within terragrunt.hcl and customized with a really basic set of helper functions.
Terraspace uses regular *.tf files placed inside the /config directory and offers a richer group of functions/placeholders that can later be dynamically resolved by Ruby’s templating engine.
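To make the difference more tangible, here is a minimal sketch of both backend definitions for AWS. The bucket and table names, the region and the exact file locations are assumptions made for illustration; path_relative_to_include() is one of Terragrunt’s helper functions, and expansion(...) is a Terraspace helper resolved by the templating engine before Terraform runs.

```hcl
# Terragrunt: root terragrunt.hcl (bucket, table and region names are hypothetical)
remote_state {
  backend = "s3"

  # Writes a backend.tf file into each stack so the stack code stays backend-agnostic.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "example-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "example-terraform-locks"
  }
}
```

```hcl
# Terraspace: config/terraform/backend.tf (assumed location, names are hypothetical)
terraform {
  backend "s3" {
    bucket         = "<%= expansion('terraform-state-:ACCOUNT-:REGION-:ENV') %>"
    key            = "<%= expansion(':REGION/:ENV/:BUILD_DIR/terraform.tfstate') %>"
    region         = "<%= expansion(':REGION') %>"
    encrypt        = true
    dynamodb_table = "terraform_locks"
  }
}
```

In both cases the state key is derived from the stack’s location, so a single backend definition can serve every stack in the project.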
Project Directories Structure
This is the first aspect from the above list that sees the tools follow completely different philosophies.
Terragrunt does not require any strict project structure. It offers recommendations for how the code should be grouped on different abstraction layers but does not force anything on the team:
- Low-level modules with reusable components without any business logic, should be stored within a separate Git repository (to allow independent release processes) and written in pure Terraform.
- Stacks constructed using not only “raw” resources but also earlier defined modules should be written in pure Terraform and represent different layers or components of the system’s architecture (e.g: backend and frontend layers as separate stacks). For the same reason as modules, stacks should be defined within a separate Git repository.
- There should also be a “live” repository where stacks are combined with each other to define business-oriented systems. This is the layer where Terragrunt enters (with its *.hcl files) to ease the configuration, management and deployment processes.
This level of elasticity can be treated as both advantageous (highly customizable flows) and problematic (each project might be constructed in a slightly different way, making it hard to accommodate) at the same time.
On the other hand, Terraspace enforces a very strict and predefined directory structure. It follows the same module and stacks approach but is focused on keeping everything in a single Git repository (although it’s not a strict requirement). This way all Terraspace projects are pretty similar and easy to understand. All the building blocks and configurations have predictable locations which make things easier to work with.
The source code that can be found in a Terraspace project is a mix of Terraform (*.tfvars files) and Ruby’s *.rb files (mostly configuration, custom extensions and tests).
Multiple Environments Handling
Terragrunt expects multiple directories to be created for different environments. In order to avoid duplication, the configuration within the recommended _env directory should contain all common aspects of the stacks used to create an environment (e.g. the source of the stack, common variables, etc.). Such a file can be included later on within the terragrunt.hcl file of a particular stack deployed on a particular environment.
The CLI commands have to be executed in each environment root directory to point to the one we would like to currently work with.
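A minimal sketch of that arrangement could look as follows. The directory layout (live/_env and live/dev/backend), the repository URL and the variable names are hypothetical; include, inputs and get_terragrunt_dir() are regular Terragrunt constructs.

```hcl
# live/_env/backend.hcl: everything the "backend" stack shares across environments
terraform {
  source = "git::git@github.com:example-org/stacks.git//backend?ref=v1.0.0"
}

inputs = {
  instance_type = "t3.micro"
}
```

```hcl
# live/dev/backend/terragrunt.hcl: the same stack as deployed to the dev environment
include "env" {
  path = "${get_terragrunt_dir()}/../../_env/backend.hcl"
}

# Environment-specific values are merged with the inputs of the included file.
inputs = {
  instance_count = 1
}
```

Every additional environment repeats only the small per-environment file, which is the unavoidable part of the duplication.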
On the other hand, Terraspace does not require a developer to reflect different environments with explicit directories that each maintain a specific stack configuration (it’s all about the /app folder and the set of stacks defined there).
The desired environment is selected with the TS_ENV variable preceding each Terraspace command.
To customize a stack configuration, the tfvars directory should be used: a file dedicated to a given environment (defined by the TS_ENV value) lives there and defines the variables required by the Terraform code.
Local/global variables handling
Terragrunt allows a developer to declare variables on the different levels of the directories structure in order to avoid code duplications. It can be problematic to find inputs for the stack (because different variables can be set on different levels of the directories’ hierarchies) but at the end of the day, this approach is predictable and easy to get comfortable with.
*.hcl files can be imported using the include block while specifying the path. With this approach, inputs are propagated automatically.
When it comes to locals, the expose attribute has to be explicitly set to make them visible.
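The difference between inputs and locals can be sketched like this. The aws_region local and the file layout are hypothetical; find_in_parent_folders() and the expose attribute are part of Terragrunt’s include mechanism.

```hcl
# Parent terragrunt.hcl somewhere up the directory tree
locals {
  aws_region = "eu-west-1"
}

inputs = {
  project = "example" # propagated to including configurations automatically
}
```

```hcl
# Stack-level terragrunt.hcl
include "root" {
  path   = find_in_parent_folders()
  expose = true # without this, include.root.locals is not accessible
}

inputs = {
  region = include.root.locals.aws_region
}
```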
Terraspace relies on *.tfvars files when it comes to variable definitions. Global variables can be defined within config/terraform/globals.auto.tfvars (only the file location is relevant, the filename is not) and all the others are defined within the stacks’ tfvars directories.
Terraspace uses the concept of layering. The values in tfvars/base.tfvars are used across all environments for the given stack. If there is a need to override some variables for a given environment, this can be done by adding them within the file dedicated to that environment.
Besides regular literals, a set of Ruby helper functions and expandable templates can be used for values of variables. This allows for a lot of flexibility when it comes to defining global variables, which can change their values depending on the environment, without code duplication.
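Layering could be sketched as below, assuming a hypothetical demo stack with two variables. The file names follow the base/environment convention described above, and expansion(...) is used here as an example of a Terraspace helper that is resolved before Terraform reads the file.

```hcl
# app/stacks/demo/tfvars/base.tfvars: applies to every environment
instance_type = "t3.micro"
name          = "<%= expansion('demo-:ENV') %>"
```

```hcl
# app/stacks/demo/tfvars/prod.tfvars: layered on top of base.tfvars when TS_ENV=prod
instance_type = "m5.large"
```

With TS_ENV=prod the stack would receive the overridden instance_type, while name still comes from the base layer and changes its value with the environment.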
Working with multiple stacks and handling dependencies
In Terragrunt there is a convenient run-all <action> command responsible for executing an action against multiple stacks at once.
Dependencies across multiple stacks are declared using dependency blocks within the stack’s terragrunt.hcl file. The most important part of a dependency definition is the config_path attribute, which points to the stack that is depended on.
Outputs that are not yet available when a command is run (e.g. the first plan execution) can be mocked as part of the dependency definition.
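A dependency with mocked outputs might look like the sketch below. The vpc stack, its vpc_id output and the mock value are hypothetical; mock_outputs and mock_outputs_allowed_terraform_commands are attributes of Terragrunt’s dependency block.

```hcl
# terragrunt.hcl of a stack that consumes outputs of a sibling "vpc" stack
dependency "vpc" {
  config_path = "../vpc"

  # Placeholder values used while the real outputs do not exist yet.
  mock_outputs = {
    vpc_id = "vpc-00000000"
  }

  # Restrict the mocks to commands that do not change real infrastructure.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}
```

Because the dependency is declared explicitly, Terragrunt can work out the apply order on its own when commands are run against multiple stacks.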
Terraspace offers a pretty similar set of convenient commands which allow a developer to interact with multiple stacks at once.
Dependencies across multiple stacks (on the output level) can be defined using Ruby’s templating, by referring to the producing stack and one of its outputs by name (for instance, a stack called common-tags exposing an output called tags).
To mock outputs that are not yet available when a command is run (e.g. the first plan execution), one can use the mock attribute in the dependent output definition.
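As a sketch, a consumer stack’s variables file could reference the tags output of the common-tags stack as shown below. The backend stack name, the file path and the empty-map mock value are assumptions made for illustration.

```hcl
# app/stacks/backend/tfvars/base.tfvars (hypothetical consumer stack)
# 'common-tags' is the producing stack, 'tags' is one of its outputs.
# The mock value is used while the real output is not available yet.
tags = <%= output('common-tags.tags', mock: {}) %>
```

The reference doubles as the dependency declaration, so no separate block listing the stacks in order is needed.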
External/3rd Party Modules Handling
Terragrunt uses the same syntax as module sources in pure Terraform. External modules are downloaded to the .terragrunt-cache directory to prevent further unnecessary downloads.
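For example, a stack’s terragrunt.hcl could point at a module from the public Terraform Registry as sketched below. The module and version are picked purely for illustration; Git and local path sources follow the usual Terraform forms.

```hcl
# terragrunt.hcl: the module is fetched once and cached under .terragrunt-cache
terraform {
  # tfr:// is Terragrunt's notation for Terraform Registry sources.
  source = "tfr:///terraform-aws-modules/vpc/aws?version=3.5.0"
}

inputs = {
  name = "example-vpc"
  cidr = "10.0.0.0/16"
}
```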
On the other hand, Terraspace offers a mechanism allowing a developer to declare external module usage within the Terrafile file. External/3rd party modules are downloaded (using the terraspace build command) to the local vendor/modules directory.
Once they are in place, they can be put under the project’s version control (if that’s the team’s will) and used like any other locally defined modules. When it comes to usage, there is no distinction between 3rd party modules and the ones defined locally in the app/modules directory. What’s more, all of them are referenced from the stack using the same ../../modules/<module_name> paths (the framework works out on its own whether they should be looked up within vendor/modules).
At first glance, this behavior might seem unintrusive, but it causes issues with code completion and IDE support (module paths don’t refer to the exact location of the vendor modules therefore they’re considered as missing and attributes suggestions just don’t work). On the other hand, having them downloaded locally means the team doesn’t have to worry about their future existence in public registries.
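The path convention can be illustrated with a short sketch. The storage stack, the s3 module and its bucket input are hypothetical; the point is only that the source path has the same shape whether the module lives in app/modules or was downloaded to vendor/modules.

```hcl
# app/stacks/storage/main.tf (hypothetical stack)
variable "bucket_name" {
  type = string
}

module "bucket" {
  # Same path form for local and vendored modules. For a vendored module,
  # app/modules/s3 does not exist in the source tree, which is what confuses IDEs.
  source = "../../modules/s3"

  bucket = var.bucket_name
}
```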
Testing
Terragrunt itself does not offer testing capabilities at all because it’s just a Terraform wrapper. The external (although developed by the same company) Terratest tool must be integrated into the project in order to verify the code in an automated manner. This Go library offers integration with not only pure Terraform, but also Terragrunt.
Terratest brings a rich set of helper functions that help control the lifecycle of the infrastructure under test (e.g. destroy), as well as a rich set of assertions and helper methods focusing on created resources (e.g. checking whether an S3 bucket was created with a proper name). If that’s not enough, it’s also possible to integrate a cloud provider SDK which can be used to verify other details.
There is no strict requirement on where the test code should be located, but standard practice is to create a common test directory within stacks and put all the test cases there. Such a setup makes it possible to run all the tests using a single go test call.
On the other hand, Terraspace offers testing capabilities out of the box. The implementation relies on one of the most popular Ruby testing frameworks, called RSpec.
Testing classes can be generated using the convenient CLI generators (terraspace new test <name> --type <type>) and their location is strictly defined (within the directory of the module or stack under test). A major downside is that it does not offer a way to run multiple tests at once, so each has to be executed separately.
Besides the test execution inconvenience, the RSpec testing framework offers a set of helper functions that makes Terraform configuration and lifecycle control easier. Although there is no out-of-the-box assertion kit checking created resources there’s nothing to stop one from integrating a cloud provider SDK within Ruby code and using it to perform verifications.
Extensions and hooks
Both Terragrunt and Terraspace offer a concept of hooks that allows developers to react to particular events or commands. Additionally, Terraspace allows for extensions to be written in Ruby and later on used in the Terraform code.
Terragrunt, as just a Terraform wrapper, only offers one way to run a particular script before/after a particular Terraform command (apply etc.). Because the code is actually executed within a temporary directory in .terragrunt-cache, a set of helper functions must be used in order to work with the directory structure in a predictable manner.
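A minimal sketch of such hooks is shown below. The hook names and the scripts/notify.sh script are hypothetical; before_hook, after_hook and get_terragrunt_dir() are standard Terragrunt constructs.

```hcl
# terragrunt.hcl
terraform {
  before_hook "announce" {
    commands = ["plan", "apply"]
    execute  = ["echo", "Running Terraform"]
  }

  after_hook "notify" {
    commands = ["apply"]
    # get_terragrunt_dir() resolves to the directory holding this terragrunt.hcl,
    # not to the temporary working directory inside .terragrunt-cache.
    execute      = ["bash", "${get_terragrunt_dir()}/scripts/notify.sh"]
    run_on_error = true
  }
}
```

Referencing the script through a helper function rather than a relative path is what keeps the hook working regardless of where the command is actually executed.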
On the other hand, the Terraspace framework offers a much richer set of possibilities. Besides the hooks reacting before/after Terraform command execution, custom functions written in Ruby can be implemented and integrated with the Terraform code.
This gives a developer a lot of flexibility and customizability in code generation. Unfortunately mixing HCL with Ruby is not recognized by any IDE yet so any syntax completion features won’t work smoothly.
Debugging of generated Terraform code
Both of the tools offer a way to verify and check the generated Terraform code.
For Terragrunt the pure configuration can be found under the
For Terraspace it is in
Both of them consist of the modules, stacks and resolved variables in *.tf files and give developers the possibility to debug either before applying or when something is not working as expected.
So which one should I choose?
Like always: it depends. The following table outlines the most important pros and cons of each one.
Terragrunt

| Pros | Cons |
|---|---|
| Fairly easy to understand and reason about | Not heavily opinionated, which can lead to far-from-optimal project structures |
| Convenient way of testing using a nicely integrated 3rd party tool (with the Go language, which is commonly used across the DevOps world) | Testing has to be added externally |
| Just a wrapper, does one thing but does it right | Some code duplication that can’t be avoided (multiple environments) |
| Backed by a well-known commercial company (Gruntwork) | |

Terraspace

| Pros | Cons |
|---|---|
| Framework with a richer set of features | 3rd party modules handling causes issues with IDEs |
| Can be a nice entry point for people not experienced with Terraform at scale | No way to run multiple test cases at once |
| Opinionated and with a strictly defined project structure | Mixing Ruby code with Terraform causes issues with IDEs |
| Highly extendable with extensions written in Ruby | |
Personally, Terragrunt seems to be my tool of choice when it comes to working with Terraform projects at scale.
From my perspective, it just does one thing and does it right. It focuses on making the work with Terraform more convenient, and it’s not trying to be overblown with features (like putting Ruby’s templates into Terraform code) because it’s just a wrapper, not a whole framework trying to provide many more additional features.
Obviously, it requires some expertise from a developer (especially when it comes to project structure definition and Terragrunt configuration file syntax) but seems to be focused on just making it easier to work with complex stacks and handling dependency between them.
It is also worth mentioning that its maturity and adoption seem to be much greater than Terraspace’s. Of course, this should be compared and validated over the next few years because Terraspace is a much younger player in the market.
On the other hand, Terraspace can be an especially good fit for beginners or application developers who lack experience when it comes to working with enterprise-grade infrastructure code.
Of course, because it’s a framework it helps any other development team who expects a predictable project configuration controlled by established conventions. It offers a strict structure of a project, convenient CLI and code generators that improve productivity.
However, some decisions might feel a bit overcomplicated for developers used to working only with plain Terraform code (especially dependency management and optional Ruby code injection to extend functionality). One can say that all of those are optional, but in my opinion, you’re not selecting a whole framework just to use the well-established and predictable directories structure.
The choice is up to you: the good thing is that both of them will help you a lot in your day-to-day work and you definitely will appreciate going with one or the other instead of pure Terraform when working on a complex project.
The code quoted within this post can be found within this GitHub repository.