Terraform File Organization

At Unbounce, we have recently started using Terraform to create our AWS infrastructure. It is working well for our needs, but it does have some sharp edges, which is expected of a young project. The other configuration tool we use is CloudFormation; we provide its templates to the software development teams because its commands are fairly simple and its documentation is plentiful. Until we can offer a comparably good user experience with Terraform, we will keep it to our internal infrastructure projects. That said, we have put together some best practices for working with Terraform and keeping its scripts organized.

Directory Structure

For this article we'll work with a mythical project, "widget-maker", though its scripts are based on a real project. The project creates an auto-scaling group that sits behind a load balancer, all held within a VPC. Here is its directory structure as it stands:

widget-maker$ ls -a
.git
.gitignore
Makefile
autoscaling.tf
load_balancers.tf
main.tf
networking.tf
authorization.tf
variables.tf
production-us-west-1.tfstate
production-us-west-1.tfplan

Use a Makefile

We use a Makefile as an orchestration tool. The Makefile knows how to use Terraform to plan and apply every facet of the application. This presents a familiar user experience for all team members, and it resolves some issues we have faced using Terraform with AWS.

Interacting with the Makefile is incredibly easy. The following shows how to ensure the infrastructure is consistent across every environment and region the application supports.

widget-maker$ make plan
... verify that the plan looks okay for each region supported ...
widget-maker$ make apply

Below is the entire Makefile. In the subsequent sections, I explain the reasoning behind each of the architectural decisions you will see.

.DEFAULT_GOAL := help
.PHONY: help plan apply deps plan-prod-us-west-1 apply-prod-us-west-1

TERRAFORM_BIN := terraform
APP_ENV := production
REGION := us-west-1

help:
  @echo "Builds the widget maker infrastructure"
  @echo ""
  @echo "Targets:"
  @echo "  apply  Commits the plan against the infrastructure"
  @echo "  deps   Ensures system requirements are met"
  @echo "  help   This message"
  @echo "  plan   Builds a new Terraform plan file"

deps:
  @hash $(TERRAFORM_BIN) > /dev/null 2>&1 || \
    (echo "Install terraform to continue"; exit 1)
  @test -n "$(AWS_ACCESS_KEY_ID)" || \
    (echo "AWS_ACCESS_KEY_ID env not set"; exit 1)
  @test -n "$(AWS_SECRET_ACCESS_KEY)" || \
    (echo "AWS_SECRET_ACCESS_KEY env not set"; exit 1)

plan: plan-prod-us-west-1

apply: apply-prod-us-west-1

plan-prod-us-west-1:
  $(TERRAFORM_BIN) plan \
    -var="aws_access_key=$(AWS_ACCESS_KEY_ID)" \
    -var="aws_secret_key=$(AWS_SECRET_ACCESS_KEY)" \
    -var="environment=$(APP_ENV)" \
    -var="aws_region=$(REGION)" \
    -state="$(APP_ENV)-$(REGION).tfstate" \
    -out="$(APP_ENV)-$(REGION).tfplan"

apply-prod-us-west-1:
  $(TERRAFORM_BIN) apply \
    -state="$(APP_ENV)-$(REGION).tfstate" \
    $(APP_ENV)-$(REGION).tfplan

Support for Multiple Regions

Many Terraform scripts I see for AWS infrastructure only support one region at a time, exposing the region in a .tfvars file. This works until multiple regions need to be supported: a developer must then update their .tfvars file, specify a new state file and plan file, and perform the plan and apply steps without clobbering the other region.

If your application has multi-region failover (it does, doesn't it?), you need a better solution. We extract the region into a variable provided by make so the app can be deployed to multiple regions. We also use variable mappings keyed on the region.

# variables.tf
variable "vpc_id" {
  default = {
    us-west-1 = "vpc-12345"
    us-west-2 = "vpc-67890"
  }
}

# autoscaling.tf
...
  vpc_id = "${lookup(var.vpc_id, var.aws_region)}"
...

Also of note: Terraform does not support multiple AWS regions in a single configuration at this time. Our solution is the only way we've found to adequately work around this limitation while keeping our infrastructure code modular.
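
To add a region, we add matching plan and apply targets and wire them into the aggregate plan and apply targets. Here is a sketch of what that could look like using a GNU Make pattern rule; the REGIONS list and the plan-prod-% rule are illustrative, not the targets from our real Makefile:

REGIONS := us-west-1 us-west-2

plan: $(addprefix plan-prod-,$(REGIONS))

# $* expands to the region matched by the % stem, e.g. us-west-2
plan-prod-%:
  $(TERRAFORM_BIN) plan \
    -var="aws_access_key=$(AWS_ACCESS_KEY_ID)" \
    -var="aws_secret_key=$(AWS_SECRET_ACCESS_KEY)" \
    -var="environment=$(APP_ENV)" \
    -var="aws_region=$*" \
    -state="$(APP_ENV)-$*.tfstate" \
    -out="$(APP_ENV)-$*.tfplan"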

Reduce Usage of tfvars

The .tfvars file contains any secrets that should not appear in the rest of the Terraform scripts, and it is kept out of the git repository by listing it in the .gitignore file. Because the file is not in version control, we concluded that .tfvars is not a codified piece of the puzzle; whatever it contains becomes tribal knowledge that is lost when a new person joins. We decided to move all variable values either into the Makefile to be specified at runtime (e.g. REGION) or into an environment variable loaded into the user's shell (e.g. AWS_ACCESS_KEY_ID). The deps target in the Makefile ensures that the environment variables exist and are set.
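
Loading the credentials into the shell is a one-time step per developer, for example (the key values here are placeholders):

# e.g. in the developer's ~/.bash_profile; values are placeholders
export AWS_ACCESS_KEY_ID="AKIAEXAMPLE"
export AWS_SECRET_ACCESS_KEY="secret-example"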

With this done, every component of the Terraform script's configuration is codified, and new team members need little to no tribal knowledge.

Explicitly State AWS Credentials

Our provider block in our main.tf looks like this:

provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region = "${var.aws_region}"
}

We explicitly state the AWS credentials in the provider block because we don't want to assume the environment variables are available, and because the caller may want to swap in a more restricted set of AWS credentials.
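
Because make inherits the caller's environment, a restricted account can be swapped in for a single run without touching any files (the key values here are hypothetical):

widget-maker$ AWS_ACCESS_KEY_ID="AKIARESTRICTED" \
    AWS_SECRET_ACCESS_KEY="restricted-secret" \
    make plan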

In the variables.tf file we declare the AWS credentials without defaults, which means Terraform will complain if they are not set when plan is run.

# variables.tf
variable "aws_access_key" {
  description = "The AWS access key id credential"
}

variable "aws_secret_key" {
  description = "The AWS secret access key credential"
}

Remove Constants and Literals

Looking through our old Terraform scripts, we realized we had put numerous literal constants into the resources. We split everything we could out into variables in the variables.tf file, each with a sane default. Overuse of literals is a code smell: the intent of a bare value is hard to discern. By turning it into a variable, you give the constant its own name and can provide more detail in the description component of the variable declaration.

Two side benefits of this approach are that you can (a) override any of these variables at runtime, and (b) get a clearer picture of which variables the infrastructure pivots on.
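
As a hypothetical example, a bare "m3.medium" literal inside a launch configuration becomes a named, documented variable:

# variables.tf
variable "instance_type" {
  description = "The EC2 instance type for widget-maker workers"
  default = "m3.medium"
}

# autoscaling.tf
...
  instance_type = "${var.instance_type}"
...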

Keep main.tf Short

Some Terraform scripts put everything inside the main.tf file, resulting in a massive file no one can fully understand. This is the same as putting all of your code into a single shell script. Terraform makes it easy to modularize your code: it concatenates all the .tf files in a directory before building a .tfplan file.

We only use the main.tf file for specifying the AWS provider. Everything else is split into a separate .tf file corresponding to its function. In our case, anything related to autoscaling (auto-scaling groups, instance profiles, launch configurations) is split into autoscaling.tf.

This lets anyone understand the scope of a file at a glance, and it keeps any change contained to a small changeset.
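
To give a sense of what lives where, here is a sketch of an autoscaling.tf; the resource names and the ami_id, instance_type, and subnet_id variables are illustrative, not from the real project:

# autoscaling.tf (illustrative sketch)
resource "aws_launch_configuration" "widget_maker" {
  image_id = "${lookup(var.ami_id, var.aws_region)}"
  instance_type = "${var.instance_type}"
}

resource "aws_autoscaling_group" "widget_maker" {
  name = "widget-maker-${var.environment}"
  launch_configuration = "${aws_launch_configuration.widget_maker.name}"
  vpc_zone_identifier = ["${lookup(var.subnet_id, var.aws_region)}"]
  min_size = 2
  max_size = 4
}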

Give Proper Names to State Files

Don't use the default terraform.tfstate file name. It is too generic for any environment-specific or region-specific infrastructure. Use a name indicative of the thing you are provisioning.

For any piece of our infrastructure, we identify the variables on which it pivots, and we name the state files after them. In the case of widget-maker we knew it would need to support multiple regions and environments, which is why both pivots appear in the file name: production-us-west-1.tfstate.

Don't Include a destroy Target

When building long-lived infrastructure, you want to make the creation path easy but make the user jump through hoops to destroy anything. These aren't necessarily difficult hoops; we simply don't want to automate the destruction of long-lived resources, especially when destroying one is a rare occurrence. This reduces the possibility of a costly mistake.
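
When a teardown is genuinely required, the terraform binary can still be run by hand, passing the same variables the Makefile would otherwise supply, for example:

widget-maker$ terraform destroy \
    -var="aws_access_key=$AWS_ACCESS_KEY_ID" \
    -var="aws_secret_key=$AWS_SECRET_ACCESS_KEY" \
    -var="environment=production" \
    -var="aws_region=us-west-1" \
    -state="production-us-west-1.tfstate"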