While working on my newest Pluralsight course, Getting Started with Terraform Cloud, I learned a lot about how Terraform Cloud functions and the services it includes. As I was building out the demonstrations, I kept thinking about real-world environments and how you might go about organizing and managing Terraform Cloud in an SMB or a 10k-seat enterprise. That led me down a rabbit hole of using the tfe provider and Terraform Cloud to manage Terraform Cloud. Sounds confusing? It’s not! And I even went so far as to create a module to help you with the process.
Terraform Cloud (TFC) is a hosted service from HashiCorp that expands on the capabilities of Terraform OSS, including a graphical UI, managed state data, team-based access control, and integrations with other products. There are a few core constructs to understand when it comes to how TFC is managed and organized:
The central management unit in TFC is the organization. It contains workspaces, teams, users, Sentinel policies, variable sets, and more. When you sign up for a TFC account, you have the option to create an organization through the UI. From there you can create all the other resources I just mentioned, like teams and workspaces. I should mention, though, that to use teams you’ll need to either move up to a paid plan or start a 30-day trial.
When you’re just getting started, building out TFC by hand in the UI is fine. No big deal. But just like anything else in the world of Terraform and Infrastructure as Code, it’s better if you can shift to defining things declaratively. Naturally, there is a tfe provider available to configure both Terraform Cloud and Terraform Enterprise using Terraform.
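To make that concrete, here is a minimal sketch of what wiring up the tfe provider might look like. I’m assuming the API token is supplied through the TFE_TOKEN environment variable rather than written into the code, and I’ve left out a version constraint that you would want to pin in real use.

```hcl
terraform {
  required_providers {
    tfe = {
      # The provider is published in the public Terraform Registry
      source = "hashicorp/tfe"
    }
  }
}

provider "tfe" {
  # No arguments set here (assumption for this sketch):
  # - hostname defaults to app.terraform.io (Terraform Cloud)
  # - the token is read from the TFE_TOKEN environment variable
}
```

If you are pointing at Terraform Enterprise instead, you would set the hostname argument to your own instance.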
There are several benefits to using Terraform to configure TFC:
Basically, if you plan on running TFC at scale with tens or hundreds of workspaces, trying to manage through the UI will be a nightmare. So we are going to use Terraform and IaC to manage TFC.
But that raises a few questions.
The answer to the first question could simply be running Terraform OSS on your local desktop, but that’s no fun. You have TFC sitting there, just begging to be used. I suggest having a dedicated organization for configuring other organizations running in TFC. We’ll call it a Configuration Organization (CO). Access to the CO should be tightly controlled, and each managed org will have a dedicated workspace. You could compare this to an empty root domain in Active Directory or the management account for an AWS organization. Except there is no management hierarchy across organizations in TFC.
Each workspace in the CO will use the VCS workflow (more on that in a moment) to configure a Managed Organization (MO). Changes to a MO will be submitted through pull requests on a repository, and then applied when the PR is approved and merged. Since the workflow does not require anyone to log in to TFC (assuming you enable auto-apply on each MO workspace), the number of people that actually need to log into the CO will be minimal.
The best part is that due to the limited nature of the CO, you can stay on the Free tier of billing. It includes everything you’ll need. Unless of course you want to create teams and grant special access within the CO, but like I said, you should keep access to the CO down to just a few people. Like… five maybe? (That’s the max number of users in the free tier 😉.)
Probably the best place to store the configuration data and code is in a private repository in a version control system (VCS). All changes and updates can be pushed through the standard GitOps process I described earlier. TFC includes a VCS workflow that can be tied back to your VCS repositories. When a commit is made to a tracked branch, that can kick off a run on TFC to apply the change to the target organization. By default, the run will stop before the apply and wait for someone to approve it, but you can enable auto-apply to skip the approval. After all, someone has already reviewed and approved the proposed changes on the VCS side, right?
The VCS workflow supports many options for triggering a run. It can track the default branch and root directory, a specific branch, or a particular directory. I see three possible setups based on these options.
Each managed organization could be represented as a branch in your source control. Each branch would share the same basic Terraform code, but have a different configuration file to build out the MO associated with that branch. While it’s feasible, I would worry about the possibility of overwriting a branch through a merge and the difficulty of getting a clear picture of all your managed organizations.
An alternative is to have a default branch with a directory for each organization. The root folder would be empty, and each sub-directory would have the code and configuration files for an organization. When you want to make a change to an organization, you would simply update the configuration files in the corresponding directory. TFC would pick up on the change and start a run on the organization’s workspace in your CO. You could use the same code housed in a single directory for all the MOs, but you might want to experiment with changing the code on one organization before rolling it out to others. Chances are you won’t be running that many organizations simultaneously, so I don’t think it will be much of a burden keeping the code base the same across MO directories.
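As a rough sketch of the directory-per-organization layout, the workspaces in the CO could be defined something like this. Everything named here is hypothetical: the variables co_name, repo_identifier, oauth_token_id, and managed_orgs, as well as the "org-" workspace naming, are just placeholders I picked for illustration. The sketch also assumes each directory in the repository is named after the organization it manages.

```hcl
# Hypothetical inputs for this sketch
variable "co_name" {
  type        = string
  description = "Name of the Configuration Organization"
}

variable "repo_identifier" {
  type        = string
  description = "VCS repository in <owner>/<repo> form"
}

variable "oauth_token_id" {
  type        = string
  description = "ID of the OAuth token for the VCS connection in the CO"
}

variable "managed_orgs" {
  type        = set(string)
  description = "Managed organization names, matching directory names in the repo"
}

# One workspace in the CO per managed organization
resource "tfe_workspace" "managed_org" {
  for_each = var.managed_orgs

  name         = "org-${each.key}"
  organization = var.co_name

  # Run Terraform from the sub-directory for this organization
  working_directory = each.key

  # Skip the manual approval, since the PR was already reviewed
  auto_apply = true

  vcs_repo {
    # No branch argument, so the repository's default branch is tracked
    identifier     = var.repo_identifier
    oauth_token_id = var.oauth_token_id
  }
}
```

Each workspace would also need credentials for the organization it manages, which I haven’t shown here.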
The final option is to have a separate repository for each organization. If you’re worried about locking down who can make changes to each organization, then this is probably your best option. You can grant different permissions for each repository and restrict who can access the repository, make commits, and approve pull requests. It would be more work to manage and maintain multiple repositories, but you’d get the benefit of granular permissions.
Of course, this raises the question: how many organizations are you going to have anyway?
Chances are that even the biggest enterprises won’t need more than one or two organizations. There is effectively no limit on how many workspaces and teams you can have in an organization. With the teams-based permissions model, you can easily have multiple applications coexist in the same organization. Additionally, workspaces that are in the same organization can share their state data with each other, something that is not possible across organizations in TFC. (It is possible in Terraform Enterprise, but that’s not my focus here.)
There are three probable reasons for multiple organizations in a company:
For the first two situations, using a single repository to manage all your organizations works. The MSP is managing all the organizations, so it doesn’t need to worry about separating out permissions. Likewise, if it’s purely a billing issue, the organizations are probably still managed by a single dedicated team.
If your business units are dead set on administering their own organization, then separating each organization into its own repository will probably make the most sense.
Ultimately, it’s unlikely you’re going to have tens or hundreds of organizations at your company. Pick the option that minimizes your administrative overhead, while still meeting the business requirements.
The more important resources to worry about are workspaces and teams in each organization. That is what we want to manage programmatically.
The TFE (Terraform Enterprise) provider in the public Terraform Registry can configure most aspects of a TFC organization, including workspaces and teams. However, you still need to put the pieces together yourself. And that’s why I decided to write a module for managing an organization. The primary idea is that the module should be able to create the following:
My first idea was to craft complicated variable objects to store all this information and apply it. That quickly became a nightmare, and I realized the best thing to do was store the configuration data in JSON and parse it with Terraform. Here’s the abbreviated format:
{
  "workspaces": [
    {
      "name": "workspace_name",
      "description": "workspace description",
      "teams": [
        {
          "name": "team_name",
          "access_level": "access_level"
        }
      ],
      "terraform_version": "1.1.0",
      "tag_names": ["tag1"]
    }
  ],
  "teams": [
    {
      "name": "team_name",
      "visibility": "visibility_level",
      "organization_access": {
        "manage_policies": true,
        "manage_policy_overrides": true,
        "manage_workspaces": false,
        "manage_vcs_settings": false
      },
      "members": ["user_email_address"]
    }
  ]
}
The list of users can be extrapolated from the list of team members, so we really only need the teams and workspaces.
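Here is one way that extrapolation could look. This is a sketch rather than the exact code from my module, and the organization variable is a hypothetical input. It collects every member email across all teams, removes duplicates, and creates one organization membership per address.

```hcl
variable "config_file_path" {
  type        = string
  description = "Path to the JSON configuration file"
}

variable "organization" {
  type        = string
  description = "Name of the managed organization (hypothetical input)"
}

locals {
  org_data = jsondecode(file(var.config_file_path))

  # Gather the members of every team into one list, then convert
  # to a set so a user who is on several teams appears only once
  users = toset(flatten([
    for team in local.org_data.teams : team.members
  ]))
}

# Invite each unique user to the organization
resource "tfe_organization_membership" "user" {
  for_each = local.users

  organization = var.organization
  email        = each.value
}
```

Note that this version assumes every team in the JSON has a members field.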
Since there’s no sensitive information in the configuration data, it can be safely stored with the rest of the Terraform code. If you’re worried about the user email addresses being exposed, then you’ll need to add them some other way. Or you could skip creating users and simply create the teams you’d want users placed in.
With the information stored in JSON, I need to use it in Terraform. Importing JSON data is super easy with the jsondecode and file functions. Essentially the JSON is imported as a complex object, and I can use standard Terraform object reference syntax to extract information.
For instance, to get all the workspaces I simply do this:
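locals {
  # Decode the JSON configuration file into a Terraform object
  org_data   = jsondecode(file(var.config_file_path))
  # Pull out the list of workspace definitions
  workspaces = local.org_data.workspaces
}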
From there, it’s a simple matter of parsing the data with for_each loops for every resource that needs to be generated.
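For example, generating the workspaces might look roughly like the following. Treat it as a sketch under a couple of assumptions: the organization variable is a hypothetical input, and every workspace entry in the JSON includes all of the fields from the abbreviated format.

```hcl
variable "config_file_path" {
  type        = string
  description = "Path to the JSON configuration file"
}

variable "organization" {
  type        = string
  description = "Name of the managed organization (hypothetical input)"
}

locals {
  org_data = jsondecode(file(var.config_file_path))

  # for_each needs a map or set, so key each workspace by its name
  workspaces = {
    for ws in local.org_data.workspaces : ws.name => ws
  }
}

# One workspace per entry in the JSON file
resource "tfe_workspace" "this" {
  for_each = local.workspaces

  name              = each.value.name
  organization      = var.organization
  description       = each.value.description
  terraform_version = each.value.terraform_version
  tag_names         = each.value.tag_names
}
```

Keying the map by workspace name means that adding or removing an entry in the JSON only touches that one workspace, instead of shifting every instance the way a list index would.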
The JSON format is also easy to extend. If I want to add support for VCS connections or configuring policy sets for Sentinel, I can simply add new resources and update the JSON format. But what happens to older configuration files that don’t have the new fields? No problem!
Using the try function, I can normalize the JSON input in case the new JSON field hasn’t been added. For instance, if I add support for Variable Sets to the module, I can test for that field in the JSON input.
locals {
  json_data     = jsondecode(file("path_to_data.json"))
  # Fall back to an empty map when the field is missing
  variable_sets = try(local.json_data.variable_sets, {})
}
If the field variable_sets doesn’t exist in the JSON file, the try function will set the local value to an empty map. The for_each argument in the variable sets resource will see an empty map and not create any instances. We can use the same logic to create normalized structures from our JSON data. I’ve implemented that logic in the latest version of my module, so you only need to add the JSON fields that are necessary for your configuration.
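To give a sense of what a normalized structure could look like, here is a small sketch for teams. It is not lifted from the module. The fallback values are my own assumptions: "secret" for visibility (which I believe matches the provider default) and an empty list for members.

```hcl
variable "config_file_path" {
  type        = string
  description = "Path to the JSON configuration file"
}

locals {
  json_data = jsondecode(file(var.config_file_path))

  # Build a map of teams keyed by name, filling in any optional
  # field that is missing from the JSON with a fallback value
  teams = {
    for team in try(local.json_data.teams, []) : team.name => {
      visibility = try(team.visibility, "secret")
      members    = try(team.members, [])
    }
  }
}

# Expose the normalized structure so it can be inspected
output "normalized_teams" {
  value = local.teams
}
```

With this in place, the resources downstream can reference each.value.visibility or each.value.members without caring whether the field was present in the original file.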
Using try is pretty cool, and I have to admit I had no idea it even existed until I tried to figure out how to deal with missing values. Turns out I am not the first to encounter this problem. You’ll definitely see more about using try in a different post.
If you plan on adopting Terraform Cloud in your organization and you’d like to manage it programmatically, I hope I’ve laid out a compelling argument for using Terraform Cloud to manage Terraform Cloud. Through the mechanism of a Configuration Organization and workspaces using the VCS workflow, you can automate the management of your TFC organizations with a GitOps style process. To save yourself some time, consider checking out the module I wrote on the public registry and let me know what you think!