State Encryption with OpenTofu

Ned Bellavance

Wondering how to encrypt your state data using OpenTofu? Then this is the post for you!

If you’d prefer your information in video format, check out the Terraform Tuesday video instead.

Introduction

For this post, we are going to return once again to OpenTofu. If you’re not familiar with OpenTofu and how it differs from Terraform, then you may want to check out the blog post I’ve been maintaining that compares the two.

In this post, I wanted to dig into the data encryption feature that was introduced with OpenTofu 1.7. We’ll look at the feature’s history, what problem it is attempting to solve, and how to add or remove encryption on your state data and plan files.

If all you care about is the actual nuts and bolts, jump to this section, but I do think you need to consider whether this feature is right for you first. That requires some context and history.

Why Encrypt Your State Data?

Before OpenTofu launched, one of the most popular requests on the Terraform list of open issues was to support encryption of state data and plan files at rest. And the issue languished for years, despite some possible solutions being put forth. HashiCorp, as maintainers of the Terraform repository, argued that using Terraform to encrypt state data directly was not a sound idea in practice and plan files should be ephemeral. Rather than implement a solution they chose to leave the issue open.

One of the big promises of OpenTofu was that this feature would finally be added and the issue resolved. And in OpenTofu 1.7 it was! That’s quite an accomplishment and I have to give the OpenTofu team their flowers for delivering. I also feel I should mention that this feature is not compatible with Terraform and you would have to remove encryption if you wanted to migrate to Terraform or have a third-party tool interact with your state data or plan files directly.

Why were people asking for this feature? Mostly because state data and plan files can contain sensitive information that you don’t want others to have access to. That could be obvious stuff like API keys, passwords, and tokens, but it could also be the less obvious fact that anyone who gets a copy of your state data or a plan file has a map of your deployed infrastructure. They might be able to use that map to develop a plan of attack or find weaknesses to be exploited.

Having OpenTofu encrypt state data and plan files at rest means that even if someone gains access to wherever your state data is stored, they still won’t be able to read it without the necessary keys.

What other options are there?

In defense of HashiCorp, there are lots of ways to encrypt state data at rest that do not rely on managing your own encryption keys. The most popular state backends all support encryption of state at rest and in transit, and allow you to apply robust access permissions.

Azure storage, for instance, encrypts data by default, and you can bring your own key for both data encryption and infrastructure encryption. You can apply access control through a number of mechanisms, whether it’s SAS tokens, Azure AD RBAC, or attribute-based access control. If you want to know just how much you can lock down Azure storage, check out my video on the topic.

So aside from adding another layer of complexity and more keys to manage, what does this additional encryption facility buy you? Well, if you don’t trust the backend you’re using to properly store and protect your state data, then encrypting state with OpenTofu removes the onus from the platform and places it squarely on you. Especially if you’re using some generic HTTP backend for state, you may want to manage the encryption yourself.

It also provides a way for you to encrypt your state data if it’s being stored locally. I know we should all be storing our state on a remote backend, but that’s not always the reality. You could use another encryption solution to manage encryption at rest on your filesystem, but with this feature you can rely on the encryption provided directly by OpenTofu itself.

The option to encrypt plan files is the real benefit in my estimation. When you save a plan file for review and execution, it contains a ton of information, including planned changes, new values, and updated outputs. If that plan file falls into the wrong hands, even if the plan itself is out of date, an attacker will have more than enough information to launch an attack on your infrastructure. And that’s assuming the API keys weren’t passed directly as input variables.

Again, there are existing ways to protect against this. Depending on the state backend you’re using, you could stash your plan files next to your state data. The same encryption and security you’ve applied to your state would now apply to those plan files. Or you could stash them on any respectable object-based storage and apply the necessary encryption and permissions. However, that’s not baked into Terraform in any way. That’s something you would need to automate yourself or get from a TACO platform.

All that being said, if I had to pick a winning use case for OpenTofu’s encryption feature, it’s encrypting plan files. Let’s see how it all works!

Encrypting State Data With OpenTofu

First let’s lay down some terminology and syntax. The encryption settings for state and plan data can be configured in two ways: using an encryption block inside of the terraform block, through the environment variable TF_ENCRYPTION, or a combination of the two. You can use either JSON or HCL to describe the encryption settings, though I think you already know my general feelings on JSON 🤢.

Encryption requires two components, a key provider and a method. The key provider defines where OpenTofu is getting the key material to perform the encryption, and the method describes how the key is used to encrypt the data. Wow, it’s like right there in the name isn’t it?

The method is used to apply encryption to state data and plan files. State and plan encryption is managed separately, so you can choose to use one form of encryption for state and another for plan. Or you can choose to apply encryption to one and not the other. Hooray for options!
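To make that last option concrete, here’s a minimal sketch of a configuration that encrypts plan files and leaves state alone. The block syntax is walked through later in this post. The plan_only labels and the passphrase are placeholders I made up for illustration, and I’m assuming that leaving out the state block simply leaves state unencrypted.

```hcl
terraform {
  encryption {
    # Hypothetical labels and passphrase, for illustration only.
    key_provider "pbkdf2" "plan_only" {
      passphrase = "replace-me-with-a-long-passphrase"
    }

    method "aes_gcm" "plan_only" {
      keys = key_provider.pbkdf2.plan_only
    }

    # Only saved plan files are encrypted.
    # Assumption: with no state block present, state is written unencrypted.
    plan {
      method = method.aes_gcm.plan_only
    }
  }
}
```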

So what kind of key providers are available right now? As of writing there are four:

  • pbkdf2 - is used to generate a key locally using a passphrase. The passphrase is stored with the configuration or injected with the TF_ENCRYPTION environment variable.
  • aws_kms - uses the AWS KMS service for the encryption key.
  • gcp_kms - uses the Google Cloud KMS service for the encryption key.
  • openbao - uses the transit engine in OpenBao for the encryption key.

And if you’re wondering what the hell OpenBao is, it’s the open source fork of HashiCorp Vault. The OpenBao key provider will also work with Vault versions under the MPL license, which is 1.14 or older.

Azure’s Key Vault is not yet on the list, but I’m sure that’s being worked on right now. Plus, this is open source, so if your preferred encryption service isn’t on the list, you can implement it yourself.

There are only two methods available for encryption, and really there’s only one that actually encrypts things. aes_gcm uses, well, AES-GCM, which is an industry standard encryption process that uses symmetric keys to encrypt and decrypt data.

The other method is called unencrypted and is used for migration scenarios.

Let’s put this all into practice with a simple example. If you want to follow along, check out this directory on my Terraform Tuesdays repository.

Encrypting Locally Example

We’ll start with a basic example that encrypts data locally using the pbkdf2 key provider. Let’s take a look at the terraform.tf file. In here, I’ve got the encryption block nested inside the terraform block.

terraform {
  encryption {
    key_provider "pbkdf2" "passphrase" {
      passphrase    = "tacos-are-delicious-and-nutritious"
      key_length    = 32
      iterations    = 600000
      salt_length   = 32
      hash_function = "sha512"
    }

To define a key provider, I have a key_provider block type, which takes two labels. The first is the provider type, which I have set to pbkdf2 and the second is a name label called passphrase.

Inside the block I have some arguments that set the passphrase, key length, and some other options. Many of these arguments have default values which you can go with instead of adding them explicitly.
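If you’d rather lean on those defaults, a minimal sketch of the same key provider could be as short as this. I’m assuming passphrase is the only argument you have to set yourself, so check the pbkdf2 key provider documentation for the current defaults before relying on them.

```hcl
# Fragment: belongs inside terraform { encryption { ... } }.
# Assumption: key_length, iterations, salt_length, and hash_function
# all fall back to OpenTofu's defaults when omitted.
key_provider "pbkdf2" "passphrase" {
  passphrase = "tacos-are-delicious-and-nutritious"
}
```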

Once I have my key provider, I need a method block to define which encryption method I’m using and which key to use for it.

    method "aes_gcm" "passphrase_gcm" {
      keys = key_provider.pbkdf2.passphrase
    }

The method block has two labels. The method type, which in my case is aes_gcm, and a name label, for which I’m using passphrase_gcm.

Inside the block, the only argument is the keys argument, which I set to the key provider block identifier.

To use this method in my configuration, I can specify a state and/or plan block.

    state {
      method = method.aes_gcm.passphrase_gcm
    }

    plan {
      method = method.aes_gcm.passphrase_gcm
    }

Neither of these block types takes labels. Inside each block, there is a single argument, method, which is set to the method identifier.
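Stitched together, the fragments above should add up to a terraform.tf along these lines. This is a reassembly of the pieces shown so far rather than a byte-for-byte copy of the file in the repository, so treat it as a sketch.

```hcl
terraform {
  encryption {
    # Where the key material comes from.
    key_provider "pbkdf2" "passphrase" {
      passphrase    = "tacos-are-delicious-and-nutritious"
      key_length    = 32
      iterations    = 600000
      salt_length   = 32
      hash_function = "sha512"
    }

    # How the key is used to encrypt data.
    method "aes_gcm" "passphrase_gcm" {
      keys = key_provider.pbkdf2.passphrase
    }

    # Apply the method to state data.
    state {
      method = method.aes_gcm.passphrase_gcm
    }

    # Apply the same method to saved plan files.
    plan {
      method = method.aes_gcm.passphrase_gcm
    }
  }
}
```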

I have both state and plan set, so any state data or plan files I create should be encrypted using this method. Here’s my main.tf file.

resource "local_file" "main" {
  content  = "Encrypt state and plan!"
  filename = "${path.module}/testplan.txt"
}

output "test" {
  value = local_file.main.filename
}

I’m simply creating a local file and an output set to that file name. I haven’t defined a state backend in this configuration, so we’ll be using the local file backend.

We’ll start by running tofu init as you normally would.

$ tofu init

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/local from the dependency lock file
- Installing hashicorp/local v2.5.1...
- Installed hashicorp/local v2.5.1 (signed, key ID 0C0AF313E5FD9F80)

Providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://opentofu.org/docs/cli/plugins/signing/

OpenTofu has been successfully initialized!

Now I’m going to run a plan and save the output to a file.

$ tofu plan -out plan.tfplan

...

Plan: 1 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + test = "./testplan.txt"

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 

Saved the plan to: plan.tfplan

Usually a plan file is in an unreadable binary form that you need to use tofu show to express as plaintext or JSON. What’s in the plan.tfplan file?

{
    "meta": {
        "key_provider.pbkdf2.passphrase": "eyJzYWx0IjoiR2drM3BtZlJpTWEycS9ET3BJNkJrZlgvSk5xZUV0eW9FMX..."
    },
    "encrypted_data": "s4s...cYDi9Q==",
    "encryption_version": "v0"
}

I truncated the immensely long strings, but basically we have a JSON file that defines the key provider, encrypted data, and encryption version. I’m assuming newer versions of the encryption process will be released, so encryption_version helps OpenTofu know what version encrypted this payload. The encrypted payload also appears to be base64 encoded, making it safe for HTTP transmission.

Can I still view the execution plan? Sure! If I run tofu show, I will get back the unencrypted contents of the plan.

$ tofu show plan.tfplan

OpenTofu used the selected providers to generate the following execution plan. Resource actions are indicated with the following
symbols:
  + create

OpenTofu will perform the following actions:

  # local_file.main will be created
  + resource "local_file" "main" {
      + content              = "Encrypt state and plan!"
      + content_base64sha256 = (known after apply)
      + content_base64sha512 = (known after apply)
      + content_md5          = (known after apply)
      + content_sha1         = (known after apply)
      + content_sha256       = (known after apply)
      + content_sha512       = (known after apply)
      + directory_permission = "0777"
      + file_permission      = "0777"
      + filename             = "./testplan.txt"
      + id                   = (known after apply)
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + test = "./testplan.txt"

The catch is that if I want to evaluate the plan information, I need to do so through OpenTofu. Third-party tools that read an execution plan directly will need OpenTofu to do the decryption for them.

Let’s apply this plan by running tofu apply:

$ tofu apply plan.tfplan
local_file.main: Creating...
local_file.main: Creation complete after 0s [id=857d8bfe0af9c5a506e28f3a48e2dad06427e4d6]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

test = "./testplan.txt"

Once everything is created, I now have a terraform.tfstate file in my current working directory. Looking at its contents:

{
    "serial": 1,
    "lineage": "400979ee-83b6-8cb0-3a04-70f6f885ce14",
    "meta": {
        "key_provider.pbkdf2.passphrase": "eyJzYW...GVuZ3RoIjozMn0="
    },
    "encrypted_data": "zoRtfr0Sa...dB6Rw==",
    "encryption_version": "v0"
}

It’s very similar to the plan file. The serial number and lineage are still in plaintext, so OpenTofu can still do a comparison with a stored plan without decrypting the file. Beyond that is the encryption information and then the state data payload. I can still interact with the state by using tofu commands, but I can no longer read the state file directly.

That’s a basic example, but it relies on setting a passphrase, and I don’t feel great about that in an automation setting. Let’s take a look at an example that uses AWS KMS and has existing state data I want to encrypt.

AWS KMS Example

Alright, let’s say I already have a deployed configuration and I want to add encryption on top of it. How do I migrate to encrypted state?

To start with, we need an instance of AWS KMS and an S3 bucket to hold our state data. Here’s a configuration that will create those resources:

provider "aws" {
  region = "us-west-2"
}

resource "random_integer" "bucket_suffix" {
  min = 10000
  max = 99999
}

data "aws_caller_identity" "current" {}

// Create a KMS key
resource "aws_kms_key" "tofu_key" {
  description              = "Tofu encryption key"
  enable_key_rotation      = true
  deletion_window_in_days  = 10
  key_usage                = "ENCRYPT_DECRYPT"
  customer_master_key_spec = "SYMMETRIC_DEFAULT"

  policy = jsonencode({
    Version = "2012-10-17"
    Id      = "key-default-1"
    Statement = [
      {
        Sid    = "Enable IAM User Permissions"
        Effect = "Allow"
        Principal = {
          AWS = data.aws_caller_identity.current.arn
        },
        Action   = "kms:*"
        Resource = "*"
      }
    ]
  })

}

module "terraform_state_backend" {
  source  = "cloudposse/tfstate-backend/aws"
  version = "1.4.1"

  force_destroy    = true
  bucket_enabled   = true
  dynamodb_enabled = true
  name             = "encrypted${random_integer.bucket_suffix.result}"
  environment      = "test"
  namespace        = "tofu"
}

output "bucket_name" {
  value = module.terraform_state_backend.s3_bucket_id
}

output "dynamodb_table_name" {
  value = module.terraform_state_backend.dynamodb_table_name
}

output "kms_id" {
  value = aws_kms_key.tofu_key.id
}

You might notice that the key usage is set to ENCRYPT_DECRYPT and the key spec is SYMMETRIC_DEFAULT. Those are your only options at the moment. OpenTofu is using symmetric encryption to protect the state and plan files. That means the same key is used for both encryption and decryption versus having a public/private keypair.

As of right now, you cannot select an RSA or ECC key spec because those are asymmetric, meaning that the private key never leaves AWS KMS. Tofu would have to send the encrypted data up to AWS KMS to perform a decrypt operation and then get the unencrypted data back. That’s not how state data encryption works at the moment.

For this example, let’s assume I’ve already deployed a configuration using the S3 bucket as a backend. My terraform.tf file looks like this:

terraform {
  backend "s3" {
    region  = "us-west-2"
    bucket  = "tofu-test-encrypted70785"
    key     = "terraform.tfstate"
    encrypt = true

    dynamodb_table = "tofu-test-encrypted70785-lock"
  }
}

And the main.tf is the same as the basic example:

resource "local_file" "main" {
  content  = "Encrypt state and plan again!"
  filename = "${path.module}/testplan2.txt"
}

output "test" {
  value = local_file.main.filename
}

Looking at the state data stored up in S3, it is unencrypted:

{
  "version": 4,
  "terraform_version": "1.7.2",
  "serial": 1,
  "lineage": "9b6fee0e-b62d-1599-4a36-058391b98fe1",
  "outputs": {
    ...
}

Now I want to apply encryption to state, but maybe I don’t want to hardcode the KMS key ID into the terraform block. I can use the TF_ENCRYPTION environment variable instead. That’s useful for an automation scenario where you want the same pipeline code to support multiple environments.
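For contrast, here’s a rough sketch of what the hardcoded version would look like, with the encryption block sitting next to the backend block inside terraform. It reuses the same key ID and settings as the rest of this example. The thing to notice is the terraform and encryption wrapper, which the environment variable version leaves out.

```hcl
terraform {
  backend "s3" {
    region  = "us-west-2"
    bucket  = "tofu-test-encrypted70785"
    key     = "terraform.tfstate"
    encrypt = true

    dynamodb_table = "tofu-test-encrypted70785-lock"
  }

  encryption {
    # The KMS key ID is hardcoded here, which is what the
    # TF_ENCRYPTION approach avoids.
    key_provider "aws_kms" "tofu" {
      kms_key_id = "62b4299c-0e4e-4323-98dc-e185d8dfe7b9"
      region     = "us-west-2"
      key_spec   = "AES_256"
    }

    method "aes_gcm" "tofu" {
      keys = key_provider.aws_kms.tofu
    }

    # Only needed while unencrypted state or plan files still exist.
    method "unencrypted" "tofu" {}

    state {
      method = method.aes_gcm.tofu

      fallback {
        method = method.unencrypted.tofu
      }
    }

    plan {
      method = method.aes_gcm.tofu

      fallback {
        method = method.unencrypted.tofu
      }
    }
  }
}
```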

Here’s a PowerShell script that uses a here-string (PowerShell’s version of heredoc syntax) to populate the TF_ENCRYPTION environment variable.

$env:TF_ENCRYPTION = @"
key_provider "aws_kms" "tofu" {
  kms_key_id = "62b4299c-0e4e-4323-98dc-e185d8dfe7b9"
  region = "us-west-2"
  key_spec = "AES_256"
}

method "aes_gcm" "tofu" {
  keys = key_provider.aws_kms.tofu
}

method "unencrypted" "tofu" {}

state {
  method = method.aes_gcm.tofu

  fallback {
    method = method.unencrypted.tofu
  }
}

plan {
  method = method.aes_gcm.tofu

  fallback {
    method = method.unencrypted.tofu
  }
}
"@

The content of the here-string is basically HCL with my encryption settings. That includes the key provider of type aws_kms with the key ID, region, and key_spec.

key_provider "aws_kms" "tofu" {
  kms_key_id = "62b4299c-0e4e-4323-98dc-e185d8dfe7b9"
  region = "us-west-2"
  key_spec = "AES_256"
}

The method is aes_gcm and the state and plan blocks are both using that method for encryption.

method "aes_gcm" "tofu" {
  keys = key_provider.aws_kms.tofu
}

But what’s this other method? Unencrypted?

method "unencrypted" "tofu" {}

This method is specifically for when you want to set up or remove encryption. Unencrypted simply means the method doesn’t encrypt data.

Looking in the state block:

state {
  method = method.aes_gcm.tofu

  fallback {
    method = method.unencrypted.tofu
  }
}

There is a fallback block that uses the unencrypted method. When we run a tofu plan, it wants to use the aes_gcm method for handling state, but the existing state data doesn’t match that method. Normally OpenTofu would stop here and error out. It would look at the state data, not find a matching encryption method, and halt.

The fallback block gives it a second method to try if the first one doesn’t pan out. Currently our state data is not encrypted, so the appropriate fallback method type would be unencrypted. When it writes the updated state data back out during an apply, it will use the aes_gcm method, and our data will be encrypted.

Let’s try it! After running the PowerShell script to set up my TF_ENCRYPTION environment variable, I’ll run an apply.

$ tofu apply -auto-approve
local_file.main: Refreshing state... [id=5d049985f49cb38a2ebf7c38f2b1b185cc65f7c7]

...

Apply complete! Resources: 1 added, 0 changed, 1 destroyed.

Outputs:

test = "./testplan2.txt"

Once the apply is complete, let’s take a look at the contents of our state data:

{
    "serial":2,"lineage":"9b6fee0e-b62d-1599-4a36-058391b98fe1",
    "meta":{
        "key_provider.aws_kms.tofu":"eyJjaX...MEE9PSJ9"
    },
    "encrypted_data":"0pOk5...0Zojj0=",
    "encryption_version":"v0"
}

It appears that the state data is now encrypted, and based on the metadata, it’s using AWS KMS for the key provider. Neat!
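Once the state has been rewritten with the new method, the fallback has done its job. A minimal sketch of the steady-state settings, assuming there is no unencrypted state or saved plan left to read, would drop the unencrypted method and both fallback blocks:

```hcl
# Body for TF_ENCRYPTION after the migration is complete.
key_provider "aws_kms" "tofu" {
  kms_key_id = "62b4299c-0e4e-4323-98dc-e185d8dfe7b9"
  region     = "us-west-2"
  key_spec   = "AES_256"
}

method "aes_gcm" "tofu" {
  keys = key_provider.aws_kms.tofu
}

# No fallback: state that is not encrypted with aes_gcm
# should now cause OpenTofu to error out instead of being read.
state {
  method = method.aes_gcm.tofu
}

plan {
  method = method.aes_gcm.tofu
}
```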

If I wanted to go back to the unencrypted format, I would swap the method and fallback method. OpenTofu will read in the state data using the fallback method aes_gcm and write it back using the unencrypted primary method. You can do the same thing to change key providers or encryption methods as more become available.
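As a sketch of that reverse migration, these are the same settings from the AWS KMS example with the primary and fallback methods swapped. Nothing here is new syntax. I’m assuming the state only gets rewritten in plaintext on the next apply that writes state, the same way the migration to encrypted state worked.

```hcl
# Body for TF_ENCRYPTION when removing encryption.
key_provider "aws_kms" "tofu" {
  kms_key_id = "62b4299c-0e4e-4323-98dc-e185d8dfe7b9"
  region     = "us-west-2"
  key_spec   = "AES_256"
}

method "aes_gcm" "tofu" {
  keys = key_provider.aws_kms.tofu
}

method "unencrypted" "tofu" {}

state {
  # New state is written without encryption...
  method = method.unencrypted.tofu

  # ...while the existing encrypted state can still be read.
  fallback {
    method = method.aes_gcm.tofu
  }
}

plan {
  method = method.unencrypted.tofu

  fallback {
    method = method.aes_gcm.tofu
  }
}
```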

Precaution and Warnings

State data is like, really, really important. If you lose access to your state data, you’ve got a long day ahead of you. Sure, import blocks and discovery tools make it a little easier, but it’s going to make for a bad time, especially if you’re using the same encryption key for multiple state data instances and you lose it. AWS KMS keys are cheap, so you should probably use a different key for each environment. There’s a tradeoff though. It creates another thing for your team to juggle, and if someone gains access to those AWS KMS keys, they can really wreck your day!

Second, although you’re protecting your state data and plan files from an attacker who somehow gained access to where they’re stored, you aren’t protecting your state data and plan files from someone who needs to work with the configuration on a regular basis. That person needs to have access to state data, and they can easily make an unencrypted copy of state data or a plan file using tofu state pull or tofu show. It’s not storing your data in a secure enclave that no one can ever access or see inside.

I think you need to ask yourself: is adding this level of encryption actually worthwhile? If you’ve secured your state backend properly, unauthorized people shouldn’t be able to get to your state data anyway. Whether you’re using an automation platform like GitLab or env0 or self-managing your state with S3 or Azure storage, you have the ability to lock down access and encrypt data at rest and in transit. Adding another layer of encryption might buy you a sliver of additional security, but is that worth the additional cost of managing keys and potentially losing access to state data? I can’t answer that for you. That’s something you’ll need to decide with your Security and Risk teams.

My personal feeling is that native encryption of state and plan files is not a huge boon to security, but it’s nice to have the option. And it’s a feather in the cap of OpenTofu that they can offer this feature for those who have been waiting forever to see it implemented. I’m curious to hear what you think. Will you be using this feature to encrypt your state or plan? Why or why not? Hit me up on LinkedIn or use the contact form.