Step 2: Refactoring
One of the most important skills you can build when managing IaC is knowing how to refactor it. The code you write directly relates to infrastructure that delivers value to your team and wider organization, so knowing how to safely reorganize your code so that it’s easier to reuse and reason about, without incurring risk for the infrastructure you support, is invaluable.
Some of the most common reasons you might engage in this kind of refactoring include:
- Rewriting bespoke IaC as consumption of a reusable module so that you can repurpose the common IaC in other environments/projects.
- Standardizing IaC patterns for consistent application of security/cost best practices.
- Abstracting away the implementation details of one or more resources as a module so that you can focus on the higher level abstraction of how that module integrates with the rest of your infrastructure.
- Creating a generic module with a well defined API for a component in your infrastructure so that you can easily swap out the module with another module that shares a compatible (or close enough to compatible) API.
In this step, we’ll start going down the road of making our infrastructure components modular so that we are well prepared for the next step, when we introduce a secondary environment as a replica of the environment we provisioned in the last step.
Tutorial
The Gruntwork-recommended best practice for creating reusable IaC is to create a dedicated `catalog` directory (or a dedicated `catalog` repository) outside the `live` directory (or `live` repository) where reusable IaC patterns like OpenTofu/Terraform modules are stored.
To reorganize the resources that we’ve created so far into reusable modules, we’ll create a directory called `catalog/modules` where we can store our modules for reusability. We’ll create an OpenTofu module for each piece of high-level functionality that we are provisioning in our current environment (`s3`, `lambda`, `iam` and `ddb`).
```sh
mkdir -p catalog/modules/{s3,lambda,iam,ddb}
```
Now we can move the files that were provisioning these independent resources into their own modules so we can establish APIs for them and start reusing some of this code. It’s a pretty standard convention to name the core file used in a module `main.tf`. Good modules do one thing, and if you can’t figure out what a module does from its name, that’s probably indicative of an odd abstraction.
```sh
mv live/ddb.tf catalog/modules/ddb/main.tf
mv live/iam.tf catalog/modules/iam/main.tf
mv live/data.tf catalog/modules/iam/data.tf
mv live/lambda.tf catalog/modules/lambda/main.tf
mv live/s3.tf catalog/modules/s3/main.tf
```
The contents of some of these files need a little massaging, however, as the IaC didn’t have clear boundaries between the constituent components. Let’s fix that by providing an API for each of these modules in the form of variables for inputs and outputs for, well, outputs.
```hcl
# catalog/modules/ddb/vars-required.tf
variable "name" {
  description = "The name of the DynamoDB table"
  type        = string
}
```

```hcl
# catalog/modules/ddb/outputs.tf
output "name" {
  value = aws_dynamodb_table.asset_metadata.name
}

output "arn" {
  value = aws_dynamodb_table.asset_metadata.arn
}
```
```hcl
# catalog/modules/s3/vars-required.tf
variable "name" {
  description = "The name of the S3 bucket"
  type        = string
}
```

```hcl
# catalog/modules/s3/vars-optional.tf
variable "force_destroy" {
  description = "Force destroy S3 buckets (only set to true for testing or cleanup of demo environments)"
  type        = bool
  default     = false
}
```

```hcl
# catalog/modules/s3/outputs.tf
output "name" {
  value = aws_s3_bucket.static_assets.bucket
}

output "arn" {
  value = aws_s3_bucket.static_assets.arn
}
```
```hcl
# catalog/modules/iam/vars-required.tf
variable "name" {
  description = "The name of the IAM role"
  type        = string
}

variable "aws_region" {
  description = "The AWS region to deploy the resources to"
  type        = string
}

variable "s3_bucket_arn" {
  description = "The ARN of the S3 bucket"
  type        = string
}

variable "dynamodb_table_arn" {
  description = "The ARN of the DynamoDB table"
  type        = string
}
```

```hcl
# catalog/modules/iam/outputs.tf
output "name" {
  value = aws_iam_role.lambda_role.name
}

output "arn" {
  value = aws_iam_role.lambda_role.arn
}
```
For the `iam` module, we’re also going to need to make adjustments to the `main.tf` file to account for previous tight coupling between resources. The updates here take advantage of those new `s3_bucket_arn` and `dynamodb_table_arn` variables for message passing between modules, with their values exposed by the outputs of the `ddb` and `s3` modules.
```hcl
# catalog/modules/iam/main.tf
resource "aws_iam_role" "lambda_role" {
  name = "${var.name}-lambda-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "lambda.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_policy" "lambda_s3_read" {
  name        = "${var.name}-lambda-s3-read"
  description = "Policy for Lambda to read from S3 bucket"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:ListBucket"
        ]
        Resource = [
          var.s3_bucket_arn,
          "${var.s3_bucket_arn}/*"
        ]
      }
    ]
  })
}

resource "aws_iam_policy" "lambda_dynamodb" {
  name        = "${var.name}-lambda-dynamodb"
  description = "Policy for Lambda to read/write to DynamoDB table"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "dynamodb:GetItem",
          "dynamodb:PutItem",
          "dynamodb:UpdateItem",
          "dynamodb:DeleteItem",
          "dynamodb:Query",
          "dynamodb:Scan"
        ]
        Resource = var.dynamodb_table_arn
      }
    ]
  })
}

resource "aws_iam_policy" "lambda_basic_execution" {
  name        = "${var.name}-lambda-basic-execution"
  description = "Policy for Lambda basic execution (CloudWatch logs)"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
        Resource = "arn:aws:logs:${var.aws_region}:${data.aws_caller_identity.current.account_id}:*"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "lambda_s3_read" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.lambda_s3_read.arn
}

resource "aws_iam_role_policy_attachment" "lambda_dynamodb" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.lambda_dynamodb.arn
}

resource "aws_iam_role_policy_attachment" "lambda_basic_execution" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.lambda_basic_execution.arn
}
```
```hcl
# catalog/modules/lambda/vars-optional.tf
variable "lambda_runtime" {
  description = "Lambda function runtime"
  type        = string
  default     = "nodejs22.x"
}

variable "lambda_handler" {
  description = "Lambda function handler"
  type        = string
  default     = "index.handler"
}

variable "lambda_timeout" {
  description = "Lambda function timeout in seconds"
  type        = number
  default     = 30
}

variable "lambda_memory_size" {
  description = "Lambda function memory size in MB"
  type        = number
  default     = 128
}

variable "lambda_architectures" {
  description = "Lambda function architectures"
  type        = list(string)
  default     = ["arm64"]
}
```

```hcl
# catalog/modules/lambda/vars-required.tf
variable "name" {
  description = "Name used for all resources"
  type        = string
}

variable "aws_region" {
  description = "AWS region to deploy the resources to"
  type        = string
}

variable "lambda_zip_file" {
  description = "Path to the Lambda function zip file"
  type        = string
}

variable "lambda_role_arn" {
  description = "Lambda function role ARN"
  type        = string
}

variable "s3_bucket_name" {
  description = "S3 bucket name"
  type        = string
}

variable "dynamodb_table_name" {
  description = "DynamoDB table name"
  type        = string
}
```

```hcl
# catalog/modules/lambda/outputs.tf
output "name" {
  value = aws_lambda_function.main.function_name
}

output "arn" {
  value = aws_lambda_function.main.arn
}

output "url" {
  value = aws_lambda_function_url.main.function_url
}
```
Again, for the Lambda module we’re going to need to make updates to the `main.tf` file to account for the tight coupling between resources now that we’re wiring them together via variables and outputs.
```hcl
# catalog/modules/lambda/main.tf
resource "aws_lambda_function" "main" {
  function_name = "${var.name}-function"

  filename         = var.lambda_zip_file
  source_code_hash = filebase64sha256(var.lambda_zip_file)

  role = var.lambda_role_arn

  handler       = var.lambda_handler
  runtime       = var.lambda_runtime
  timeout       = var.lambda_timeout
  memory_size   = var.lambda_memory_size
  architectures = var.lambda_architectures

  environment {
    variables = {
      S3_BUCKET_NAME      = var.s3_bucket_name
      DYNAMODB_TABLE_NAME = var.dynamodb_table_name
    }
  }
}

resource "aws_lambda_function_url" "main" {
  function_name      = aws_lambda_function.main.function_name
  authorization_type = "NONE"
}
```
Let’s make sure that our modules have a copy of the `versions.tf` file that was in the root module (if you’re not comfortable with using the `find` command below, you can just copy the `versions.tf` file into each of the modules you’ve created so far manually). It’s a best practice for reusable modules to define their version constraints so that they can explicitly signal to module consumers when they use features from newer provider versions that might require a provider upgrade, or when they’re dodging a bug in a particular provider version that consumers should avoid.
```sh
find catalog/modules -mindepth 1 -type d -exec cp live/versions.tf {}/versions.tf \;
```
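If you don’t have a `versions.tf` from the previous step handy, its exact contents will depend on your setup; a minimal sketch, assuming the Hashicorp AWS provider and illustrative version constraints, looks something like this:

```hcl
# Hypothetical versions.tf; pin to whatever constraints your root module actually uses.
terraform {
  required_version = ">= 1.6.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}
```

Note that OpenTofu reads the `terraform` block for backwards compatibility, so this file works unchanged with both tools.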
To use these modules, we need to use OpenTofu `module` blocks to reference them in a new `main.tf` file placed in the `live` directory (the OpenTofu root module). What we’re doing here is simply instantiating each of the modules we’ve created so far: referencing each one by a relative path in the `source` attribute and setting values for its required inputs (some of which are acquired as outputs from other modules).
```hcl
# live/main.tf
module "s3" {
  source = "../catalog/modules/s3"

  name          = var.name
  force_destroy = var.force_destroy
}

module "ddb" {
  source = "../catalog/modules/ddb"

  name = var.name
}

module "iam" {
  source = "../catalog/modules/iam"

  name       = var.name
  aws_region = var.aws_region

  s3_bucket_arn      = module.s3.arn
  dynamodb_table_arn = module.ddb.arn
}

module "lambda" {
  source = "../catalog/modules/lambda"

  name       = var.name
  aws_region = var.aws_region

  s3_bucket_name      = module.s3.name
  dynamodb_table_name = module.ddb.name
  lambda_zip_file     = var.lambda_zip_file
  lambda_role_arn     = module.iam.arn
}
```
We also want to forward outputs from these modules into our root module so that we can access them from the `tofu` CLI.
```hcl
# live/outputs.tf
output "lambda_function_url" {
  description = "URL of the Lambda function"
  value       = module.lambda.url
}

output "lambda_function_name" {
  description = "Name of the Lambda function"
  value       = module.lambda.name
}

output "s3_bucket_name" {
  description = "Name of the S3 bucket for static assets"
  value       = module.s3.name
}

output "s3_bucket_arn" {
  description = "ARN of the S3 bucket for static assets"
  value       = module.s3.arn
}

output "dynamodb_table_name" {
  description = "Name of the DynamoDB table for asset metadata"
  value       = module.ddb.name
}

output "dynamodb_table_arn" {
  description = "ARN of the DynamoDB table for asset metadata"
  value       = module.ddb.arn
}

output "lambda_role_arn" {
  description = "ARN of the Lambda execution role"
  value       = module.iam.arn
}
```
We can also reduce the amount of content in the optional variables file, now that each of the modules defines the variables that matter to it. This keeps the API of each module clean, as each module exposes only the variables that specifically control it.
```hcl
# live/vars-optional.tf
variable "aws_region" {
  description = "AWS region for all resources"
  type        = string
  default     = "us-east-1"
}

variable "force_destroy" {
  description = "Force destroy S3 buckets (only set to true for testing or cleanup of demo environments)"
  type        = bool
  default     = false
}
```
After all this refactoring, we’ll want to run a `plan` to make sure we can safely apply our changes.
```sh
$ tofu init
$ tofu plan
...
Plan: 11 to add, 0 to change, 11 to destroy.
...
```
Oh no! After all our refactors, we’ve introduced changes that would completely destroy all of the infrastructure we’ve created so far!
This is a common scenario that you need to become comfortable with as you learn how to refactor and adjust IaC for scalability and maintainability. You leveraged the built-in protections of plans to give you a dry-run of your infrastructure updates, and can reason about why OpenTofu is trying to do what it’s doing here to avoid catastrophe.
We, as authors of the IaC, know that all we’ve done in this step is move some files into different directories, but as far as OpenTofu is concerned, we’ve deleted resources at addresses like the following:
```
# aws_lambda_function.main will be destroyed
# (because aws_lambda_function.main is not in configuration)
```
And introduced resources at addresses like the following:
```
# module.lambda.aws_lambda_function.main will be created
```
The reason for this is that, without some help from IaC authors, OpenTofu has no way to tell the difference between moving a resource definition from one file to another for the sake of reorganization and removing infrastructure in one place while adding it in another.
The way we communicate to OpenTofu that a resource at one address has simply moved to a new address is to introduce `moved` blocks.
For each resource that we want to move, we’ll introduce a `moved` block with a `from` of the old address (what OpenTofu reports as being destroyed in our plan) and a `to` of the equivalent new address (what OpenTofu reports as being created in our plan).
```hcl
# live/moved.tf
moved {
  from = aws_dynamodb_table.asset_metadata
  to   = module.ddb.aws_dynamodb_table.asset_metadata
}

moved {
  from = aws_iam_policy.lambda_basic_execution
  to   = module.iam.aws_iam_policy.lambda_basic_execution
}

moved {
  from = aws_iam_policy.lambda_dynamodb
  to   = module.iam.aws_iam_policy.lambda_dynamodb
}

moved {
  from = aws_iam_policy.lambda_s3_read
  to   = module.iam.aws_iam_policy.lambda_s3_read
}

moved {
  from = aws_iam_role.lambda_role
  to   = module.iam.aws_iam_role.lambda_role
}

moved {
  from = aws_iam_role_policy_attachment.lambda_basic_execution
  to   = module.iam.aws_iam_role_policy_attachment.lambda_basic_execution
}

moved {
  from = aws_iam_role_policy_attachment.lambda_dynamodb
  to   = module.iam.aws_iam_role_policy_attachment.lambda_dynamodb
}

moved {
  from = aws_iam_role_policy_attachment.lambda_s3_read
  to   = module.iam.aws_iam_role_policy_attachment.lambda_s3_read
}

moved {
  from = aws_lambda_function.main
  to   = module.lambda.aws_lambda_function.main
}

moved {
  from = aws_lambda_function_url.main
  to   = module.lambda.aws_lambda_function_url.main
}

moved {
  from = aws_s3_bucket.static_assets
  to   = module.s3.aws_s3_bucket.static_assets
}
```
It’s worth noting that we haven’t been working with any infrastructure that’s important to preserve in this demo so far. We can easily reproduce this infrastructure without much effort. It’s important to know how to perform refactors without having to recreate infrastructure, though, as we need to be able to avoid paying the penalty of outages or data loss — especially when working with production infrastructure.
If, for example, the database or S3 bucket being managed here held real customer data, it would be extremely important to avoid recreating these resources. OpenTofu doesn’t always know that recreating a stateful resource can cause permanent data loss. If you want the benefits we mentioned earlier of refactored IaC, you’ll want to know how to carefully handle state manipulation in OpenTofu and understand what it’s trying to do.
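As an aside, `moved` blocks also have an imperative CLI counterpart: `tofu state mv` rewrites a single address directly in state, which can be handy for one-off fixes. For example, the DynamoDB move above could instead be performed as:

```sh
tofu state mv \
  'aws_dynamodb_table.asset_metadata' \
  'module.ddb.aws_dynamodb_table.asset_metadata'
```

`moved` blocks are generally preferable, though, because they live in version control and get reviewed alongside the refactor itself, while `state mv` mutates state immediately with no plan to inspect first.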
So, as a small tangent, let’s discuss what actually happens when we introduce these `moved` blocks. There are multiple ways to configure OpenTofu backend state, but the way we’ve configured it here stores the state file in S3 as JSON. What our `moved` blocks do under the hood is update the content of that JSON file in `s3://[your-state-bucket]/tofu.tfstate` so that each of the affected `resources` entries in the state file uses an updated value for its resource address.
In the example of this move:
```hcl
moved {
  from = aws_dynamodb_table.asset_metadata
  to   = module.ddb.aws_dynamodb_table.asset_metadata
}
```
We updated one of the JSON objects in the state file from one that had these values:
```
# Some stuff
"mode": "managed",
"type": "aws_dynamodb_table",
"name": "asset_metadata",
"provider": "provider[\"registry.opentofu.org/hashicorp/aws\"]",
# More stuff
```
To one that had these values:
```
# Some stuff
"module": "module.ddb",
"mode": "managed",
"type": "aws_dynamodb_table",
"name": "asset_metadata",
"provider": "provider[\"registry.opentofu.org/hashicorp/aws\"]",
# More stuff
```
When OpenTofu wants to know the current state of `aws_dynamodb_table.asset_metadata`, it can look it up using the first value, and when it wants to look up the state of `module.ddb.aws_dynamodb_table.asset_metadata`, it uses the second value.
By moving the value in state, we’re just telling OpenTofu that we’re calling the resource by a different name now, without actually changing anything in AWS.
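To make that lookup concrete, here’s a small Python sketch (purely illustrative, not part of OpenTofu) of how a resource address can be derived from a state-file entry: the optional `module` key is simply prefixed onto `type.name`.

```python
def resource_address(entry: dict) -> str:
    """Derive a resource address from a state-file resource entry.

    A top-level resource is addressed as "type.name"; a resource that
    lives inside a module gains the module path from the "module" key
    as a prefix.
    """
    base = f'{entry["type"]}.{entry["name"]}'
    if "module" in entry:
        return f'{entry["module"]}.{base}'
    return base


# The same resource entry before and after the move:
before = {"mode": "managed", "type": "aws_dynamodb_table", "name": "asset_metadata"}
after = {**before, "module": "module.ddb"}

print(resource_address(before))  # aws_dynamodb_table.asset_metadata
print(resource_address(after))   # module.ddb.aws_dynamodb_table.asset_metadata
```

Nothing about the resource itself changed between the two entries; only the address OpenTofu uses to find it did.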
Project Layout Check-in
You should have a filesystem layout that looks like the following for your IaC now:
```
catalog/
└── modules/
    ├── ddb/
    │   ├── main.tf
    │   ├── outputs.tf
    │   ├── vars-required.tf
    │   └── versions.tf
    ├── iam/
    │   ├── data.tf
    │   ├── main.tf
    │   ├── outputs.tf
    │   ├── vars-required.tf
    │   └── versions.tf
    ├── lambda/
    │   ├── main.tf
    │   ├── outputs.tf
    │   ├── vars-optional.tf
    │   ├── vars-required.tf
    │   └── versions.tf
    └── s3/
        ├── main.tf
        ├── outputs.tf
        ├── vars-optional.tf
        ├── vars-required.tf
        └── versions.tf
live/
├── backend.tf
├── main.tf
├── moved.tf
├── outputs.tf
├── providers.tf
├── vars-optional.tf
├── vars-required.tf
└── versions.tf
```
Applying Updates
You can now run `tofu apply` with no changes (don’t worry, you’ll get a chance to confirm you want to proceed before you have to commit to anything).
```sh
# live
$ tofu apply
...
Plan: 0 to add, 0 to change, 0 to destroy.
...

Do you want to perform these actions?
  OpenTofu will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value:
```
Trade-Offs
Before moving on to the next step, where we’ll duplicate our entire infrastructure estate to introduce a new development environment, it’s important to pause here and evaluate the trade-offs of this refactor.
The IaC in step 1 and step 2 provisions the exact same infrastructure (remember that the plan reported `0 to add, 0 to change, 0 to destroy.`). In fact, with the exception of the next step, where we introduce the new `dev` environment, every step will result in the exact same infrastructure being provisioned.
Why then is this refactor valuable? What do we gain by refactoring our IaC like this? What do we trade away in exchange?
- Abstraction by encapsulation. Instead of one large set of variables that could be used by any resource, or one large set of resources that could interact in ways that are difficult to understand, there are modules that encapsulate subsets of infrastructure so that they have explicit interfaces via variables and outputs.
- More code reusability. Each of these modules can be reused in `live` infrastructure or in other `catalog` modules (which we’ll see in the next step).
- Increased complexity. Instead of one self-contained directory with files directly defining resources to be provisioned, there’s a layer of indirection via modules. As someone consuming the module, you have to either trust it has been authored well (and that it’s well documented, tested, etc.) or vet the module yourself.
- State Adjustment. State manipulation or resource recreation is required to migrate to this pattern.
Every subsequent stage is going to continue incurring trade-offs. You (or someone experienced you trust) must decide whether these trade-offs are appropriate for your organization and your infrastructure estate.
Wrap Up
This was a significant refactoring step. You’ve transformed your flat configuration into a set of distinct, reusable modules, each with a well-defined API of variables and outputs.
The most critical lesson here was mastering the `moved` block. This powerful feature allowed you to completely reorganize your code’s structure without OpenTofu needing to destroy and recreate your existing infrastructure, a vital skill for managing real-world infrastructure. While this adds a layer of indirection, the trade-off is greater code reusability and clearer separation of concerns. With this new modular structure, you’re now perfectly positioned to create a second environment with ease.