Terraform-Based Automation for AWS EKS Cluster and VPC Deployment

Mary Wangoi
Jun 20, 2024

Introduction

As cloud-native applications become increasingly complex, managing infrastructure manually can be a daunting task.

Automation tools like Terraform offer a powerful solution, enabling us to define and provision our infrastructure as code.

In this project, I’ll walk you through the process of setting up an Amazon Elastic Kubernetes Service (EKS) cluster with autoscaling managed node groups and a custom Virtual Private Cloud (VPC) using Terraform.

We’ll cover:

  1. Creating a VPC tailored for our EKS cluster.
  2. Configuring security groups to control network traffic.
  3. Setting up the EKS cluster for container orchestration.
  4. Implementing autoscaling for worker nodes to handle varying workloads.

By the end of this guide, you’ll have a scalable, secure Kubernetes environment on AWS, ready to deploy your applications. Let’s dive in!

Feel free to connect with me on LinkedIn to discuss this post, or ask any questions.

Prerequisites

Before diving into setting up your EKS cluster with Terraform, ensure you have the following prerequisites in place:

  1. AWS Account: An active AWS account is essential. If you don’t have one, you can sign up at aws.amazon.com.
  2. AWS CLI: The AWS Command Line Interface (CLI) should be installed and configured on your local machine. You can follow the installation guide here. Ensure you’ve run aws configure to set up your credentials.
  3. Terraform: Install Terraform on your local machine. The installation instructions can be found here.
  4. kubectl: The Kubernetes command-line tool, kubectl, is required to interact with your EKS cluster. Install it following the instructions here.
  5. Basic Knowledge of Kubernetes and Terraform: Familiarity with Kubernetes concepts and Terraform’s configuration syntax will help you follow along with the steps in this guide.

Once you have all these prerequisites set up, you’re ready to start creating your infrastructure.
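
As a quick sanity check, you can confirm that each tool is installed and that your AWS credentials are valid with a few commands:

aws --version
terraform -version
kubectl version --client
aws sts get-caller-identity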

Let’s move on to setting up the project structure.

Project Setup

Let’s set up the project structure and create the necessary Terraform files. Follow these steps to organize your project directory:

1: Create Project Directory: Start by creating a new directory for your Terraform project.

mkdir eks-terraform
cd eks-terraform

2: Create Terraform Files: Inside the eks-terraform directory, create the following files:

  • eks-cluster.tf: Defines the EKS cluster and node group resources.
  • outputs.tf: Defines the outputs to display important information after deployment.
  • security-group.tf: Defines the security groups for the EKS control plane and worker nodes.
  • variables.tf: Defines the variables used in your Terraform configuration.
  • vpc.tf: Defines the VPC and subnet resources.
  • versions.tf: Specifies the required Terraform version and provider details.
  • .gitignore: Specifies files and directories to ignore in version control.

Summary

Your project directory should now look like this:

eks-terraform/
├── eks-cluster.tf
├── outputs.tf
├── security-group.tf
├── variables.tf
├── vpc.tf
├── versions.tf
└── .gitignore

EKS Cluster Creation

In this section, we’ll use Terraform to create the EKS cluster.

We will be leveraging a pre-built Terraform module from the Terraform Registry to simplify the process.

Terraform modules are self-contained packages of Terraform configuration files that are managed together.

They allow you to encapsulate and reuse resource configurations across different parts of your infrastructure.

Using modules has several advantages such as reusability, maintainability, consistency, and abstraction.

Inside the eks-cluster.tf file, paste this code:

module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "20.8.4"
cluster_name = local.cluster_name
cluster_version = var.kubernetes_version
subnet_ids = module.vpc.private_subnets

enable_irsa = true

tags = {
cluster = "demo"
}

vpc_id = module.vpc.vpc_id

eks_managed_node_group_defaults = {
ami_type = "AL2_x86_64"
instance_types = ["t3.medium"]
vpc_security_group_ids = [aws_security_group.all_worker_mgmt.id]
}

eks_managed_node_groups = {
node_group = {
min_size = 2
max_size = 6
desired_size = 2
}
}
}

Explanation of the Code:

1: Module Source and Version:

source = "terraform-aws-modules/eks/aws"
version = "20.8.4"

This specifies the source of the module and the version to use.

The terraform-aws-modules/eks/aws module from the Terraform Registry simplifies the creation of EKS clusters.

Version 20.8.4 is specified to ensure compatibility and stability.

2: Cluster Name and Version:

cluster_name    = local.cluster_name
cluster_version = var.kubernetes_version
  • local.cluster_name: This local variable holds the name of the EKS cluster.
  • var.kubernetes_version: This variable specifies the Kubernetes version to be used in the EKS cluster.

3: Subnets and VPC ID:

subnet_ids = module.vpc.private_subnets
vpc_id = module.vpc.vpc_id
  • module.vpc.private_subnets: This references the private subnets created in the VPC module.
  • module.vpc.vpc_id: This references the VPC ID created in the VPC module.

4: IRSA (IAM Roles for Service Accounts):

enable_irsa = true

This enables IAM Roles for Service Accounts (IRSA), which allows for fine-grained access control on AWS resources for applications running on EKS.
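
As an illustration of how IRSA is typically consumed (this snippet is not part of the project files; the s3-reader names and the module's oidc_provider output are assumptions for the sketch), a hypothetical IAM role could trust the cluster's OIDC provider and be restricted to a single Kubernetes service account:

data "aws_iam_policy_document" "s3_reader_assume" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [module.eks.oidc_provider_arn]
    }

    # Hypothetical: only the "s3-reader" service account in the "default" namespace may assume this role
    condition {
      test     = "StringEquals"
      variable = "${module.eks.oidc_provider}:sub"
      values   = ["system:serviceaccount:default:s3-reader"]
    }
  }
}

resource "aws_iam_role" "s3_reader" {
  name               = "eks-s3-reader"
  assume_role_policy = data.aws_iam_policy_document.s3_reader_assume.json
}

A role like this is then referenced from the Kubernetes service account via the eks.amazonaws.com/role-arn annotation.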

5: Tags:

tags = {
cluster = "demo"
}
  • This adds tags to the EKS cluster for identification and management.

6: Managed Node Group Defaults:

eks_managed_node_group_defaults = {
  ami_type               = "AL2_x86_64"
  instance_types         = ["t3.medium"]
  vpc_security_group_ids = [aws_security_group.all_worker_mgmt.id]
}

This block sets default configurations for EKS managed node groups:

  • ami_type: Specifies the Amazon Linux 2 AMI for the nodes.
  • instance_types: Specifies the instance type for the nodes (e.g., t3.medium).
  • vpc_security_group_ids: Associates the worker nodes with the specified security group.

7: Managed Node Groups:

eks_managed_node_groups = {
  node_group = {
    min_size     = 2
    max_size     = 6
    desired_size = 2
  }
}

This block defines the managed node group configuration:

  • min_size: Minimum number of nodes in the node group.
  • max_size: Maximum number of nodes in the node group.
  • desired_size: Desired number of nodes in the node group.

VPC Creation

A Virtual Private Cloud (VPC) is a virtual network environment within a cloud provider like AWS that allows you to logically isolate and manage your resources securely.

It provides a customizable network configuration, including IP address ranges, subnets, route tables, and security settings, enabling you to create a virtual network infrastructure that closely resembles a traditional data center.

In this step, we create a VPC using Terraform. The VPC is essential for isolating and controlling network resources for your Amazon EKS cluster.

Here’s the provided code for vpc.tf, followed by a detailed explanation of each part:

provider "aws" {
region = var.aws_region
}

data "aws_availability_zones" "available" {}

locals {
cluster_name = "abhi-eks-${random_string.suffix.result}"
}

resource "random_string" "suffix" {
length = 8
special = false
}

module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.7.0"

name = "abhi-eks-vpc"
cidr = var.vpc_cidr
azs = data.aws_availability_zones.available.names
private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
enable_dns_support = true

tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
}

public_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/elb" = "1"
}

private_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = "1"
}
}

Explanation of the code:

1: AWS Provider:

provider "aws" {
region = var.aws_region
}

provider “aws”: Specifies that we are using the AWS provider and sets the region using a variable var.aws_region.

2: Data Source for Availability Zones:

data "aws_availability_zones" "available" {}

data “aws_availability_zones” “available”: Fetches the available AWS availability zones in the specified region.

3: Local Values

locals {
cluster_name = "abhi-eks-${random_string.suffix.result}"
}

locals: Defines local values. Here, it creates a unique cluster_name by appending a random string suffix.

4: Random String Resource

resource "random_string" "suffix" {
length = 8
special = false
}

resource “random_string” “suffix”: Generates a random string of 8 characters, excluding special characters, used to ensure unique resource names.

5: VPC Module

module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.7.0"
  • source: Specifies the source of the module terraform-aws-modules/vpc/aws, which is located in the Terraform Registry.
  • version: Specifies the version of the module to use (5.7.0). This ensures consistency and compatibility with the module's features and functionality.
  name                 = "abhi-eks-vpc"
cidr = var.vpc_cidr
azs = data.aws_availability_zones.available.names
  • name: Sets the name of the VPC to "abhi-eks-vpc".
  • cidr: Specifies the CIDR block for the VPC, retrieved from the var.vpc_cidr variable. This defines the IP address range for the VPC.
  • azs: Specifies the availability zones (AZs) where subnets will be created. This data is fetched using data.aws_availability_zones.available.names, which lists all available AZs in the specified region.
  private_subnets      = ["10.0.1.0/24", "10.0.2.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24"]
  • private_subnets: Defines the CIDR blocks for private subnets within the VPC. Private subnets are typically used for resources that do not need direct access to the internet.
  • public_subnets: Defines the CIDR blocks for public subnets within the VPC. Public subnets are used for resources that require internet access, such as load balancers or web servers.
  enable_nat_gateway   = true
single_nat_gateway = true

enable_nat_gateway : Enables a NAT gateway for outbound internet traffic from private subnets.

single_nat_gateway : Ensures that all private subnets share a single NAT gateway to minimize costs.

  enable_dns_hostnames = true
enable_dns_support = true

enable_dns_hostnames and enable_dns_support: Enable DNS hostnames and DNS resolution within the VPC, allowing instances in the VPC to resolve public DNS hostnames and receive DNS requests from the VPC itself.

6: Tags

tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
}

tags: Tags are metadata assigned to AWS resources for organization and identification purposes. Here, tags are applied to the VPC itself, indicating it belongs to the Kubernetes cluster named ${local.cluster_name}.

public_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  "kubernetes.io/role/elb"                      = "1"
}

private_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  "kubernetes.io/role/internal-elb"             = "1"
}
}

public_subnet_tags and private_subnet_tags: Tags applied specifically to public and private subnets, respectively. These tags help in identifying subnet roles within the Kubernetes cluster environment.

Security Group Creation

Security groups are virtual firewalls that control inbound and outbound traffic for instances. They are a fundamental component of network security, providing traffic control, stateful filtering, and instance-level protection, with the flexibility to scale as your infrastructure grows.

  • Ingress Rule: Allows inbound traffic from the specified CIDR blocks, which are private IP address ranges commonly used within networks.
  • Egress Rule: Permits outbound traffic to any destination (0.0.0.0/0), allowing instances associated with the security group to communicate freely with the internet and other resources.

Here’s the Terraform code from security-group.tf that creates an AWS security group and associated rules:

resource "aws_security_group" "all_worker_mgmt" {
name_prefix = "all_worker_management"
vpc_id = module.vpc.vpc_id
}

resource "aws_security_group_rule" "all_worker_mgmt_ingress" {
description = "allow inbound traffic from eks"
from_port = 0
protocol = "-1"
to_port = 0
security_group_id = aws_security_group.all_worker_mgmt.id
type = "ingress"
cidr_blocks = [
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
]
}

resource "aws_security_group_rule" "all_worker_mgmt_egress" {
description = "allow outbound traffic to anywhere"
from_port = 0
protocol = "-1"
security_group_id = aws_security_group.all_worker_mgmt.id
to_port = 0
type = "egress"
cidr_blocks = ["0.0.0.0/0"]
}

Explanation of the Code

1: aws_security_group Resource

aws_security_group.all_worker_mgmt: Defines an AWS security group named all_worker_management within the VPC specified by module.vpc.vpc_id. This security group will govern inbound and outbound traffic rules for associated instances.

2: aws_security_group_rule Resources

  • aws_security_group_rule.all_worker_mgmt_ingress: Specifies an ingress (inbound) rule for the all_worker_mgmt security group. It allows all traffic (protocol "-1", all ports) from the specified private CIDR blocks (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
  • aws_security_group_rule.all_worker_mgmt_egress: Defines an egress (outbound) rule for the all_worker_mgmt security group. It permits all outbound traffic (protocol "-1", all ports) to any destination (0.0.0.0/0).

These security group rules ensure that instances within the associated VPC can communicate securely based on specified network traffic requirements.

Configure variables.tf

The variables.tf file defines the input variables that can be used to customize the Terraform configuration.

Here’s the provided code with a brief explanation of each variable:

variable "kubernetes_version" {
default = 1.27
description = "Kubernetes version"
}

variable "vpc_cidr" {
default = "10.0.0.0/16"
description = "Default CIDR range of the VPC"
}

variable "aws_region" {
default = "us-west-1"
description = "AWS region"
}

Explanation of the Variables:

1: kubernetes_version

  • Purpose: Specifies the version of Kubernetes to use for the EKS cluster.
  • Default Value: "1.27" (quoted as a string so that versions like 1.30 are not truncated by numeric conversion)
  • Description: This variable allows you to specify which version of Kubernetes will be deployed in the EKS cluster. Keeping Kubernetes versions up-to-date ensures compatibility with the latest features and security patches.

2: vpc_cidr

  • Purpose: Defines the CIDR (Classless Inter-Domain Routing) block for the VPC.
  • Default Value: 10.0.0.0/16
  • Description: This variable sets the IP address range for the VPC. The 10.0.0.0/16 range provides 65,536 IP addresses, which is typically sufficient for most use cases. This range can be adjusted based on your network requirements.

3: aws_region

  • Purpose: Specifies the AWS region where the infrastructure will be deployed.
  • Default Value: us-west-1
  • Description: This variable determines the AWS region for deploying your resources. Regions are geographic locations that allow you to place resources closer to your users or meet specific legal and regulatory requirements.

By defining these variables, you can easily modify your configuration without changing the actual Terraform code, promoting reusability and flexibility.
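
For example, instead of editing variables.tf, you can override any of these defaults in a terraform.tfvars file (a hypothetical example; note that *.tfvars files are excluded from version control by the .gitignore covered later):

kubernetes_version = "1.27"
vpc_cidr           = "10.0.0.0/16"
aws_region         = "us-east-1"

Terraform loads terraform.tfvars automatically; individual values can also be passed ad hoc with the -var command-line flag.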

Configure outputs.tf

The outputs.tf file defines the outputs of your Terraform configuration, which are useful for obtaining information about the created resources. Here's the provided code with a brief explanation of each output:

output "cluster_id" {
description = "EKS cluster ID."
value = module.eks.cluster_id
}

output "cluster_endpoint" {
description = "Endpoint for EKS control plane."
value = module.eks.cluster_endpoint
}

output "cluster_security_group_id" {
description = "Security group ids attached to the cluster control plane."
value = module.eks.cluster_security_group_id
}

output "region" {
description = "AWS region"
value = var.aws_region
}

output "oidc_provider_arn" {
value = module.eks.oidc_provider_arn
}

output "zz_update_kubeconfig_command" {
value = "aws eks update-kubeconfig --name " + module.eks.cluster_id
value = format("%s %s %s %s", "aws eks update-kubeconfig --name", module.eks.cluster_id, "--region", var.aws_region)
}

Explanation of Outputs

1: cluster_id

  • Description: The ID of the EKS cluster.
  • Value: Retrieved from module.eks.cluster_id.
  • Purpose: This output provides the unique identifier for the EKS cluster, which is essential for managing and referencing the cluster.

2: cluster_endpoint

  • Description: The endpoint for the EKS control plane.
  • Value: Retrieved from module.eks.cluster_endpoint.
  • Purpose: This output provides the URL endpoint to interact with the EKS control plane. It’s necessary for configuring kubectl and other Kubernetes tools to communicate with the cluster.

3: cluster_security_group_id

  • Description: The security group IDs attached to the cluster control plane.
  • Value: Retrieved from module.eks.cluster_security_group_id.
  • Purpose: This output provides the security group IDs, which are crucial for managing network access to the EKS control plane.

4: region

  • Description: The AWS region where the infrastructure is deployed.
  • Value: Retrieved from var.aws_region.
  • Purpose: This output confirms the AWS region used for the deployment, which can be useful for cross-referencing and managing resources in a multi-region setup.

5: oidc_provider_arn

  • Description: The ARN of the OIDC provider associated with the EKS cluster.
  • Value: Retrieved from module.eks.oidc_provider_arn.
  • Purpose: This output provides the Amazon Resource Name (ARN) for the OIDC provider, used for integrating IAM roles with Kubernetes service accounts for fine-grained access control.

6: zz_update_kubeconfig_command

  • Description: A command to update the kubeconfig file with the EKS cluster configuration.
  • Value: Constructed with format() from the cluster ID and AWS region.
  • Purpose: This output gives you a ready-to-use command for updating the local kubeconfig file so that kubectl and other tools can communicate with the EKS cluster.

By defining these outputs, you make it easier to access and utilize key details of your infrastructure, facilitating management and integration with other tools and services.

Configure versions.tf

The versions.tf file sets the required versions for Terraform and the necessary providers.

This ensures compatibility and stability across your Terraform project. Here’s a breakdown of the provided code:

terraform {
  required_version = ">= 0.12"

  required_providers {
    random = {
      source  = "hashicorp/random"
      version = "~> 3.1.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.7.1"
    }
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.68.0"
    }
    local = {
      source  = "hashicorp/local"
      version = "~> 2.1.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.1.0"
    }
    cloudinit = {
      source  = "hashicorp/cloudinit"
      version = "~> 2.2.0"
    }
  }
}

Explanation of versions.tf

1: required_version

  • Purpose: Specify the minimum version of Terraform required to run the configuration.
  • Value: >= 0.12 ensures compatibility with Terraform version 0.12 and newer. Terraform 0.12 introduced many new features and improvements, so this ensures those features are available.

2: required_providers

  • Purpose: Defines the required providers and their versions for the configuration.

Providers:

random:

  • Source: hashicorp/random
  • Version: ~> 3.1.0 (any 3.1.x release, 3.1.0 or newer)

kubernetes:

  • Source: hashicorp/kubernetes
  • Version: >= 2.7.1 (version 2.7.1 or newer)

aws:

  • Source: hashicorp/aws
  • Version: >= 3.68.0 (version 3.68.0 or newer)

local:

  • Source: hashicorp/local
  • Version: ~> 2.1.0 (any 2.1.x release, 2.1.0 or newer)

null:

  • Source: hashicorp/null
  • Version: ~> 3.1.0 (any 3.1.x release, 3.1.0 or newer)

cloudinit:

  • Source: hashicorp/cloudinit
  • Version: ~> 2.2.0 (any 2.2.x release, 2.2.0 or newer)

Backend Configuration (Optional)

If you are using a remote backend (e.g., S3 for state files), you should configure it in a backend.tf file.

A remote backend in Terraform is a way to store the Terraform state file remotely, rather than locally on your machine.

This approach has several advantages, especially when working in a team or when managing large-scale infrastructure.

Benefits of Using a Remote Backend

State Management:

  • Centralized State: Keeps the Terraform state file in a central location, enabling multiple team members to work on the same infrastructure without conflicts.
  • State Locking: Prevents simultaneous updates to the state file, reducing the risk of corruption. Supports features like AWS S3 with DynamoDB for state locking.

Collaboration:

  • Team Collaboration: Allows team members to share and manage infrastructure state efficiently, with everyone accessing the latest state.
  • Versioning and History: Maintains a history of state changes, allowing rollbacks to previous states if needed.

Security:

  • Secure Storage: Stores state files securely in solutions like AWS S3, with encryption and access controls.
  • Backup: Provides reliable backup for your state files.

Scalability:

  • Scalable Management: Makes it easier to manage larger and more complex infrastructures by ensuring state consistency and reliability.

Inside the backend.tf file paste this code:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "path/to/my/key"
    region         = "us-west-1"
    dynamodb_table = "my-lock-table"
    encrypt        = true
  }
}
  • bucket: The name of the S3 bucket where the state file will be stored.
  • key: The path within the bucket to the state file.
  • region: The AWS region where the S3 bucket is located.
  • dynamodb_table: The DynamoDB table used for state locking.
  • encrypt: Ensures that the state file is encrypted in S3.
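
Note that the backend block does not create the S3 bucket; the bucket must already exist before you run terraform init. A minimal, hypothetical bootstrap (created manually or in a separate configuration, and assuming AWS provider v4 or newer for the aws_s3_bucket_versioning resource) could look like this:

resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-terraform-state-bucket" # must match the bucket name in backend.tf
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled" # keeps a history of state files so earlier versions can be recovered
  }
}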

DynamoDB Table Configuration (only needed if you configured the remote backend above)

Amazon DynamoDB is a fully managed NoSQL database service provided by AWS, designed to deliver low-latency performance at any scale.

In addition to its typical use cases, a DynamoDB table is often utilized in AWS environments as part of Terraform’s remote backend configuration.

Specifically, it plays a critical role in state locking, ensuring that only one Terraform process can modify the state file at any given time.

This mechanism prevents concurrent modifications and maintains the integrity of infrastructure deployments managed by Terraform.

Paste this code inside the dynamodb.tf file:

resource "aws_dynamodb_table" "terraform_lock" {
name = "my-lock-table" // Match the name used in backend.tf
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"

attribute {
name = "LockID"
type = "S"
}

tags = {
Name = "my-lock-table" // Tag the DynamoDB table for identification
}
}

Explanation of the DynamoDB Table

  • name = "my-lock-table": Specifies the name of the DynamoDB table as "my-lock-table", matching the name used in backend.tf for dynamodb_table.
  • billing_mode = "PAY_PER_REQUEST": Sets the billing mode to "PAY_PER_REQUEST", which is suitable for on-demand usage.
  • hash_key = "LockID": Defines "LockID" as the hash key attribute for the DynamoDB table.
  • attribute {}: Defines the attribute "LockID" as a string ("S" type) to be used as the hash key.
  • tags = { Name = "my-lock-table" }: Tags the DynamoDB table with a name for identification purposes, ensuring clarity in your AWS resources.
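
As a quick check after applying, you can confirm the lock table exists with the AWS CLI (the table name and region match the configuration above):

aws dynamodb describe-table --table-name my-lock-table --region us-west-1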

Default State Management in Terraform

When using Terraform, managing state files is crucial for tracking the state of your infrastructure resources and ensuring consistency across deployments. Terraform offers flexibility in how state files are managed:

Remote Backend and DynamoDB Configuration

If you configure a remote backend, such as AWS S3 with DynamoDB for state locking:

  • State Locking: Ensures that only one Terraform process can modify the state file at a time, preventing concurrent modifications and maintaining infrastructure integrity.
  • High Availability: Provides a centralized location for storing state files, accessible to team members for collaborative infrastructure management.

Default State Management

If you do not configure a remote backend and DynamoDB table:

  • Local State Files: Terraform automatically manages state files locally in a file named terraform.tfstate.
  • Single User Operations: Suitable for individual use or small teams where concurrent modifications are not a concern.
  • Limitations: May lead to conflicts and state file corruption in collaborative environments or complex infrastructure setups.

Initialize and Validate Configuration

Before applying your Terraform configuration to provision or modify infrastructure, it’s essential to initialize and validate your Terraform configuration files.

This step ensures that your environment is properly set up and that the configuration is syntactically correct and ready for deployment.

  1. Initialize Terraform

To initialize Terraform, you run the following command in your Terraform project directory:

terraform init

Explanation:

  • Initialization: Downloads the necessary providers and modules specified in your configuration files (for example, the provider block in vpc.tf and the constraints in versions.tf), setting up your environment for Terraform operations.
  • Backend Configuration: If using a remote backend (backend.tf), initializes connection settings like AWS S3 bucket and DynamoDB table for state storage and locking.

2. Validate Configuration

After initialization, validate your Terraform configuration files using:

terraform validate

Explanation:

  • Validation: Checks the syntax and structure of your Terraform configuration files (*.tf files) to ensure they are correctly formatted and compliant with Terraform's expected syntax.
  • Errors and Warnings: Identifies any errors or warnings in your configuration that could prevent successful deployment.
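
A typical pre-flight run chains these steps; terraform fmt is optional but keeps all .tf files consistently formatted:

terraform fmt -recursive   # normalize formatting of all .tf files
terraform init             # download providers and modules, configure the backend
terraform validate         # prints "Success! The configuration is valid." when the checks pass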

Finalize Configuration

After initializing and validating your Terraform configuration, the typical next steps involve planning, applying changes, and managing your infrastructure.

1: Plan Changes

terraform plan
  • Purpose: Generates an execution plan showing what Terraform will do when you apply your configuration.
  • Review: Review the plan to understand the actions Terraform will take, such as creating, modifying, or deleting resources.
  • Validation: Ensures your configuration is correct and aligns with your intentions before making changes to your infrastructure.

2: Apply Changes

terraform apply
  • Execution: Applies the changes defined in your configuration to your infrastructure.
  • Interactive: Confirms changes before applying unless automated with -auto-approve.
  • State Management: Updates the Terraform state file (terraform.tfstate or remote backend) to reflect the current state of deployed resources.
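
A common variation is to save the plan to a file and then apply exactly that plan, which guarantees that what you reviewed is what gets applied:

terraform plan -out=tfplan   # write the reviewed plan to a local file
terraform apply tfplan       # apply exactly the saved plan without a further prompt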

3: Review Outputs

terraform output
  • Outputs: Displays any outputs defined in output.tf, such as resource IDs, endpoints, or other relevant information.
  • Verification: Use outputs to verify successful deployment and gather necessary information for further operations or integrations.
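
For example, after a successful apply you can run the command surfaced by the zz_update_kubeconfig_command output to point kubectl at the new cluster (the cluster name below is illustrative, since yours will include the random suffix):

aws eks update-kubeconfig --name abhi-eks-a1b2c3d4 --region us-west-1
kubectl get nodes   # the managed node group instances should report as Ready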

4: Manage State

terraform state
  • State Management: Provides commands to manage Terraform state, such as listing resources, moving states, or deleting resources from the state.
  • Maintenance: Useful for troubleshooting, state file maintenance, or recovering from errors.
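
For example, to see everything Terraform is tracking and inspect a single entry (the resource address is a placeholder; use real addresses from the list output):

terraform state list                      # list every resource recorded in the state
terraform state show <resource-address>   # show the recorded attributes of one resource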

Managing Terraform Configurations with .gitignore

When using Terraform to manage infrastructure as code, it’s essential to properly manage your project’s version control.

One critical aspect of this is the .gitignore file, which helps ensure that sensitive or unnecessary files are not included in your Git repository.

Importance of .gitignore

  • Sensitive Information: Excludes sensitive data such as credentials, API keys, and state files (terraform.tfstate or .terraform/) from being tracked by version control.
  • Temporary Files: Ignores temporary files, logs, and artifacts generated during Terraform operations, reducing repository clutter.
  • Enhanced Security: Protects sensitive information from being exposed inadvertently in version control history or during collaboration.

Paste this code inside the .gitignore file:

# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# Crash log files
crash.log
crash.*.log

# Exclude all .tfvars files, which are likely to contain sensitive data, such as
# password, private keys, and other secrets. These should not be part of version
# control as they are data points which are potentially sensitive and subject
# to change depending on the environment.
*.tfvars
*.tfvars.json

# Ignore override files as they are usually used to override resources locally and so
# are not checked in
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Include override files you do wish to add to version control using negated pattern
# !example_override.tf

# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
# example: *tfplan*

# Ignore CLI configuration files
.terraformrc
terraform.rc

Explanation of .gitignore:

1: Local .terraform directories

**/.terraform/*

Excludes any .terraform directories and their contents from being committed to Git. These directories contain local Terraform plugins and modules.

2: .tfstate files

*.tfstate
*.tfstate.*

Ignores Terraform state files (*.tfstate) and any backup or temporary state files (*.tfstate.*) that may be generated during Terraform operations.

3: Crash log files

crash.log
crash.*.log

Excludes crash log files (crash.log and crash.*.log) from being tracked. These files are generated in case of Terraform command crashes or errors.

4: .tfvars files

*.tfvars
*.tfvars.json

Ignores Terraform variable files (*.tfvars and *.tfvars.json) which may contain sensitive data such as passwords, API keys, or other secrets.

5: Override files

override.tf
override.tf.json
*_override.tf
*_override.tf.json

Excludes override files (override.tf, override.tf.json, *_override.tf, *_override.tf.json) typically used to locally override Terraform configurations.

6: Include override files in version control

# Include override files you do wish to add to version control using negated pattern
# !example_override.tf

Provides a comment showing how to include specific override files (!example_override.tf) in version control if needed.

7: Include tfplan files

# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
# example: *tfplan*

A commented-out example showing how to ignore saved plan files (for instance, add *tfplan* to ignore the output of terraform plan -out=tfplan).

8: CLI configuration files

.terraformrc
terraform.rc

Ignores CLI configuration files (.terraformrc and terraform.rc) which may contain local or user-specific Terraform configurations.

Validate Resource Creation via the AWS Management Console (UI)

1. Login to AWS Console

2. Navigate to Each Resource Type

EC2 Instances:

  • Services: Click on “Services” in the top-left corner.
  • Compute: Select “EC2” under the Compute section.
  • Instances: Verify that EC2 instances created by Terraform are listed under “Instances”.
  • Details: Click on an instance to view its details, including instance type, status, security groups, and tags.

VPC (Virtual Private Cloud):

  • Networking & Content Delivery: Click on “Services” and then “VPC”.
  • Your VPCs: Ensure that the VPC created by Terraform is listed under “Your VPCs”.
  • Subnets: Verify that the subnets (public and private) associated with your VPC are correctly configured.

Security Groups:

  • Networking & Content Delivery: Click on “Services” and then “EC2”.
  • Security Groups: Navigate to “Security Groups” under the “Network & Security” section.
  • Verify: Check that the security groups created by Terraform are listed and configured with the expected inbound and outbound rules.

EKS Cluster:

  • Containers: Select “Elastic Kubernetes Service” under the Containers section.
  • Clusters: Ensure that the EKS cluster created by Terraform is listed under “Clusters”.
  • Details: Click on the cluster name to review its details, including Kubernetes version, endpoint, node groups, and associated resources.

IAM Roles and Policies:

  • Security, Identity, & Compliance: Click on “Services” and then “IAM”.
  • Roles/Policies: Verify that IAM roles and policies created by Terraform are present under “Roles” and “Policies” respectively.
  • Details: Click on a role or policy to inspect its permissions, trust relationships, and usage.

3. Review and Confirm Configuration

  • Settings: For each resource type, review its configuration settings to ensure they match what you defined in your Terraform scripts (*.tf files).
  • Tags: Check that tags applied to resources are correctly assigned and labeled.

4. Functional Testing (Optional)

  • Interact: If applicable, interact with deployed resources to validate functionality, such as accessing EC2 instances via SSH, accessing Kubernetes cluster via kubectl, or testing network connectivity.
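
A quick smoke test of the cluster from your terminal might look like this (assuming kubectl has already been pointed at the cluster with aws eks update-kubeconfig):

kubectl get nodes -o wide   # worker nodes from the managed node group should be Ready
kubectl get pods -A         # system pods such as coredns, kube-proxy, and aws-node should be Running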

Clean Up Resources

1: Identify Resources

  • List all resources provisioned by Terraform, including EC2 instances, VPCs, EKS clusters, IAM roles, and any other resources managed through your Terraform scripts.
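
Because every resource in this project is managed by Terraform, the simplest cleanup path is usually to let Terraform tear down everything it created; the manual console steps below then serve as verification that nothing was left behind:

terraform destroy   # review the destruction plan, then confirm to delete all managed resources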

2: Terminate EC2 Instances

  • AWS Management Console: Navigate to “Services” > “EC2” > “Instances”.
  • Select instances created by Terraform, verify their IDs, and terminate them.
  • CLI: Use aws ec2 terminate-instances --instance-ids <instance-id> to terminate instances via the AWS CLI if preferred.

3: Delete EKS Cluster

  • AWS Management Console: Navigate to “Services” > “Elastic Kubernetes Service”.
  • Select the EKS cluster created by Terraform, and choose “Delete” to remove it.
  • Be sure to delete associated resources such as node groups and any add-ons, if applicable.

4: Remove VPC and Subnets

  • AWS Management Console: Navigate to “Services” > “VPC”.
  • Delete the VPC and associated subnets created by Terraform.
  • Remove any remaining resources like route tables, internet gateways, or NAT gateways associated with the VPC.

5: Delete IAM Roles and Policies

  • AWS Management Console: Navigate to “Services” > “IAM”.
  • Remove IAM roles and policies created by Terraform that are no longer needed.
  • Detach policies from roles before deleting them, if they are attached.

6: Clean Up Security Groups

  • AWS Management Console: Navigate to “Services” > “EC2” > “Security Groups”.
  • Delete any security groups created by Terraform that are no longer in use.

7: Verify Deletion

  • Double-check the AWS Management Console to ensure all resources have been deleted successfully.
  • Use the AWS CLI or SDKs to automate resource deletion if handling a large number of resources.

8: State File Management

  • Remove local Terraform state files (terraform.tfstate and terraform.tfstate.backup) if you're not using a remote backend for state management.

9: Review Billing

  • Monitor your AWS billing dashboard to ensure that resources have been successfully terminated and to verify that no unexpected charges occur.

Conclusion

This project showcased the effective use of Terraform for automating AWS infrastructure deployments.

By leveraging infrastructure as code principles, we achieved scalability, reliability, and cost efficiency.

Through automated provisioning of resources like EKS clusters, VPCs, and security groups, we ensured consistent deployments while maintaining robust security practices.

Moving forward, Terraform remains pivotal in enabling agile and efficient management of cloud infrastructure, supporting our goal of streamlined operations and optimized resource utilization.

Connect with me

For more insights and future projects, feel free to connect with me on LinkedIn, GitHub, and Medium.
