Using Terraform to Deploy Templates to VMs in Proxmox

In this post I am going to provide the code and instructions necessary to deploy multiple VMs onto a single Proxmox node in your cluster. You can copy and paste the code into your own files, or you can clone my Git repository here. Make sure that you edit the files to match your environment.

This is another post that was inspired by Austin at AustinsNerdyThings.com … Thanks Austin.

Terraform is an IaC (Infrastructure as Code) tool that has become the de facto standard for IaC deployments. It is lightweight and uses .tf files written in HCL (HashiCorp Configuration Language), a human-readable syntax that also has a JSON-compatible variant, so it is not hard to pick up as you go. If you are learning it on the fly, I recommend brushing up on the terminology, products, and capabilities first. Once you know those, you know how to search for your answers.

Without further ado … let’s get to playing. There are a total of 3 files you are going to need to create if you aren’t cloning my repository.

  • variables.tf
  • provider.tf
  • main.tf

Truthfully, you can put all of this in a single file or use any names you would like. The only requirement is that the files end with .tf and all reside in the same directory; Terraform loads every .tf file in a directory and merges them into a single configuration in memory. How you name them or how you separate things is completely up to you. The key is consistency across the board. Consistency thwarts confusion and could save future you a ton of time figuring out what you did or how you did it.

The first file is variables.tf, shown below. It contains all of the variables used by the plan, and they can be referenced anywhere in the plan. Terraform supports several different variable types for input and output. See here for more information.

variables.tf

variable "pm_user" {
  description = "The username for the proxmox user"
  type        = string
  sensitive   = false
  default     = "YOUR-PM-USER"
}

variable "pm_password" {
  description = "The password for the proxmox user"
  type        = string
  sensitive   = true
  default     = "YOUR-PM-PASSWORD"
}

variable "pm_tls_insecure" {
  description = "Set to true to ignore certificate errors"
  type        = bool
  default     = true
}

variable "pm_host" {
  description = "The hostname or IP of the proxmox server"
  type        = string
  default     = "YOUR-PM-HOST"
}

variable "pm_node_name" {
  description = "name of the proxmox node to create the VMs on"
  type        = string
  default     = "YOUR-PM-NODENAME"
}

variable "pvt_key" {
  description = "private key file"
  type        = string
  default     = "none"
}

variable "num_masters" {
  description = "Enter the number of Master VMs you want"
  default = X
}

variable "num_masters_mem" {
  description = "Enter the value for the amount of RAM for your masters. ie. 4096"
  default = "XXXX"
}

variable "master_disk_size" {
  description = "Enter the size of your Master node disks ie. 80G"
  type        = string
  default     = "YOUR-MASTER-DISK-SIZE"
}

variable "master_disk_type" {
  description = "What interface type are you using? ie. scsi"
  type        = string
  default     = "YOUR-MASTER-DISK-TYPE"
}

variable "master_disk_location" {
  description = "Where do you want to store the disk on your host? ie. zfs-mirror, local, local-lvm, etc."
  type        = string
  default     = "YOUR-MASTER-DISK-LOCATION"
}

variable "num_nodes" {
  description = "Enter the number of VMs you want for worker nodes."
  default = X
}

variable "num_nodes_mem" {
  description = "Enter the value for the amount of RAM for your worker nodes. ie. 2048"
  default = "xxxx"
}

variable "node_disk_size" {
  description = "Enter the size of your worker node disks ie. 20G"
  type        = string
  default     = "YOUR-NODE-DISK-SIZE"
}

variable "node_disk_type" {
  description = "What interface type are you using? ie. scsi"
  type        = string
  default     = "YOUR-NODE-DISK-TYPE"
}

variable "node_disk_location" {
  description = "Where do you want to store the disk on your host? ie. zfs-mirror, local, local-lvm, etc."
  type        = string
  default     = "YOUR-NODE-DISK-LOCATION"
}

variable "template_vm_name" {
  description = "Name of the template VM"
  type        = string
  default     = "YOUR-PM-VM-TEMPLATE-NAME"
}

variable "master_ips" {
  description = "List of ip addresses for master nodes"
  type        = list(string)
  default     = [
    "a.a.a.a",
    "b.b.b.b",
    "c.c.c.c",
  ]
}

variable "worker_ips" {
  description = "List of ip addresses for worker nodes"
  type        = list(string)
  default     = [  
    "m.m.m.m",
    "n.n.n.n",
    "o.o.o.o",
    "p.p.p.p",
    "q.q.q.q",
  ]
}

variable "networkrange" {
  description = "Enter as 8,16,22,24,etc. hint: 10.0.0.0/8"
  default = YOUR-NET-NUM
}

variable "gateway" {
  description = "Enter your network gateway."
  default = "YOUR-GATEWAY"
}

I have made sure that each variable description makes sense for what you need to input. For the IP variables, be sure to replace the placeholders with real IPs that are available in your environment. The number of IP entries in each list is tied directly to the number of master and worker nodes you define: each node pulls its address from the list by index, so every list needs at least as many entries as the corresponding count.

When you set the Proxmox user and password variables you will want to either create a dedicated user in your cluster, use your root account, or use API token access. Any of these methods is completely acceptable. My lab is not exposed to the outside world, so for this example I use username/password authentication. I will cover authentication methods in another post at a later date, as that is beyond the scope of this one.

The next file we need to populate is provider.tf. This file is where I keep all of the configuration for the Terraform provider.

provider.tf

terraform {
  required_providers {
    proxmox = {
      source  = "telmate/proxmox"
      version = ">=2.8.0"
    }
  }
}

provider "proxmox" {
  pm_api_url      = "https://${var.pm_host}:8006/api2/json"
  pm_user         = var.pm_user
  pm_password     = var.pm_password
  pm_tls_insecure = var.pm_tls_insecure
  pm_parallel     = 10
  pm_timeout      = 600
  #  pm_debug = true
  pm_log_enable = true
  pm_log_file   = "terraform-plugin-proxmox.log"
  pm_log_levels = {
    _default    = "debug"
    _capturelog = ""
  }
}
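
If you would rather not keep credentials in variables.tf at all, Terraform can read any input variable from an environment variable named TF_VAR_<name>, and those values take precedence over the defaults. A minimal sketch, with placeholder values that you would replace with your own:

export TF_VAR_pm_user="YOUR-PM-USER"
export TF_VAR_pm_password="YOUR-PM-PASSWORD"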

Our final file is main.tf, and this is where all of the magic happens.

We are creating 2 different resources in our deployment: master nodes and worker nodes, each with its own unique resource needs. Because we set a count on each resource, Terraform knows how many instances of each type to create and cycles through them for us.

main.tf

resource "proxmox_vm_qemu" "proxmox_vm_master" {
  count       = var.num_masters
  name        = "master-${count.index}"
  target_node = var.pm_node_name
  clone       = var.template_vm_name
  os_type     = "cloud-init"
  agent       = 1
  memory      = var.num_masters_mem
  cores       = 4
  disk {
    slot = 0
    size = var.master_disk_size
    type = var.master_disk_type
    storage = var.master_disk_location
    iothread = 1
  }
  ipconfig0 = "ip=${var.master_ips[count.index]}/${var.networkrange},gw=${var.gateway}"

  lifecycle {
    ignore_changes = [
      ciuser,
      sshkeys,
      network
    ]
  }

}

resource "proxmox_vm_qemu" "proxmox_vm_workers" {
  count       = var.num_nodes
  name        = "worker-${count.index}"
  target_node = var.pm_node_name
  clone       = var.template_vm_name
  os_type     = "cloud-init"
  agent       = 1
  memory      = var.num_nodes_mem
  cores       = 2
  disk {
    slot = 0
    size = var.node_disk_size
    type = var.node_disk_type
    storage = var.node_disk_location
    iothread = 1
  }
  ipconfig0 = "ip=${var.worker_ips[count.index]}/${var.networkrange},gw=${var.gateway}"

  lifecycle {
    ignore_changes = [
      ciuser,
      sshkeys,
      network
    ]
  }

}
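
To make the interpolation concrete: with num_masters set to 3, the first resource produces VMs named master-0, master-1, and master-2, and each one pulls its address from master_ips by index. Assuming, purely as an example, that master_ips[0] is 10.0.0.11, networkrange is 24, and gateway is 10.0.0.1 (these are not values from my environment), the first master's cloud-init network line renders as:

ipconfig0 = "ip=10.0.0.11/24,gw=10.0.0.1"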

Now that you have your files, it is time to set some values for our variables. If you are following along with the Kubernetes project, then you will want to define the following in addition to the other variables (there is an example sketch after this list):

  • master nodes: 3
  • worker nodes: 5
  • master node mem: 4096
  • worker node mem: 2048
  • master node disk size: 80G
  • worker node disk size: 20G
  • master/node disk type: set to the same as your template
  • master/node disk location: enter your preferred storage location in your cluster
  • vm template name: set to the name of the template you want to use
  • set IPs for master and worker nodes
  • set network and gateway

Once you are sure that you have everything set properly you are ready to advance to the next step.

To use this plan you must first initialize Terraform with:

terraform init

If you get no errors then you are ready to move to the next step:

terraform plan

Again, if you get no errors you are ready to proceed to the final step. If you do see errors then you will need to verify that you have set all variables appropriately and that your file naming convention follows the standard name.tf format.
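
If you do hit errors, Terraform's built-in validator is a quick way to pinpoint syntax mistakes in your .tf files (an extra check, not part of the original walkthrough); run it from the same directory:

terraform validate

Once the plan comes back clean, run the final step: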

terraform apply

A ton of text will scroll by for you to inspect one more time before you actually apply. If everything looks good, then type yes and hit enter.

Once the apply process completes you should see the following output. Terraform has the ability, which I will demonstrate in another post, to write information about the resources it has created to a file using templates. That will let you build your inventory file for Ansible playbooks.
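
As a small preview, an output is just another block you can add to any of your .tf files. A minimal sketch that would print the worker VM names after an apply (the output name here is my own choice, not something the plan above requires):

output "worker_names" {
  value = proxmox_vm_qemu.proxmox_vm_workers[*].name
}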

With everything going well you should now see this (or similar) in your node.


Now let’s move on to a few subtopics. The first: what happens when an apply fails? Depending on your environment you might hit a few bottlenecks where messages are delayed. When that happens, Terraform sees that the VM exists but has no record of the build completing.

To correct this you can simply re-run terraform apply. Terraform knows which VMs in the sequence did not build correctly, destroys the failed resources, and redeploys them. After applying again you should see success.
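
If you want to see which resources Terraform believes it has already built before re-applying, the state listing command is handy (a general Terraform command, not something this recovery step requires):

terraform state list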


There are times when the VMs will be in such an unstable state that the best thing to do is wipe everything out. So, if you would like to completely destroy the entire plan, that is yet another simple one-liner.

terraform destroy

Answer yes to the prompt and within a few seconds all is gone.
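
If you are scripting this and want to skip the interactive prompt, both apply and destroy accept an auto-approve flag; use it with care, since there is no second chance:

terraform destroy -auto-approve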


Now let’s cover the last 2 items for this post: adding and removing master and worker nodes, a.k.a. scaling. The command for this is simply terraform apply. In these images you will see that I have modified my variables file to scale up to 4 master nodes and 7 worker nodes. That means that I have set the counts to the new values and added the necessary IPs to the appropriate list variables.
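
Whether you change the defaults in variables.tf or use a terraform.tfvars file, the edit looks something like this, with the new letters standing in for the additional addresses you would assign:

num_masters = 4
num_nodes   = 7
master_ips  = ["a.a.a.a", "b.b.b.b", "c.c.c.c", "d.d.d.d"]
worker_ips  = ["m.m.m.m", "n.n.n.n", "o.o.o.o", "p.p.p.p", "q.q.q.q", "r.r.r.r", "s.s.s.s"]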

terraform plan
terraform apply
Your node should now look like this.

So now, let’s suppose that our load has decreased and we want to scale back our environment to conserve resources. In a cloud environment that equates to dollars saved. We are now going to scale back to 2 master nodes and 3 worker nodes.
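
The edit is the same in reverse. When a count shrinks, Terraform destroys the highest-indexed instances, so here the two highest-numbered masters and the four highest-numbered workers are the ones removed; trimming the IP lists to match keeps things tidy:

num_masters = 2
num_nodes   = 3
master_ips  = ["a.a.a.a", "b.b.b.b"]
worker_ips  = ["m.m.m.m", "n.n.n.n", "o.o.o.o"]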

terraform plan
terraform apply
And now your node should look like this.

I have loosely timed the entire process: from the initial terraform init through scaling up and scaling back down takes roughly 15 minutes in total. Most of that time is me entering each command and modifying the variables as needed. The other beauty of this, and a topic for a future post, is that all of it can be automated with a CI/CD tool such as Jenkins.

I hope that you have found this post to be helpful in your journey. Thank you for reading.
