Next advanced topic: GitLab and Terraform

I’ll go over this topic this weekend, as it will also help me refresh my own memory of it.

In earlier topics, we had to create the AWS EC2 cloud instance by hand. Terraform, a tool from HashiCorp, lets us define infrastructure as code and provision that instance automatically. The code is written in HCL (HashiCorp Configuration Language) and lives in files with the .tf extension.

example1: main.tf

resource "aws_instance" "example" {
  ami           = "ami-123456"
  instance_type = "t2.micro"
}

When we run terraform apply, Terraform loads the *.tf files in the working directory, reads the resources they declare, and creates that infrastructure automatically.

example2: .gitlab-ci.yml

stages:
  - validate
  - plan
  - apply

variables:
  TF_ROOT: "./" # location of terraform files
  TF_STATE_NAME: "default"
  TF_IN_AUTOMATION: "true"

before_script:
  - cd $TF_ROOT
  - terraform init -input=false

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -out=tfplan
  artifacts:
    paths:
      - $TF_ROOT/tfplan

apply:
  stage: apply
  script:
    - terraform apply -auto-approve tfplan
  when: manual # requires manual approval
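
The pipeline above assumes the terraform binary is already installed on whichever runner picks up the jobs. If we used a Docker-based runner instead, a minimal sketch could look like the fragment below. The choice of the public hashicorp/terraform image and its tag are assumptions for illustration, not something from the original setup. Terraform's AWS provider can read credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, so those would typically be stored as masked CI/CD variables in the project settings rather than written into the file.

```yaml
# Sketch only: assumes a runner with the Docker executor; the tag is illustrative.
default:
  image:
    name: hashicorp/terraform:latest
    # The image's entrypoint is the terraform binary itself; clear it so
    # GitLab can run the job's script lines in a shell.
    entrypoint: [""]
```

With this fragment added, the validate, plan, and apply jobs would stay unchanged and simply run inside that container.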

In short, Terraform lets us define infrastructure as code, as seen above, so we can provision cloud infrastructure (AWS, GCP, Azure, etc.) through GitLab pipelines instead of by hand. That is very useful, since manual processes are typically more error-prone than automated ones.

For example, we could have one GitLab project that automates the infrastructure on cloud providers such as AWS, Google Cloud Platform (GCP), or Azure, and another GitLab project for the actual software development that runs on that infrastructure.

With the infrastructure automated this way, we can provision it quickly through HCL code. It is an advanced topic.

GitLab CI/CD (.gitlab-ci.yml) Cheatsheet (helpful for junior DevOps)

First of all: What is .gitlab-ci.yml?


The .gitlab-ci.yml file is the configuration file that defines how our GitLab CI/CD (Continuous Integration and Continuous Deployment) pipeline runs.

It lives at the root of our GitLab repository and tells GitLab what jobs to run, in what order, under what conditions, and in what environments.

Purpose


When we push code to GitLab, the .gitlab-ci.yml file triggers pipelines that can automatically:

  • Build our application
  • Run tests
  • Deploy to staging or production
  • Analyze or package code
  • Notify or perform other automated tasks

How It Works


  1. We commit a .gitlab-ci.yml file to our repository.
  2. GitLab reads it and creates a pipeline — a sequence of stages.
  3. Each stage runs jobs, which execute the commands we define (like npm test, docker build, etc.).
  4. Runners (machines provided by GitLab or our own servers) execute those jobs. I covered setting up a self-managed runner here:
Read more: Setup self-managed GitLab runner on AWS ec2 cloud

Example:

stages:
  - build
  - test
  - deploy

build_app:
  stage: build
  script:
    - echo "Building the app..."
    - npm install && npm run build

test_app:
  stage: test
  script:
    - echo "Running tests..."
    - npm test

deploy_prod:
  stage: deploy
  script:
    - echo "Deploying to production..."
    - ./deploy.sh
  only:
    - main

Here is what the .gitlab-ci.yml above tells the GitLab pipeline to do:

  • First, build_app runs (stage: build)
  • Then test_app runs (stage: test)
  • Finally, deploy_prod runs (stage: deploy) — but only when we push to the main branch

Here is a cheatsheet of the keywords the pipeline recognizes and what each one is for.

Anatomy & Basics

| Section | Purpose / Notes |
|---|---|
| stages: | Declare ordered stages (e.g. build, test, deploy). Jobs run by stage order. |
| variables: | Global variables for jobs (unless overridden). |
| default: | Default settings (e.g. image, before_script, cache) applied across jobs. |
| include: | Pull in external YAML files (local, remote, template, project). |
| workflow: | Define rules for when a pipeline should run (e.g. on merges only). |
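
To make the less familiar keywords in this table concrete, here is a small sketch that combines include:, workflow:, and default:. The included file path is hypothetical, and the workflow rules are just one possible policy (pipelines for merge requests and for the main branch only).

```yaml
# Pull in another YAML file from this repository (hypothetical path).
include:
  - local: /ci/shared-jobs.yml

# Only create pipelines for merge requests and for the main branch.
workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"

# Settings every job inherits unless it overrides them.
default:
  image: node:18
  cache:
    paths:
      - node_modules/
```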

Job Configuration Keywords

Each job is a top-level key (except reserved ones). In a job we can use:

| Keyword | What it does / notes |
|---|---|
| stage: | Assigns the job to one of the declared stages. |
| script: | Required. Commands to run; can be an array or a scalar. |
| image: | Docker image to run the job in (if using a Docker runner). |
| services: | Additional Docker services (e.g. a database) for that job. |
| before_script: / after_script: | Commands to run before / after script (per-job or inherited). |
| artifacts: | Define files to keep from the job (e.g. build outputs). Has subkeys like paths, expire_in, when. |
| cache: | Define files/directories to cache between runs (e.g. dependencies). |
| dependencies: | Jobs whose artifacts this job needs. |
| needs: | More advanced: lets jobs run out of stage order by declaring dependencies. |
| rules: / only: / except: | Conditional logic to include/exclude job runs (by branch, tags, variables, etc.). Use rules: for more flexibility. |
| when: | Defines job triggering: on_success, on_failure, always, manual, delayed. |
| allow_failure: | Whether the job may fail without failing the pipeline. |
| timeout: | Maximum time the job can run. |
| environment: | Set the deployment environment (name, URL, etc.). |
| tags: | Tags used to select appropriate runner(s). |
| retry: | Retry configuration on failure (how many times, conditions). |
| parallel: | Run multiple instances of the same job in parallel. |
| resource_group: | Limit concurrency for jobs in the same resource group. |
| extends: | Inherit from another job or template (for reuse). |
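
A few of these keywords are easier to understand side by side. The sketch below is an assumption-laden illustration rather than a recommended setup: the job names, the lint npm script, and the chosen values for timeout, retry, and expire_in are all made up for the example.

```yaml
stages:
  - build
  - test

build_job:
  stage: build
  image: node:18
  script:
    - npm ci
    - npm run build
  cache:
    key: $CI_COMMIT_REF_SLUG      # one cache per branch
    paths:
      - node_modules/
  artifacts:
    paths:
      - dist/
    expire_in: 1 week
  timeout: 10 minutes
  retry: 2                        # re-run up to two more times on failure

lint_job:
  stage: test
  image: node:18
  script:
    - npm ci
    - npm run lint                # hypothetical npm script
  allow_failure: true             # a lint failure does not fail the pipeline
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

test_job:
  stage: test
  image: node:18
  needs:
    - build_job                   # start as soon as build_job finishes; also fetches its artifacts
  script:
    - npm ci
    - npm test
```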

This cheatsheet is especially helpful for engineers who are just starting out in DevOps.

Example:

stages:
  - build
  - test
  - deploy

variables:
  APP_ENV: "production"

default:
  image: node:18
  before_script:
    - npm ci

build_job:
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/

test_job:
  stage: test
  script:
    - npm test
  dependencies:
    - build_job

deploy_job:
  stage: deploy
  script:
    - ./deploy.sh
  environment:
    name: production
    url: https://myapp.example.com
  when: manual
  only:
    - main
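
Since the cheatsheet suggests rules: for more flexibility, and GitLab's documentation notes that only: and except: are no longer actively developed, here is a sketch of how the deploy job above might be written with rules: instead. One behavioural difference to be aware of: a manual job defined through rules: blocks the pipeline by default, so allow_failure: true is added here on the assumption that we want to keep the non-blocking behaviour of the original.

```yaml
deploy_job:
  stage: deploy
  script:
    - ./deploy.sh
  environment:
    name: production
    url: https://myapp.example.com
  rules:
    # Offer the job as a manual action, and only on the main branch.
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
      allow_failure: true   # keep the pipeline from waiting on this job
```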

Setup self-managed GitLab runner on AWS ec2 cloud

To refresh our memory: What is GitLab runner?

A runner is the GitLab component that actually executes the jobs of our CI/CD (Continuous Integration / Continuous Delivery) pipeline.

We define the pipeline's stages and jobs (builds, tests, deploys) in the .gitlab-ci.yml file, and the runner is what runs those scripts on some compute environment, e.g. our AWS EC2 cloud instance running Amazon Linux.

An example .gitlab-ci.yml looks like this:

stages:
  - build
  - test
  - deploy

build-job:
  stage: build
  script:
    - echo "Hello, $GITLAB_USER_LOGIN!"

test-job1:
  stage: test
  script:
    - echo "This job tests something"

test-job2:
  stage: test
  script:
    - echo "This job tests something, but takes more time than test-job1."
    - echo "After the echo commands complete, it runs the sleep command for 20 seconds"
    - echo "which simulates a test that runs 20 seconds longer than test-job1"
    - sleep 20

deploy-prod:
  stage: deploy
  script:
    - echo "This job deploys something from the $CI_COMMIT_BRANCH branch."
  environment: production

The .yml above is a set of instructions that tells GitLab what the runner should execute for our CI/CD pipeline. Here we declare three stages: first build, then test, and finally deploy.

We then link each job to a stage with stage: <stage_name>. The general shape is:

stages:
  - <stage_name>

<job_name>:
  stage: <stage_name>
  script:
    - <script_statements>

In the example .gitlab-ci.yml above we have one build job, two test jobs (one short, one longer), and, last but not least, one deploy job.

How it works

When we use GitLab SaaS (e.g., our project is hosted at gitlab.com):

  1. We define CI/CD jobs in .gitlab-ci.yml.
  2. GitLab’s shared runners (SaaS runners) automatically pick up our jobs.
  3. They execute them in isolated environments (typically Docker containers).
  4. Results (logs, artifacts, statuses) are sent back to our GitLab project.

But SaaS runners are just one of the runner types that GitLab supports.

GitLab Runner SaaS refers to GitLab’s managed runners — runners hosted and maintained by GitLab itself.

Types of Runners

| Type | Description | Managed By |
|---|---|---|
| SaaS / Shared Runner | Preconfigured runners available for all projects on GitLab.com | GitLab |
| Group or Project Runner | Registered to one group or project and available only to it; in this post the project runner runs on our own EC2 instance | Whoever hosts it (here: us) |
| Self-Managed Runner | Installed and maintained by us (e.g. on our own VM or cluster) | Us |

So today we explore how to set up and manage our own runner, using an AWS EC2 cloud instance running Amazon Linux for the demonstration.

Please have a look at the link below, where I briefly covered how to get an AWS EC2 instance running and SSH into it remotely using a private key:

Connecting local git repo with remote GitHub repo in AWS Linux instance

First we add the GitLab Runner repository and then install the package with yum, by running the commands below in the terminal:

# Add the official GitLab Runner repository
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.rpm.sh | sudo bash

# Install the runner
sudo yum install gitlab-runner -y

After the piped command finishes, its output should show that the GitLab Runner repository has been added to our Amazon Linux package repositories.

Once the yum install completes, GitLab Runner is installed on our AWS EC2 Linux instance.

Next, we need to set up our GitLab project and point it at the runner on our AWS EC2 Linux instance.

Go to our project’s repository, then navigate to Settings → CI/CD → Runners, and disable the instance (SaaS) runners by toggling them off.

Next, go to Project Runners → Create project runner

For simplicity, we tick the untagged option and click Create runner.
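
Had we given the runner a tag instead of letting it pick up untagged jobs, each job would need a matching tags: entry to be routed to it. A minimal sketch, assuming a tag named aws-linux (a hypothetical name, not one used in this setup):

```yaml
build-job:
  stage: build
  tags:
    - aws-linux   # hypothetical tag; must match a tag set on the runner
  script:
    - echo "Running on the tagged self-managed runner"
```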

Choose Linux, since our EC2 instance runs Amazon Linux.

Then copy the command that GitLab displays and paste it into the terminal of our AWS EC2 instance (Amazon Linux) to register the runner there, which links GitLab to our EC2 runner instance.

Hit Enter to accept the default URL, https://gitlab.com,

and enter “linuxrunner” as the name of our self-managed runner.

Next, type shell for the executor and hit Enter. Since this is a simple demonstration of a self-managed runner, we chose shell, so the script lines of our .gitlab-ci.yml run in bash on the instance. Other executor types, such as docker or virtualbox, can be chosen here by typing their name instead.

This looks good:

  • we used https://gitlab.com as the instance URL,
  • named the runner “linuxrunner”,
  • and chose shell as the executor.
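
With the shell executor, job scripts run directly on the EC2 host rather than in a container. One quick way to confirm that a job really landed on our instance is a small diagnostic job like the sketch below; the job name is made up for illustration.

```yaml
check-runner:
  stage: test
  script:
    - whoami      # the user the runner executes jobs as
    - hostname    # should be our EC2 instance, not a GitLab-hosted machine
    - uname -a    # kernel and distribution details of the host
```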

In our GitLab project's CI/CD settings we should now see the new runner.

Click View runners and we should see that our project runner is registered and online (a green indicator means it is online).

Note that our simple .gitlab-ci.yml is the one shown earlier: 3 stages (build, test, deploy), 1 build job, 2 test jobs, and 1 deploy job.

Any commit to the main branch of our project now triggers the pipeline.

All the jobs completed and passed successfully.

We can check the job logs of our CI/CD pipeline to confirm that our scripts were executed successfully.

This confirms that we have pointed our project's CI/CD pipeline at our self-managed runner on AWS EC2 Amazon Linux.

In addition, how do we check whether our runner is running?

Our runner is actually a daemon. Think of a daemon as the Linux counterpart of a Windows service, which we would check with services.msc. So how do we check the status of a daemon on Linux?

A Linux daemon service is a background process that runs continuously on a Linux system, typically without direct user interaction. Daemons are often used for system or network services — like web servers, database servers, or schedulers — and are managed using the systemd service manager in most modern Linux distributions.

On CentOS-family distributions such as Amazon Linux, we can check it using

 sudo systemctl status <daemon_service_name>
e.g:
sudo systemctl status gitlab-runner

To make the gitlab-runner daemon start automatically at boot, we can run the command below:

sudo systemctl enable gitlab-runner

To start/stop it:

sudo systemctl start gitlab-runner
sudo systemctl stop gitlab-runner