This project is no longer actively developed or maintained. We recommend switching to the new project lazy-gitlab-runner-k8s-tf, which includes improvements and new features.
terraform-kubernetes-gitlab-runner originated from DeimosCloud/terraform-kubernetes-gitlab-runner, but the forked repository has diverged significantly from its parent, so I've decided to detach it.
- add static typing for the azure and s3 cache variables
Set up GitLab Runner on a cluster using Terraform. The runner is installed via the GitLab Runner Helm chart.
Ensure the Kubernetes provider and Helm provider settings are correct: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/guides/getting-started#provider-setup
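For reference, a minimal provider setup sketch, assuming kubeconfig-based authentication (the config_path shown is an assumption; use whatever credential mechanism your cluster actually requires):

# a minimal sketch, assuming kubeconfig-based authentication
provider "kubernetes" {
  config_path = "~/.kube/config"
}

provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

With the providers configured, the module can be used as follows: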
module "gitlab_runner" {
source = "DeimosCloud/gitlab-runner/kubernetes"
release_name = "${var.project_name}-runner-${var.environment}"
runner_tags = var.runner_tags
runner_registration_token = var.runner_registration_token
runner_image = var.runner_image
namespace = var.gitlab_runner_namespace
# change runner's default image registry settings
image = {
registry = "nexus.my.domain"
repository = "gitlab/gitlab-runner"
tag = "alpine3.18"
}
// set default shell
shell = "bash"
// increase log limit for verbose jobs
output_limit = 26000
rbac {
create = true
# Pass annotations to service account, e.g. in this case GCP Workload Identity needs it
service_account_annotations = {
"iam.gke.io/gcp-service-account" = module.workload_identity["gitlab-runner"].gcp_service_account_email
}
rules = [
{
resources = ["configmaps", "pods", "pods/attach", "secrets", "services"]
verbs : ["get", "list", "watch", "create", "patch", "update", "delete"]
},
{
api_groups = [""]
resources = ["pods/exec"]
verbs = ["create", "patch", "delete"]
}
]
}
// job pods will be scheduled with these resource requests/limits
job_resources = {
cpu_request = "100m"
cpu_request_overwrite_max_allowed = "2000m"
memory_request = "1Gi"
memory_request_overwrite_max_allowed = "2Gi"
cpu_limit_overwrite_max_allowed = "2000m"
memory_limit_overwrite_max_allowed = "2Gi"
helper_cpu_request = "200m"
helper_memory_request = "256Mi"
service_cpu_request = "1000m"
service_memory_request = "1Gi"
service_cpu_limit = "3000m"
service_memory_limit = "2Gi"
}
# runner resources requests/limits
resources = {
requests = {
memory = "128Mi"
cpu = "100m"
}
}
# enable prometheus metrics
metrics = {
enabled = true
}
# add this labels to every job's pod
job_pod_labels = {
jobId = "$CI_JOB_ID"
pipelineId = "$CI_PIPELINE_ID"
gitUserLogin = "$GITLAB_USER_LOGIN"
project = "$CI_PROJECT_NAME"
}
# cache settings
cache = {
type = "gcs"
path = "cache"
shared = true
secret_name = kubernetes_secret.gcscred.metadata[0].name
gcs = {
bucket_name = module.gcs.bucket_name
}
}
# docker-in-docker cert settings
build_job_empty_dirs = {
"docker-certs" = {
mount_path = "/certs/client"
medium = "Memory"
}
}
# simple runner and job environment variables setting, e.g. HTTPS_PROXY
envs = [
{
name = "HTTPS_PROXY"
value = "http://proxy.net.int:3128"
job = true #job container sees that variable
runner = true #runner also sees that var
},
{
name = "FOO"
value = "bar"
job = true
runner = false #only job needs this env variable
}
]
}
To pass in custom values, use the var.values input, which specifies a map of values in Terraform map format, or var.values_file, which specifies a path to a valid YAML values file to pass to the chart.
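For example, a minimal sketch (the fullnameOverride key and the file path are illustrative assumptions, not values required by the module):

module "gitlab_runner" {
  source = "DeimosCloud/gitlab-runner/kubernetes"

  # ...required inputs as in the example above...

  # extra chart values supplied as a Terraform map (illustrative key)
  values = {
    fullnameOverride = "ci-runner"
  }

  # or point at a YAML file of chart values instead
  values_file = "${path.module}/files/runner-values.yaml"
}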
There are a few capabilities enabled by using host-mounted directories; let's look at a few examples of config and what they effectively configure in the config.toml.
The most common hostmount is simply sharing the docker socket to allow the build container to start new containers while avoiding a Docker-in-Docker config. This is useful for taking an unmodified docker build ... command from your current build process and copying it into a .gitlab-ci.yml or a GitHub Action. To map your docker socket from the docker host to the container, you need to (as above):
module "gitlab_runner" {
...
build_job_mount_docker_socket = true
...
}
This causes the config.toml to create two sections:
[runners.kubernetes.pod_security_context]
...
fs_group = ${var.docker_fs_group}
...
This first section defines a Kubernetes pod security context that causes the mounted filesystem's group to be overwritten to this GID (see https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod). Combined with a runAsGroup in Kubernetes, this ensures that the files in the container are writable by the process running as the defined GID.
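For example, a minimal sketch combining the socket mount with the GID override (412 is also the variable's default; match it to the docker group GID on your nodes):

module "gitlab_runner" {
  # ...
  build_job_mount_docker_socket = true

  # fsGroup applied to job pods so the mounted socket is group-accessible
  docker_fs_group = 412
  # ...
}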
Additionally, the config.toml gains a host_path config:
[runners.kubernetes.volumes]
...
[[runners.kubernetes.volumes.host_path]]
name = "docker-socket"
mount_path = "/var/run/docker.sock"
read_only = true
host_path = "/var/run/docker.sock"
...
This causes the /var/run/docker.sock "volume" (really a Unix-domain socket, at the default path used to communicate with the docker engine) to be mounted in the same location inside the container. The mount is marked read_only because the filesystem cannot have filesystem objects added to it (you're not going to add files or directories to the socket), but the docker command can still write to the socket to send commands.
The standard way of collecting metrics from a GitLab Runner is to enable the Prometheus endpoint and subscribe it for scraping; but what if you want to send events? statsd has a timer (the Datadog implementation does not), but you can expose the UDS from statsd or dogstatsd to allow simple netcat-based submission of build events and timings if desired.
For example, the standard UDS for Dogstatsd, the Datadog mostly-drop-in replacement for statsd, is at /var/run/datadog/dsd.socket. I chose to share that entire directory into the build container as follows:
module "gitlab_runner" {
...
build_job_hostmounts = {
dogstatsd = { host_path = "/var/run/datadog" }
}
...
}
You may notice that I haven't set a container_path to define the mount_path in the container at which the host's volume should be mounted. If it's not defined, container_path defaults to the host_path.
This causes the config.toml to create a host_path section:
[runners.kubernetes.volumes]
...
[[runners.kubernetes.volumes.host_path]]
name = "dogstatsd"
mount_path = "/var/run/datadog"
read_only = false
host_path = "/var/run/datadog"
...
This allows the basic submission of custom metrics to be done with, say, netcat as per the Datadog instructions (https://docs.datadoghq.com/developers/dogstatsd/unix_socket/?tab=host#test-with-netcat):
echo -n "custom.metric.name:1|c" | nc -U -u -w1 /var/run/datadog/dsd.socket
If I wanted a non-standard path inside the container (so that, say, some rogue process doesn't automatically log to a socket if it's present in the default location), we can remap the UDS as in the contrived example that follows.
As noted above, a contrived example of mounting in a different path might be some corporate service/daemon in a container that automatically tries to submit metrics if it sees the socket in the filesystem. Lacking the source or permission to change that automation, but wanting to use the UDS ourselves to sink metrics, we can mount it at a different nonstandard location.
This isn't the best example, but there are stranger things in corporate software than I can dream up.
In order to make this UDS appear at a different location, you could do the following. Note that you might want to refer to the actual socket rather than the containing directory, as is done with docker.sock above; that likely makes more sense, but to keep parallelism with the dogstatsd example that I (chickenandpork) am using daily, let's map the containing directory: the host's /var/run/datadog/ to the container's /var/run/metrics/ path:
module "gitlab_runner" {
...
build_job_hostmounts = {
dogstatsd = {
host_path = "/var/run/datadog"
container_path = "/var/run/metrics"
}
}
...
}
This causes the config.toml to create a host_path section:
[runners.kubernetes.volumes]
...
[[runners.kubernetes.volumes.host_path]]
name = "dogstatsd"
mount_path = "/var/run/metrics"
read_only = false
host_path = "/var/run/datadog"
...
The result is that the Unix-domain socket is available at a non-standard location that our custom tools can use, but anything looking in the conventional default location won't see anything.
Although you'd likely use a proper binary to sink metrics in production, you can manually log metrics to test inside the container (or in your build script) using:
echo -n "custom.metric.name:1|c" | nc -U -u -w1 /var/run/metrics/dsd.socket
In production, you'd likely also make this read_only, use filesystem permissions to guard access, and ideally simply configure or improve the errant software, but there are strange side-effects and constraints in long-lived corporate software.
If you're using the TLS docker connection to do docker builds in your CI and you don't set an empty TLS_CERTS directory, recent docker engines default to creating certificates and requiring TLS. In order to have these certificates available to your build container's docker command, you may need to share that certificate directory back into the build container.
This can be done with:
module "gitlab_runner" {
#...
build_job_empty_dirs = {
"docker-certs" = {
mount_path = "/certs/client"
medium = "Memory"
}
}
#...
}
This causes the config.toml to create an empty_dir section:
[runners.kubernetes.volumes]
...
[[runners.kubernetes.volumes.empty_dir]]
name = "docker-certs
mount_path = "/certs/client
medium = "Memory"
...
In your build, you may need to define the environment variables:

DOCKER_TLS_CERTDIR: /certs
DOCKER_TLS_VERIFY: 1
DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"

The docker CLI should use the TLS tcp/2376 port if it sees a DOCKER_TLS_CERTDIR, but if not, the --host argument or DOCKER_HOST=tcp://hostname:2376/ are some options to steer it to the correct port/protocol.
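If you'd rather set these centrally than in each project's CI configuration, one option (a sketch, not a requirement of the module) is the envs input shown in the usage example above, with job = true so only job containers receive them:

module "gitlab_runner" {
  # ...
  envs = [
    # DOCKER_CERT_PATH is spelled out literally rather than relying on
    # in-job expansion of $DOCKER_TLS_CERTDIR
    { name = "DOCKER_TLS_CERTDIR", value = "/certs", job = true, runner = false },
    { name = "DOCKER_TLS_VERIFY", value = "1", job = true, runner = false },
    { name = "DOCKER_CERT_PATH", value = "/certs/client", job = true, runner = false },
  ]
  # ...
}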
Report issues/questions/feature requests in the issues section.
Full contributing guidelines are covered here.
Name | Version |
---|---|
terraform | >= 1.3.3 |
helm | >= 2.10 |
kubernetes | >= 2.22 |
Name | Version |
---|---|
helm | 2.10.1 |
Name | Type |
---|---|
helm_release.gitlab_runner | resource |
Name | Description | Type | Default | Required |
---|---|---|---|---|
affinity | Affinity for pod assignment. | object({...}) | {} | no |
atomic | Whether to deploy the entire module as a unit. | bool | true | no |
build_job_default_container_image | Default container image to use for builds when none is specified. | string | "ubuntu:20.04" | no |
build_job_empty_dirs | A map of name:{mount_path, medium} for which each named value will result in a named empty_dir mounted at mount_path with the given medium. | map(object({...})) | {} | no |
build_job_hostmounts | A map of name:{host_path, container_path, read_only} for which each named value will result in a hostmount of the host path to the container at container_path. If not given, container_path falls back to host_path: dogstatsd = { host_path = '/var/run/dogstatsd' } will mount the host /var/run/dogstatsd to the same path in the container. | map(map(any)) | {} | no |
build_job_mount_docker_socket | Mount the host's docker socket into build job containers (see the hostmount examples above). | bool | false | no |
build_job_privileged | Run all containers with the privileged flag enabled. This will allow the docker:dind image to run if you need to run Docker. | bool | false | no |
build_job_run_container_as_user | SecurityContext: runAsUser for all running job pods. | string | null | no |
build_job_secret_volumes | Secret volume configuration instructs Kubernetes to use a secret defined in the Kubernetes cluster and mount it inside of the containers as described at https://docs.gitlab.com/runner/executors/kubernetes.html#secret-volumes | object({...}) | {...} | no |
cache | Describes the properties of the cache. type can be one of ['local', 'gcs', 's3', 'azure'], path defines a path to append to the bucket URL, shared specifies whether the cache can be shared between runners. You also specify the individual properties of the particular cache type you select. See https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnerscache-section | object({...}) | null | no |
chart_version | The version of the chart. | string | "0.40.1" | no |
check_interval | Defines in seconds how often to check GitLab for new builds. | number | 30 | no |
concurrent | Configure the maximum number of concurrent jobs. | number | 10 | no |
config_maps | Additional map merged with the default runner ConfigMap. | map(string) | null | no |
create_namespace | (Optional) Create the namespace if it does not yet exist. Defaults to true. | bool | true | no |
docker_fs_group | The fsGroup to use for docker. This is added to the security context when mount_docker_socket is enabled. | number | 412 | no |
envs | Environment variables to be set for either the runner, the job, or both. | list(object({...})) | [] | no |
gitlab_url | The GitLab server URL (with protocol) to register the runner against. | string | "https://gitlab.com/" | no |
health_check | Health check options for the runner to check its health. Supports only timeoutSeconds. | object({...}) | {} | no |
helper_job_container_image | Helper container image. | string | null | no |
host_aliases | List of hosts and IPs that will be injected into the pod's hosts file. | list(object({...})) | [] | no |
hpa | Horizontal Pod Autoscaling with API limited to metrics specification only (api/version: autoscaling/v2). | object({...}) | null | no |
image | The docker gitlab-runner image. | object({...}) | {} | no |
image_pull_policy | Specify the job images pull policy: Never, IfNotPresent, Always. | string | "IfNotPresent" | no |
image_pull_secrets | An array of secrets that are used to authenticate Docker image pulling. | list(string) | [] | no |
job_affinity | Specify affinity rules that determine which node runs the job. No HCL support for this variable. Use string interpolation if needed. | string | "" | no |
job_identity | Default service account job pods use to talk to the Kubernetes API. | object({...}) | {} | no |
job_pod_annotations | A map of annotations to be added to each build pod created by the runner. The value of these can include environment variables for expansion. Pod annotations can be overwritten in each build. | map(string) | {} | no |
job_pod_labels | A map of labels to be added to each build pod created by the runner. The value of these can include environment variables for expansion. | map(string) | {} | no |
job_pod_node_selectors | A map of node selectors to apply to the job pods. | map(string) | {} | no |
job_pod_node_tolerations | A map of node tolerations to apply to the job pods as defined at https://docs.gitlab.com/runner/executors/kubernetes.html#other-configtoml-settings | map(string) | {} | no |
job_resources | The CPU and memory requests/limits given to the build job, helper, and service containers. | object({...}) | {} | no |
local_cache_dir | Path on nodes for caching. | string | "/tmp/gitlab/cache" | no |
log_level | Configure GitLab Runner's logging level. Available values are: debug, info, warn, error, fatal, panic. | string | "info" | no |
metrics | Configure the integrated Prometheus metrics exporter. | object({...}) | {} | no |
namespace | The Kubernetes namespace to deploy the runner into. | string | "gitlab-runner" | no |
node_selector | A map of node selectors to apply to the pods. | map(string) | {} | no |
output_limit | Maximum build log size in kilobytes. Default is 4096 (4MB). | number | null | no |
pod_annotations | A map of annotations to be added to each build pod created by the Runner. The value of these can include environment variables for expansion. Pod annotations can be overwritten in each build. | map(string) | {} | no |
pod_labels | A map of labels to be added to each build pod created by the runner. The value of these can include environment variables for expansion. | map(string) | {} | no |
pod_security_context | Runner pod security context. | object({...}) | {} | no |
poll | Polling options for the runner to poll its job pods. | object({...}) | {} | no |
priority_class_name | Configure priorityClassName for the runner pod. If not set, the globalDefault priority class is used. | string | "" | no |
pull_policy | Specify the job images pull policy: never, if-not-present, always. | set(string) | [...] | no |
rbac | RBAC support. | object({...}) | {} | no |
release_name | The Helm release name. | string | "gitlab-runner" | no |
replicas | The number of runner pods to create. | number | 1 | no |
resources | The CPU and memory resources given to the runner. | object({...}) | null | no |
run_untagged_jobs | Specify if jobs without tags should be run. https://docs.gitlab.com/ce/ci/runners/#runner-is-allowed-to-run-untagged-jobs | bool | false | no |
runner_locked | Specify whether the runner should be locked to a specific project/group. | string | true | no |
runner_name | The runner's description. | string | n/a | yes |
runner_registration_token | Runner registration token. | string | n/a | yes |
runner_tags | Specify the tags associated with the runner. Comma-separated list of tags. | string | n/a | yes |
runner_token | Token of an already-registered runner. To use this, var.runner_registration_token must be set to null. | string | null | no |
secrets | Secrets to mount into the runner pods. | list(map(string)) | [] | no |
security_context | Runner container security context. | object({...}) | {} | no |
service | Configure a service resource, e.g. to allow scraping metrics via a prometheus-operator serviceMonitor. | object({...}) | {} | no |
service_account | The name of the k8s service account to create (since 17.x.x). | object({...}) | n/a | yes |
shell | Name of shell to generate the script. | string | null | no |
shutdown_timeout | Number of seconds until the forceful shutdown operation times out and exits the process. The default value is 30. If set to 0 or lower, the default value is used. | number | 0 | no |
termination_grace_period_seconds | When stopping the runner, give it time (in seconds) to wait for its jobs to terminate. | number | 3600 | no |
tolerations | A list of node tolerations to apply to the pods as defined at https://docs.gitlab.com/runner/executors/kubernetes.html#other-configtoml-settings | list(object({...})) | null | no |
topology_spread_constraints | TopologySpreadConstraints for pod assignment. | list(object({...})) | null | no |
unhealthy_interval | Duration that a runner worker is disabled for after it exceeds the unhealthy requests limit. Supports syntax like '3600s', '1h30min' etc. | string | "120s" | no |
unhealthy_requests_limit | The number of unhealthy responses to new job requests after which a runner worker will be disabled. | number | 30 | no |
unregister_runners | Whether runners should be unregistered when the pool is deprovisioned. | bool | true | no |
values | Additional values to be passed to the gitlab-runner Helm chart. | map(any) | {} | no |
values_file | Path to a values file to be passed to the gitlab-runner Helm chart. | string | null | no |
volume_mounts | Additional volumeMounts to add to the runner container. | list(object({...})) | [] | no |
volumes | Additional volumes to add to the runner pod. No HCL support here yet. Please use camel case for this variable. | list(any) | [] | no |
Name | Description |
---|---|
chart_version | The chart version |
namespace | The namespace gitlab-runner was deployed in |
release_name | The helm release name |