Octopus Deploy self-hosted worker in Kubernetes

Because Octopus Deploy's hosted workers have no static IP addresses, we have to host our own workers, which also has some benefits: a worker running inside Kubernetes can reach all cluster services and can be used to run integration tasks

Creating workers on a Kubernetes cluster describes the overall process of creating workers in Kubernetes

Docker image

The original image ships with no additional tools

So we build our own:

FROM octopusdeploy/tentacle:6.2.218

# Set to Y to indicate that you accept the EULA
ENV ACCEPT_EULA=Y
# The port on the Octopus Server that the Tentacle will poll for work. Defaults to 10943. Implies a polling Tentacle
ENV ServerPort="10943"
# The Url of the Octopus Server the Tentacle should register with
ENV ServerUrl="https://contoso.octopus.app"
# The name of the space which the Tentacle will be added to. Defaults to the default space
ENV Space="RUA"

# utils
RUN apt-get update \
    && apt-get install -y \
        curl \
        wget \
        ca-certificates \
        jq \
        sudo \
    && echo done

# powershell - https://learn.microsoft.com/en-us/powershell/scripting/install/install-debian?view=powershell-7.2#installation-on-debian-10-via-package-repository
RUN wget https://packages.microsoft.com/config/debian/10/packages-microsoft-prod.deb \
    && sudo dpkg -i packages-microsoft-prod.deb \
    && sudo apt-get update \
    && sudo apt-get install -y powershell \
    && rm packages-microsoft-prod.deb

# kubectl - https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#install-using-native-package-management
RUN curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg \
    && echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | tee /etc/apt/sources.list.d/kubernetes.list \
    && apt-get update \
    && apt-get install -y kubectl


# helm - https://helm.sh/docs/intro/install/#from-apt-debianubuntu
RUN curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null \
    && apt-get install apt-transport-https --yes \
    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | tee /etc/apt/sources.list.d/helm-stable-debian.list \
    && apt-get update \
    && apt-get install -y helm
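
The image can then be built and pushed, for example to the registry path used in the manifest below (the tag is illustrative, adjust it to your own registry):

docker build -t gcr.io/majestic-cairn-171208/octoworker:latest .
docker push gcr.io/majestic-cairn-171208/octoworker:latest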

Kubernetes

Because we block access to the Kubernetes API from the outside, we have a chicken-and-egg problem: everything is deployed via Octopus Deploy, but without a worker deployed we cannot deploy anything, which is why the worker itself has to be deployed outside of Octopus

Note about stateful set

Because removed pods are not removed from Octopus Deploy automatically, this would become a mess over time

That's why we deploy a StatefulSet instead of a Deployment, so the pod (and therefore worker) names always stay the same

Note that there are examples using preStop lifecycle hooks, which could also solve this, but they are not guaranteed to run and add unwanted complexity

Here is an example of the manifest used:

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: octoworker
  namespace: production
  labels:
    app: octoworker
spec:
  serviceName: octoworker
  replicas: 1
  revisionHistoryLimit: 1
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: octoworker
  template:
    metadata:
      labels:
        app: octoworker
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: octoworker
        image: gcr.io/majestic-cairn-171208/octoworker:latest
        imagePullPolicy: IfNotPresent
        env:
        - name: ServerApiKey
          value: "API-XXXXXXXXXXX"
        - name: TargetWorkerPool
          value: "aks"
        resources:
          requests:
            cpu: 50m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 2048Mi
        securityContext:
          privileged: true
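
Since the worker has to be bootstrapped outside of Octopus, applying the manifest manually from any machine that can reach the cluster is enough (the file name here is arbitrary):

kubectl apply -f octoworker.yaml
kubectl --namespace production get pods -l app=octoworker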

Octopus Deploy

On the Octopus Deploy side we should switch from the "Dynamic worker pool" to the one we created

If there are multiple environments and clusters, it may be a good idea to create a dedicated common variable that holds the worker pool name depending on the environment
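
For illustration (any pool name other than "aks" is hypothetical), such a variable, named octoworker here to match the migration script below, could be scoped like this:

octoworker = aks        (scope: Production)
octoworker = aks-dev    (scope: Development)

The migration script below then points each action's WorkerPoolVariable at this variable by name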

Migration

Now everything is almost done, but there is a problem: you may have hundreds of projects in Octopus Deploy that need to be changed

Here is a sample script that may be used as a starting point:

$headers = @{'X-Octopus-ApiKey' = $env:OCTOPUS_CLI_API_KEY }
$projects = Invoke-RestMethod -Uri "https://contoso.octopus.app/api/projects?skip=0&take=1000" -Headers $headers | Select-Object -ExpandProperty Items
Write-Host "Got $($projects.Count) projects"

foreach ($project in $projects) {
  # $project = $projects |? Name -eq 'my-awesome-app'
  Write-Host '----------------------------------'
  Write-Host $project.Name
  if ($project.IsDisabled) {
    Write-Host "skipping disabled project..." -ForegroundColor Yellow
    continue
  }
  $process = Invoke-RestMethod -Uri "https://contoso.octopus.app$($project.Links.DeploymentProcess)" -Headers $headers
  $changed = $false
  foreach ($step in $process.Steps) {
    # $process.Steps | Select-Object name
    # $step = $process.Steps[0]
    if (-not ($step.Actions | Where-Object IsDisabled -EQ $false)) {
      Write-Host "skipping '$($step.name)' actions disabled..." -ForegroundColor Yellow
      continue
    }
    if ($step.Properties.'Octopus.Action.TargetRoles' -ne 'kube-azure') {
      Write-Host "skipping '$($step.Name)' has '$($step.Properties.'Octopus.Action.TargetRoles')' taget role instead of expected 'kube-azure'..." -ForegroundColor Yellow
      continue
    }
    foreach ($action in $step.Actions) {
      # $step.Actions | Select-Object name
      # $action = $step.Actions[0]
      if ($action.IsDisabled) {
        Write-Host "skipping disabled action..." -ForegroundColor Yellow
        continue
      }
      $actionTypes = @(
        'Octopus.KubernetesDeployContainers',
        'Octopus.KubernetesDeployRawYaml',
        'Octopus.KubernetesDeployService',
        'Octopus.KubernetesDeploySecret',
        'Octopus.KubernetesDeployIngress',
        'Octopus.KubernetesDeployConfigMap'
      )
      if ($action.ActionType -notin $actionTypes) {
        Write-Host "skipping '$($action.ActionType)' non kubernetes action..." -ForegroundColor Yellow
        continue
      }
      # raw yaml deployments have no such property, but in our case it does not matter
      # if ($action.Properties.'Octopus.Action.KubernetesContainers.DeploymentResourceType' -notin @('Deployment', 'StatefulSet')) {
      #   Write-Host "skipping '$($action.Properties.'Octopus.Action.KubernetesContainers.DeploymentResourceType')' non deployment action..." -ForegroundColor Yellow
      #   continue
      # }
      if ($action.WorkerPoolId) {
        Write-Host "skipping '$($action.WorkerPoolId)' action has non default worker pool..." -ForegroundColor Yellow
        continue
      }


      <#
      $action.WorkerPoolVariable = 'octoworker' - switches to our self-hosted workers
      $action.WorkerPoolVariable = $null - switches back to the default Octopus workers
      #>

      if ($action.WorkerPoolVariable -eq 'octoworker') {
        Write-Host "skipping already has octoworker variable..." -ForegroundColor Yellow
        continue
      }
      $action.WorkerPoolVariable = 'octoworker'

      # To revert back to the default dynamic workers instead:
      # if ($action.WorkerPoolVariable -ne 'octoworker') {
      #   Write-Host "skipping, action does not use the octoworker variable..." -ForegroundColor Yellow
      #   continue
      # }
      # $action.WorkerPoolVariable = $null



      $changed = $true
    }
  }

  if ($changed) {
    try {
      Invoke-RestMethod -Method Put "https://contoso.octopus.app$($project.Links.DeploymentProcess)" -Headers $headers -Body ($process | ConvertTo-Json -Depth 100) | Out-Null
      Write-Host "$($project.Name) - success" -ForegroundColor Green
    }
    catch {
      Write-Host "$($project.Name) - failed" -ForegroundColor Red
    }
  }
}
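
A sketch of running the script (the file name is hypothetical); it only needs the API key exported in the environment, matching the $env:OCTOPUS_CLI_API_KEY reference at the top of the script:

$env:OCTOPUS_CLI_API_KEY = 'API-XXXXXXXXXXX'
./Update-WorkerPools.ps1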

Follow-up steps

  • tune resource requests and limits
  • check if we can have probes, at least liveness
  • consider a dedicated node pool
  • rover subgraph checks may now access all containers in the environment
  • consider a privileged service account so it can run some scripts