Wait… what?

Tailscale is a networking thing. What are you even doing?

Well, yes. It grants access at the network layer, which solves a problem you didn’t even know you had when backing up k0s Kubernetes. It also does SSH authentication [1] these days, which is cool AF and solves the second part of our problem: authenticating in order to back up the keys to our kingdom.

Let’s assume you’re somewhat familiar with Tailscale, and k0s/Kubernetes. If you use another Kubernetes distribution or toolkit, you’ll have to substitute your own backup tooling.

Kubernetes and k0s

We all know that k0s is our fav round these parts, and it makes life really easy as far as backups go. The same k0sctl.yaml file you use to create and manage the cluster is the one k0sctl can use to back up the entire cluster state.
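
Outside of Kubernetes that’s a one-liner; a rough sketch, assuming k0sctl is installed and your k0sctl.yaml sits in the current directory:

# Writes a timestamped k0s_backup_*.tar.gz into the current working directory,
# which is why the CronJob later in this article cd's into the NFS mount first.
k0sctl backup --config k0sctl.yaml

The CronJob we build below does exactly this, just on a schedule and over the tailnet.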

We use the sidecar pattern, with Tailscale networking to make all of this work.

Setup Tailscale

Tailscale SSH is pretty neat and requires no special client software: just tailscale and regular old OpenSSH. It’s available on all plans except Starter.

We won’t cover setting up SSH session recording in this article, but it’s useful to know that A) it exists with Tailscale and B) you can enforce its use.
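
One thing worth calling out: Tailscale SSH has to be enabled on the tailscaled side of each k0s node. A minimal sketch, assuming a reasonably recent Tailscale client that has already joined your tailnet:

# Enable Tailscale SSH on a node that is already joined
sudo tailscale set --ssh

# Or enable it at join time instead
# sudo tailscale up --ssh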

Tags

Tailscale network policy is mostly tag driven. Tags are super useful for associating a device with an identity, and for grouping similar devices together.

In our use case we’ll need two tags, one for the group of nodes that run k0s, and one to hold the identity of our backup job.

  "tagOwners": {
    "tag:k0s-backup":       [],
    "tag:k0s-node":         [],
  },

Access Controls

Now, we’ll need to allow access to port 22 on our k0s nodes at layer 4, and we’ll need to allow SSH authentication as well.

  "acls": [
    {
      "action": "accept",
      "src":    ["tag:k0s-backup"],
      "dst":    ["tag:k0s-node:22"],
    },
  ],
  "ssh": [
    {
      "action": "accept",
      "src":    ["tag:k0s-backup"],
      "dst":    ["tag:k0s-node"],
      "users":  ["autogroup:nonroot"],
    },
  ],

The power of tags makes this fairly obvious. We accept traffic from devices tagged k0s-backup, and allow them to SSH into devices tagged k0s-node as a non-root user [2].

This is by no means a complete Tailscale policy file, but it should fit in nicely with any other configuration you’ve got.
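
Once the policy is saved, a quick sanity check from any device carrying the tag:k0s-backup tag (the hostnames here are just the examples used later in this article, and the port check assumes you have netcat handy):

# Layer 4: port 22 on a k0s node should be reachable over the tailnet
nc -zv ts-node1.turducken-blastiosis.ts.net 22

# SSH auth: Tailscale SSH should let us in as the non-root user, no keys required
ssh core@ts-node1.turducken-blastiosis.ts.net hostname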

Auth Keys

Head on over to your Tailscale admin console and into the Settings section. From the left-hand menu we want to create an auth key, a credential a device (or in our case, a container) can use to join the tailnet non-interactively. This is what our Kubernetes CronJob will use to authenticate to our tailnet.

Hit ‘Keys’ under personal and then ‘Generate auth key…’

Name it something useful like k0s-backup, make it Reusable, set expiration to 90 days [3], tick Ephemeral, tick Pre-approved, and under Tags select the one you created previously, tag:k0s-backup.

Make sure you copy the key now; it will never be shown to you again, and we need it to configure the Kubernetes side of things.

Machines

Head on over to the Machines tab now, and ensure the nodes that run k0s have the tag k0s-node applied to them.
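
You can set the tag from the admin console, or advertise it from the node itself when it joins the tailnet; a rough sketch (tagging from the CLI still has to be permitted by your tagOwners):

# Advertise the k0s-node tag (and Tailscale SSH) when bringing the node up
sudo tailscale up --advertise-tags=tag:k0s-node --ssh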

Why Kubernetes

Everything is a nail. Really, this doesn’t have to run inside of Kubernetes, but if you’re here reading this, why the heck not?

Besides, who wants to remember whether the correct place is /etc/crontab, or /etc/cron.d, or maybe /etc/cron.hourly? Or was it in the user crontab, edited with crontab -e as user backuptailscale?

Create a namespace

This will hold all of our config, service accounts and associated bits. Stuffing it all into its own namespace makes it easy to collect in one place. Besides, namespaces are cheap; use ’em!

kubectl create ns k0s

k0sctl configuration file into a ConfigMap

Dump our k0sctl.yaml into a ConfigMap that can be consumed by our CronJob:

apiVersion: v1
kind: ConfigMap
metadata:
  name: k0sctl-cfg
  namespace: k0s
data:
  k0sctl.yaml: |
    apiVersion: k0sctl.k0sproject.io/v1beta1
    kind: Cluster
    metadata:
      name: k0s-cluster
    spec:
      hosts:
      - openSSH:
          address: ts-node1.turducken-blastiosis.ts.net
          user: core
          port: 22
        privateAddress: 192.168.1.52
        privateInterface: eno1
        role: controller+worker
        noTaints: true
      - openSSH:
          address: ts-node2.turducken-blastiosis.ts.net
          user: core
          port: 22
        privateAddress: 192.168.1.51
        privateInterface: eno1
        role: controller+worker
        noTaints: true
      - openSSH:
          address: ts-node3.turducken-blastiosis.ts.net
          user: core
          port: 22
        privateAddress: 192.168.1.50
        privateInterface: eno1
        role: controller+worker
        noTaints: true
      k0s:
        versionChannel: stable
        dynamicConfig: false
        config:
          apiVersion: k0s.k0sproject.io/v1beta1
          kind: Cluster
          metadata:
            name: k0s
          spec:
            api:
              sans:
                - cluster.hostname
              k0sApiPort: 9443
              port: 6443
            installConfig:
              users:
                etcdUser: etcd
                kineUser: kube-apiserver
                konnectivityUser: konnectivity-server
                kubeAPIserverUser: kube-apiserver
                kubeSchedulerUser: kube-scheduler
            konnectivity:
              adminPort: 8133
              agentPort: 8132
            network:
              dualStack:
                enabled: true
                IPv6podCIDR: "fc00::/108"
                IPv6serviceCIDR: "fc00::/108"
              kubeProxy:
                disabled: true
              provider: custom
            podSecurityPolicy:
              defaultPolicy: 00-k0s-privileged
            storage:
              type: etcd
            telemetry:
              enabled: false    

Most of that is just boilerplate k0s config that comes from our regular cluster config, but there are two major things to note.

The address for each host is its Tailscale hostname, meaning we will connect over the tailnet. You’ll need to make sure this is correct for your setup.

The user we log in as is the non-root user we allowed on the Tailscale side [2].
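
If you’d rather not hand-wrap the YAML above, you can also generate the same ConfigMap straight from the k0sctl.yaml you already have lying around:

# Create the ConfigMap from the existing file (assumes it's in the current directory)
kubectl -n k0s create configmap k0sctl-cfg --from-file=k0sctl.yaml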

Auth keys, Secrets, Roles, and ServiceAccounts

Create a Secret to hold the auth key we created previously [4]:

kubectl -n k0s create secret generic k0s-tailscale-token \
    --from-literal=TS_AUTHKEY=tskey-auth-kmAYcAw1SA21CNTRL-b3F7gfCEnCh4Wi4nSM3RCh3A4LVudTyy
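
The Secret name and the TS_AUTHKEY key both need to line up with what the CronJob references later via envFrom; a quick check that it landed where we expect:

# Should list k0s-tailscale-token with a single data key (TS_AUTHKEY)
kubectl -n k0s get secret k0s-tailscale-token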

Now create a Role, RoleBinding and ServiceAccount so the Tailscale sidecar in our CronJob can store and update its state in a Secret (the tailscale-auth-k0s it’s pointed at via TS_KUBE_SECRET later):

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tailscale
  namespace: k0s
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["secrets"]
  # Create can not be restricted to a resource name.
  verbs: ["create"]
- apiGroups: [""] # "" indicates the core API group
  resourceNames: ["tailscale-auth-k0s"]
  resources: ["secrets"]
  verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tailscale
  namespace: k0s
subjects:
- kind: ServiceAccount
  name: "tailscale"
  namespace: k0s
roleRef:
  kind: Role
  name: tailscale
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tailscale
  namespace: k0s
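
With the RBAC applied, kubectl auth can-i is a handy way to confirm the ServiceAccount is allowed to do what the Tailscale sidecar needs:

# Can the tailscale ServiceAccount create secrets in the namespace?
kubectl -n k0s auth can-i create secrets \
    --as=system:serviceaccount:k0s:tailscale

# And update the specific state secret it owns?
kubectl -n k0s auth can-i update secrets/tailscale-auth-k0s \
    --as=system:serviceaccount:k0s:tailscale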

Where to store the backup

You can use any storage available to your k8s cluster, but please choose carefully: dump it to S3, off-site NFS, or similar.
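
If S3 is your preferred destination, one rough option is to bolt a sync step onto the end of the backup script below; a sketch, assuming the AWS CLI is installed in the job image and using a hypothetical bucket name:

# Push the local backup directory to a (hypothetical) bucket once k0sctl is done
aws s3 sync /backup/k0s s3://my-k0s-backups/k0s/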

CronJob

This is where we pull it all together:

  • Instantiate a container, and a Tailscale sidecar, on a schedule
  • Install openssh and k0sctl
  • Mount our k0sctl.yaml inside the container via a ConfigMap
  • Mount an NFS share to store the backups
  • Use the auth key to authenticate to the tailnet
  • SSH into each k0s node, authenticated by Tailscale, and copy the Kubernetes state back over that channel
  • Post to a health check service to let us know it all worked
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: k0sctl-snapshot-cronjob
  namespace: k0s
spec:
  schedule: "@hourly"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: "tailscale"
          restartPolicy: Never
          volumes:
          - name: share
            emptyDir: {}
          - name: nfs-backup
            nfs:
              server: 192.168.1.35
              path: /share/NFSv4=4/backups
          - name: k0sctl-cfg
            configMap:
              name: k0sctl-cfg # Name of the configMap
          initContainers:
          - name: ts-sidecar
            restartPolicy: Always
            imagePullPolicy: Always
            image: "ghcr.io/tailscale/tailscale:latest"
            envFrom:
            - secretRef:
                name: k0s-tailscale-token
            env:
            - name: TS_KUBE_SECRET
              value: "tailscale-auth-k0s"
            - name: TS_USERSPACE
              value: "false"
            - name: TS_ACCEPT_DNS
              value: "true"
            securityContext:
              capabilities:
                add:
                - NET_ADMIN
          containers:
          - name: snapshot
            image: ubuntu
            imagePullPolicy: IfNotPresent
            env:
            - name: TS_ACCEPT_DNS
              value: "true"
            command:
            - /bin/sh
            args:
            - -ec
            - |
              # Install the ssh client and grab a pinned k0sctl release
              apt-get update && apt-get -fy install wget openssh-client
              wget -O /tmp/k0sctl https://github.com/k0sproject/k0sctl/releases/download/v0.17.8/k0sctl-linux-x64
              chmod +x /tmp/k0sctl
              # Take the backup onto the NFS mount, connecting to the nodes over the tailnet
              mkdir -p /backup/k0s
              cd /backup/k0s
              DISABLE_TELEMETRY=true /tmp/k0sctl backup -c /k0sctl.yaml
              # If successful, post to our Uptime instance
              if [ $? -eq "0" ]; then
                wget -O- "https://uptime.svc/api/push/n31a?status=up&msg=OK&ping="
              fi
              # Delete any copies older than 14 days
              find /backup/k0s -mtime +14 -delete              
            volumeMounts:
            - mountPath: /share
              name: share
            - mountPath: /backup
              name: nfs-backup
            - name: k0sctl-cfg
              mountPath: /k0sctl.yaml
              subPath: k0sctl.yaml
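
Rather than waiting for the top of the hour, you can fire a one-off run from the CronJob and watch it work:

# Kick off a manual run using the CronJob as the template
kubectl -n k0s create job k0s-backup-manual --from=cronjob/k0sctl-snapshot-cronjob

# Follow the snapshot container's logs
kubectl -n k0s logs -f job/k0s-backup-manual -c snapshot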

Finis.

This should get us an off-cluster copy of our Kubernetes state every hour.

It should also demo the power of the sidecar design pattern, and how to (ab)use Tailscale to get things done.

If you have something that needs SSH authentication and safe network access, you now have everything you need to do it from within a Kubernetes cluster.


  1. https://tailscale.com/blog/tailscale-ssh-ga

  2. We use the user ‘core’ in our setup, a generic non-root user with passwordless sudo capability.

  3. You’ll need to set some sort of reminder for this, otherwise it will start failing in 90 days.

  4. For the love of all things holy, please manage your secrets better than this.