Authenticating Kubernetes backups via.. Tailscale
Wait… what?
Tailscale is a networking thing. What are you even doing?
Well, yes. It does grant access at the network layer, which solves a problem you didn’t even know you had when backing up k0s Kubernetes. It also does SSH authentication1 these days, which is cool AF and solves the second part of our problem: authenticating to back up the keys to our kingdom.
Let’s assume you’re somewhat familiar with Tailscale, and k0s/Kubernetes. If you use another Kubernetes distribution or toolkit, you’ll have to substitute your own backup tooling.
Kubernetes and k0s
We all know that k0s is our fav round these parts, and it makes life really easy as far as backups go. The same k0sctl.yaml file you use to create and manage the cluster is the one k0sctl can use to back up the entire cluster state.
We use the sidecar pattern, with Tailscale networking to make all of this work.
Set up Tailscale
Tailscale SSH is pretty neat and requires no special client software, just Tailscale and regular old OpenSSH. It’s available on all plans except Starter.
We won’t cover setting up SSH session recording in this article, but it’s useful to know that A) it exists with Tailscale and B) you can enforce its use.
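For Tailscale SSH to actually answer connections, tailscaled on each k0s node has to be running with SSH enabled. A minimal sketch, assuming Tailscale is already installed directly on the nodes:

# Run once on each k0s node; other settings are left untouched
sudo tailscale set --ssh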
Tags
Tailscale network policy is mostly tag-driven, and tags are super useful for associating a device with an identity and for grouping similar devices together.
In our use case we’ll need two tags: one for the group of nodes that run k0s, and one to hold the identity of our backup job.
"tagOwners": {
"tag:k0s-backup": [],
"tag:k0s-node": [],
},
Access Controls
Now, we’ll need to allow access to port 22 on our k0s nodes at layer 4, and we’ll need to allow SSH authentication as well.
"acls": [
{
"action": "accept",
"src": ["tag:k0s-backup"],
"dst": ["tag:k0s-node:22"],
},
],
"ssh": [
{
"action": "accept",
"src": ["tag:k0s-backup"],
"dst": ["tag:k0s-node"],
"users": ["autogroup:nonroot"],
},
],
The power of tags makes this fairly obvious. We accept connections from devices tagged k0s-backup, and allow them to SSH into devices with the k0s-node tag as non-root2.
This is by no means a complete Tailscale policy file, but it should fit in nicely with any other configuration you’ve got.
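A quick way to confirm the policy does what we think, from any device carrying the k0s-backup tag (the hostname here is one of the example nodes used later in this article):

# Layer 4: can we reach the node over the tailnet?
tailscale ping ts-node1.turducken-blastiosis.ts.net
# Layer 7: does Tailscale SSH let us in as the non-root user?
ssh core@ts-node1.turducken-blastiosis.ts.net true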
Auth Keys
Head on over to your Tailscale admin console, into the Settings section. On the left-hand menu we want to create an auth key: programmatic access into Tailscale. This is what our Kubernetes CronJob will use to authenticate into our tailnet.
Hit ‘Keys’ under personal and then ‘Generate auth key…’
Name it something useful like k0s-backup, make it Reusable, set expiration to 90 days3, tick Ephemeral, tick Pre-approved, tick Tags and select the one you previously created, k0s-backup.
Make sure you copy this now; it will never be shown to you again, and we need it to configure the Kubernetes side of things.
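If you’d rather script key creation than click through the console, the Tailscale API can mint the same thing. A rough sketch, assuming you already have an API access token in TS_API_TOKEN and that the capability names below still match the current API docs:

# Create a reusable, ephemeral, pre-approved auth key tagged tag:k0s-backup,
# expiring in 90 days (7776000 seconds). "-" means the token's own tailnet.
curl -s -u "${TS_API_TOKEN}:" \
  https://api.tailscale.com/api/v2/tailnet/-/keys \
  --data '{
    "capabilities": {
      "devices": {
        "create": {
          "reusable": true,
          "ephemeral": true,
          "preauthorized": true,
          "tags": ["tag:k0s-backup"]
        }
      }
    },
    "expirySeconds": 7776000
  }'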
Machines
Head on over to the Machines tab now, and ensure the nodes that run k0s have the tag k0s-node applied to them.
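If you prefer the CLI to the admin console, a node can also advertise the tag itself. A sketch, run on each node, assuming the account you authenticate with is allowed to apply that tag under tagOwners (tailscale up may also ask you to repeat any flags you had set previously):

sudo tailscale up --advertise-tags=tag:k0s-node --ssh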
Why Kubernetes
Everything is a nail. Really, this doesn’t have to run inside Kubernetes, but if you’re here reading this, why the heck not?
Besides, who wants to remember whether the correct place is /etc/crontab, or /etc/cron.d.. or maybe /etc/cron.hourly. Or was it the user crontab, edited with crontab -e as user backuptailscale?
Create a namespace
This will hold all of our config, service accounts and associated bits. Stuffing it into its own namespace makes it easy to collect in one place. Besides, they’re cheap, use ’em!
kubectl create ns k0s
k0sctl configuration file into a ConfigMap
Dump our k0sctl.yaml into a ConfigMap that can be consumed by our CronJob:
kind: ConfigMap
metadata:
name: k0sctl-cfg
namespace: k0s
apiVersion: v1
data:
k0sctl.yaml: |
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
name: k0s-cluster
spec:
hosts:
- openSSH:
address: ts-node1.turducken-blastiosis.ts.net
user: core
port: 22
privateAddress: 192.168.1.52
privateInterface: eno1
role: controller+worker
noTaints: true
- openSSH:
address: ts-node2.turducken-blastiosis.ts.net
user: core
port: 22
privateAddress: 192.168.1.51
privateInterface: eno1
role: controller+worker
noTaints: true
- openSSH:
address: ts-node3.turducken-blastiosis.ts.net
user: core
port: 22
privateAddress: 192.168.1.50
privateInterface: eno1
role: controller+worker
noTaints: true
k0s:
versionChannel: stable
dynamicConfig: false
config:
apiVersion: k0s.k0sproject.io/v1beta1
kind: Cluster
metadata:
name: k0s
spec:
api:
sans:
- cluster.hostname
k0sApiPort: 9443
port: 6443
installConfig:
users:
etcdUser: etcd
kineUser: kube-apiserver
konnectivityUser: konnectivity-server
kubeAPIserverUser: kube-apiserver
kubeSchedulerUser: kube-scheduler
konnectivity:
adminPort: 8133
agentPort: 8132
network:
dualStack:
enabled: true
IPv6podCIDR: "fc00::/108"
IPv6serviceCIDR: "fc00::/108"
kubeProxy:
disabled: true
provider: custom
podSecurityPolicy:
defaultPolicy: 00-k0s-privileged
storage:
type: etcd
telemetry:
enabled: false
Most of that is just boilerplate k0s config which comes from our regular cluster config, but there are two major things to note:
- The address we use for each host is its Tailscale hostname, meaning we will connect over the tailnet. You’ll need to make sure this is correct for your setup.
- The user we log in as, which we configured on the Tailscale side in 2.
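If you already keep a k0sctl.yaml on disk, you can let kubectl build the ConfigMap from the file instead of hand-writing the manifest above (assuming the file sits in your current directory):

kubectl -n k0s create configmap k0sctl-cfg --from-file=k0sctl.yaml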
Auth keys, Secrets, Roles, and ServiceAccounts
Create a secret to hold the auth key we created previously4:
kubectl create secret generic k0s-tailscale-token --namespace k0s \
--from-literal=TS_AUTHKEY=tskey-auth-kmAYcAw1SA21CNTRL-b3F7gfCEnCh4Wi4nSM3RCh3A4LVudTyy
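A quick sanity check that the key landed where the CronJob will look for it:

# Should print the tskey-auth-… value you pasted above
kubectl -n k0s get secret k0s-tailscale-token -o jsonpath='{.data.TS_AUTHKEY}' | base64 -d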
Now create a Role, RoleBinding and ServiceAccount that the Tailscale sidecar in our CronJob can use later to store and update its state secret:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: tailscale
namespace: k0s
rules:
- apiGroups: [""] # "" indicates the core API group
resources: ["secrets"]
# Create can not be restricted to a resource name.
verbs: ["create"]
- apiGroups: [""] # "" indicates the core API group
resourceNames: ["tailscale-auth-k0s"]
resources: ["secrets"]
verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: tailscale
namespace: k0s
subjects:
- kind: ServiceAccount
name: "tailscale"
roleRef:
kind: Role
name: tailscale
apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: tailscale
namespace: k0s
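Before wiring up the job, you can confirm the RBAC plumbing with impersonation:

# Can the tailscale ServiceAccount create secrets, and update its state secret?
kubectl -n k0s auth can-i create secrets --as=system:serviceaccount:k0s:tailscale
kubectl -n k0s auth can-i update secrets/tailscale-auth-k0s --as=system:serviceaccount:k0s:tailscale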
Where to store the backup
You can use any storage available to your k8s cluster, but please choose carefully: dump it to S3, an off-site NFS share, or similar, so the backup survives whatever takes out the cluster it protects.
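If object storage suits you better than NFS, the same job could push the archives on after k0sctl finishes. A sketch with a hypothetical bucket name, assuming an S3-capable CLI is installed in the snapshot container:

# Hypothetical bucket; swap in your own off-site target
aws s3 sync /backup/k0s/ s3://my-k0s-backups/k0s/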
CronJob
This is where we pull it all together:
- Instantiate a container, and a Tailscale sidecar, on a schedule
- Install openssh and k0sctl
- Mount our k0sctl.yaml inside the container via the ConfigMap
- Mount an NFS share to store the backups
- Use the auth key to authenticate to the tailnet
- SSH into each k0s node, authenticated by Tailscale, and copy the Kubernetes state back over that channel
- Post to a health check service to let us know it all worked
---
apiVersion: batch/v1
kind: CronJob
metadata:
name: k0sctl-snapshot-cronjob
namespace: k0s
spec:
schedule: "@hourly"
jobTemplate:
spec:
template:
spec:
serviceAccountName: "tailscale"
restartPolicy: Never
volumes:
- name: share
emptyDir: {}
- name: nfs-backup
nfs:
server: 192.168.1.35
path: /share/NFSv4=4/backups
- name: k0sctl-cfg
configMap:
name: k0sctl-cfg # Name of the configMap
initContainers:
- name: ts-sidecar
restartPolicy: Always
imagePullPolicy: Always
image: "ghcr.io/tailscale/tailscale:latest"
envFrom:
- secretRef:
name: k0s-tailscale-token
env:
- name: TS_KUBE_SECRET
value: "tailscale-auth-k0s"
- name: TS_USERSPACE
value: "false"
- name: TS_ACCEPT_DNS
value: "true"
securityContext:
capabilities:
add:
- NET_ADMIN
containers:
- name: snapshot
image: ubuntu
imagePullPolicy: IfNotPresent
env:
- name: TS_ACCEPT_DNS
value: "true"
command:
- /bin/sh
args:
- -ec
- |
apt-get update && apt-get -fy install wget openssh-client
wget -O /tmp/k0sctl https://github.com/k0sproject/k0sctl/releases/download/v0.17.8/k0sctl-linux-x64
chmod +x /tmp/k0sctl
# Make sure the backup directory exists on the NFS share before we cd into it
mkdir -p /backup/k0s
cd /backup/k0s
DISABLE_TELEMETRY=true /tmp/k0sctl backup -c /k0sctl.yaml
# If successful, post to our Uptime instance
if [ $? -eq "0" ]; then
wget -O- "https://uptime.svc/api/push/n31a?status=up&msg=OK&ping="
fi
# Delete any copies older than 14 days
find /backup/k0s -mtime +14 -delete
volumeMounts:
- mountPath: /share
name: share
- mountPath: /backup
name: nfs-backup
- name: k0sctl-cfg
mountPath: /k0sctl.yaml
subPath: k0sctl.yaml
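Rather than waiting an hour to find out whether any of this works, kick off a one-shot run from the CronJob and follow it:

kubectl -n k0s create job --from=cronjob/k0sctl-snapshot-cronjob k0sctl-snapshot-manual
kubectl -n k0s logs -f job/k0sctl-snapshot-manual -c snapshot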
Finis.
This should get us an off-cluster copy of our Kubernetes state every hour. It should also demo the power of the sidecar design pattern, and how to (ab)use Tailscale to get things done.
If you have something that needs to be authenticated via SSH and made network-accessible in a safe fashion, you now have everything you need to do it from within a Kubernetes cluster.
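And to confirm the hourly copies are actually landing (and being pruned after 14 days), peek at the share itself from the NFS server or any host that mounts it:

# Path taken from the CronJob's NFS volume above
ls -lt /share/NFSv4=4/backups/k0s | head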