We’ve basically solved this where I work, with these steps:
- Each environment gets its own directory. We use kustomize to share config between environments.
- direnv “sets” the current context when you cd under a cluster’s directory: it exports an environment variable that a kubectl alias uses (nobody calls kubectl directly; it wouldn’t work anyway, because we’ve banned current-context from the kubeconfig YAML files). You switch clusters by changing to that cluster’s directory; see the sketch after this list.
- most of the time, the only command you run is ‘make’, which just does kubectl apply -k (or whatever). 100% of cluster config is checked into git (with git-crypt for secrets), so the worst that can happen is that you reapply something that’s already there.
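A minimal sketch of what that direnv + alias wiring might look like (the variable name, alias, and directory names are illustrative, not the actual setup):

# clusters/prod-us/.envrc -- direnv exports this when you cd into the directory
export KUBE_CONTEXT=prod-us

# shell rc -- nobody calls kubectl directly, only this alias,
# which fails loudly if you're not under a cluster directory
alias k='kubectl --context "${KUBE_CONTEXT:?cd into a cluster directory first}"'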
I’ve also colored the command prompt according to the current cluster.
But anyway it’s essentially impossible to apply a config change to the wrong cluster. I haven’t worried about this in years.
I got burned by this recently and came to the conclusion that the concept of a current context is evil. Now I always specify --context when running kubectl commands.
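For example (context names made up), per-cluster aliases keep --context explicit without retyping it every time:

alias kprod='kubectl --context prod-us-east'
alias kstage='kubectl --context staging'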
I also got burned by this, pretty badly, and ever since it happened I don't even have a default kubeconfig; I have to specify one for every single kubectl run.
I never even set up a default context. I sussed out that problem from the get-go and always use `--context`. But that's not really enough if you use shell history, or if your clusters differ by only a few letters that are easy to typo.
I also tried to avoid current context initially, but it just slowed me down. Switching between clusters is so much easier with the current context and kubectx.
That’s why I built kubesafe. In this way I can keep using the current context without worrying about screwing up. If I accidentally target the wrong context, at least I get a warning before executing the command.
The only hassle now is remembering to add new prod contexts to the safe list, but that’s about to change with regex support coming soon :)
I found that some cloud providers and other tools like minikube don't play nice with other clusters in the same config. I now use a tiny shell function that selects KUBECONFIG out of a folder and adds the current cluster's name to my prompt.
This is a good suggestion, but keep in mind that you can accidentally run a command in the wrong directory. I've certainly done that too, with painful results.
If I’m doing something more involved, I’ve got a k9s window open in another pane, making sure the command is having the intended effect.
I guess the riskiest commands would be things like deleting persistent volumes. But our storage class doesn’t automatically clean up the disk in the cloud provider, so we could recover from that too.
We’ve avoided that situation with kustomize. Common resources go into a ‘bases’ directory, and if two clusters have identical resources, then they both have their own directories and reference all the base resources from there.
In practice, there are always slight differences in cluster config between test and prod (different S3 buckets, for example), so this is needed anyway.
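A rough sketch of that layout (directory names are illustrative): each cluster gets its own overlay whose kustomization.yaml references the shared bases plus its cluster-specific patches, and you apply it per cluster:

# bases/            shared manifests + kustomization.yaml
# clusters/test/     kustomization.yaml referencing ../../bases, test-only patches
# clusters/prod/     kustomization.yaml referencing ../../bases, prod-only patches

kubectl apply -k clusters/test   # build the test overlay and apply it to that cluster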
Don't keep anything in the default .kube/config. Set the KUBECONFIG envar instead. Keep every cluster in a separate config file. Set an indicator in PS1. Helm et al. follow the envar. Roast my zsh:
k8x() {
  # No argument: clear KUBECONFIG and strip the "(env) " prefix from the prompt
  if [ -z "$1" ]; then
    if [ -z "${KUBECONFIG+x}" ]; then
      echo "Need param of a k8s environment"
      return 1
    else
      echo "Removing KUBECONFIG variable"
      PS1="$(echo "$PS1" | sed -e 's;^([^)]*) ;;')"
      unset KUBECONFIG
      return 0
    fi
  fi

  # Bail out if there is no config file for the requested environment
  local env="$1"
  local cfgPath="$HOME/.kube/config.${env}"
  if [ ! -f "$cfgPath" ]; then
    echo "No config at $cfgPath"
    return 1
  fi

  # Swap any existing "(env) " prompt prefix for the new one, then point KUBECONFIG at the file
  PS1="$(echo "$PS1" | sed -e 's;^([^)]*) ;;' -e 's;^;('"$env"') ;')"
  export KUBECONFIG="$cfgPath"
}
In the early 1990s I ran a math department's 4 servers and 50 workstations and (with a few exceptions) only ever did administrative actions through scripts.
I've worked in lots of places since and the world's matured from scripts and rsync to ansible and puppet and similar.
Have we regressed to the point where we've turned big clusters of systems back into "oops I ran a command as superuser in the wrong directory" ?
Someone here showed me this cool technique with `fzf`:
#!/usr/bin/env bash
set -e
# list context names and pick one with fzf; note the preview switches the
# live context as you scroll, which is what makes this destructive
context=$(kubectl config get-contexts -o name | fzf --preview 'kubectl config use-context {} && kubectl get namespaces')
kubectl config use-context "$context"
You get a two-pane window with the context names on the left and the namespaces on the right. That's all I need to find what I'm looking at. It's destructive, though: the preview actually switches your current context as you scroll.
Have been burnt by this; I have to deal with close to 8 clusters, and it is very easy to make a mistake.
Would highly recommend kubie, it allows you to switch and shows you the name of the cluster in the prompt. It's probably a more visual way of solving the same problem.
It also solves a problem many of the other solutions here miss: the prompt is printed once and so it can easily be showing stale information if you change the current context in another shell.
With kubie entering a context copies the configuration to a new file and sets KUBECONFIG appropriately, so it is not affected by changes in another shell.
I toyed with the idea of having a kubeconfig per cluster some time ago, but I work with 10s of clusters on a daily basis (often with multiple terminals targeting the same cluster) and having to auth every single time would have been too much of a pain.
Instead I went with kubeswitch which still gives you a different kubeconfig per terminal but allows you to re-use existing sessions.
whether a reauth is necessary depends on your k8s setup
a lot of the cloud ones only configure kubeconfig to call an external command, which can share auth state between terminals
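for example (EKS, with a made-up cluster name), the cloud CLI writes a kubeconfig whose user entry just execs "aws eks get-token", so any terminal with the same AWS session gets credentials without re-authenticating:

aws eks update-kubeconfig --name my-cluster --kubeconfig ~/.kube/config.my-cluster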
I also have it in my zsh config, but that didn’t stop me from screwing up in the past. Having an active confirmation prompt for potentially risky commands is what works best for me
Hah! I accidentally deleted a production deployment the other day, because I thought I was mucking with my local Colima Kubernetes cluster. I forgot that I had my context set to one of my AWS clusters. I had been meaning to write a command to wrap helm and kubectl to prompt me with info before committing, so I will have to take a peek at this.
i added the following to my bashrc a few days ago for similar reasons; this forces me to be explicit about the cluster; now i mess up the wrong namespace instead :)
if [[ -e "/opt/homebrew/bin/kubectl" ]]; then
  # clear the current context so every kubectl call has to name one explicitly
  /opt/homebrew/bin/kubectl config unset current-context >/dev/null
fi
I am not trying to shit on this, sorry - but can't you achieve the same thing with rudimentary automation, and barring that, rudimentary scripting? This seems to just be adding y/n prompts to certain contexts. How's that different than a bash wrapper script that does something like this?
Thanks for the feedback John! You're right, that's pretty much it :)
I developed kubesafe because (1) I was tired of tinkering with shell aliases and scripts (especially when I wanted to define protected commands) and (2) I needed something that worked smoothly with all Kubernetes tools like kubectl, helm, kubecolor, etc.
Kubesafe is just a convenient way to manage protected commands and contexts. Nothing too fancy!
Thanks Robert! Yes, you can achieve this with ACLs in Kubernetes, but it requires setting up multiple Roles and contexts. Even then, you might accidentally switch to a higher-permission Role and run a risky command, thinking you're in a different cluster or using a low-permission user.
Kubesafe is just an extra safety net to prevent those kinds of accidents :)
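For the curious, a rough sketch of that RBAC route (user, cluster, and context names are made up): bind the built-in read-only "view" ClusterRole and keep a separate low-privilege context for day-to-day use:

# bind a user to the built-in read-only "view" ClusterRole
kubectl create clusterrolebinding alice-view --clusterrole=view --user=alice

# keep a separate low-privilege context for day-to-day work
kubectl config set-context prod-readonly --cluster=prod --user=alice
kubectl config use-context prod-readonly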
I think it’s a tradeoff between safety and speed. Having only the CI/CD with production access can significantly slow you down, especially in the early stages when you’re focused on the product and still building out your tooling/infrastructure.