We’ve basically solved this where I work, with these steps:
- Each environment gets its own directory. We use kustomize to share config between environments.
- direnv “sets” the current context when you cd under a cluster’s directory: it exports an environment variable that a kubectl alias uses (nobody calls kubectl directly; it wouldn’t work anyway, because we’ve banned current-context from the kubeconfig YAML files). You switch clusters by changing to that cluster’s directory; see the sketch after this list.
- most of the time, the only command you run is ‘make’, which just does kubectl apply -k (or whatever). 100% of cluster config is checked into git (with git-crypt for secrets), so the worst that can happen is that you reapply something that’s already there.
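A minimal sketch of what that direnv + alias wiring might look like (the variable name, alias, and directory names are illustrative, not the actual setup):

# clusters/prod-us/.envrc -- direnv exports this when you cd into the directory
export KUBE_CONTEXT=prod-us

# shell rc -- nobody calls kubectl directly, only this alias,
# which fails loudly if you're not under a cluster directory
alias k='kubectl --context "${KUBE_CONTEXT:?cd into a cluster directory first}"'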
I’ve also colored the command prompt according to the current cluster.
But anyway it’s essentially impossible to apply a config change to the wrong cluster. I haven’t worried about this in years.
I got burned by this recently and came to the conclusion that the concept of a current context is evil. Now I always specify --context when running kubectl commands.
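For example (context names made up), per-cluster aliases keep --context explicit without retyping it every time:

alias kprod='kubectl --context prod-us-east'
alias kstage='kubectl --context staging'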
I also got burned by this, pretty badly, and ever since it happened I don't even have a default kubeconfig; I have to specify one for every single kubectl run.
I never even set up a default context. I sussed out that problem from the get-go and always use `--context`. But that's not really enough if you use shell history, or if your clusters differ by only a few letters that are easy to typo.
I also tried to avoid current context initially, but it just slowed me down. Switching between clusters is so much easier with the current context and kubectx.
That’s why I built kubesafe. In this way I can keep using the current context without worrying about screwing up. If I accidentally target the wrong context, at least I get a warning before executing the command.
The only hassle now is remembering to add new prod contexts to the safe list, but that’s about to change with regex support coming soon :)
I found that some cloud providers and other tools like minikube don't play nice with other clusters in the same config. I now use a tiny shell function that selects KUBECONFIG out of a folder and adds the current cluster's name to my prompt.
This is a good suggestion, but keep in mind that you can accidentally run a command in the wrong directory. I've certainly done that too, with painful results.
If I’m doing something more involved, I’ve got a k9s window open in another pane, making sure the command is having the intended effect.
I guess the riskiest commands would be things like deleting persistent volumes. But our storage class doesn’t automatically clean up the disk in the cloud provider, so we could recover from that too.
We’ve avoided that situation with kustomize. Common resources go into a ‘bases’ directory, and if two clusters have identical resources, then they both have their own directories and reference all the base resources from there.
In practice, there are always slight differences in cluster config between test and prod (different S3 buckets, for example), so this is needed anyway.
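A rough sketch of that layout (directory names are illustrative): each cluster gets its own overlay whose kustomization.yaml references the shared bases plus its cluster-specific patches, and you apply it per cluster:

# bases/            shared manifests + kustomization.yaml
# clusters/test/     kustomization.yaml referencing ../../bases, test-only patches
# clusters/prod/     kustomization.yaml referencing ../../bases, prod-only patches

kubectl apply -k clusters/test   # build the test overlay and apply it to that cluster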
Don't keep anything in the default .kube/config. Set the KUBECONFIG envar instead. Keep every cluster in a separate config file. Set an indicator in PS1. Helm et al. follow the envar. Roast my zsh:
k8x() {
  # No argument: clear KUBECONFIG and strip the "(env) " prefix from the prompt
  if [ -z "$1" ]; then
    if [ -z "${KUBECONFIG+x}" ]; then
      echo "Need param of a k8s environment"
      return 1
    else
      echo "Removing KUBECONFIG variable"
      PS1="$(echo "$PS1" | sed -e 's;^([^)]*) ;;')"
      unset KUBECONFIG
      return 0
    fi
  fi

  # Bail out if there is no config file for the requested environment
  local env="$1"
  local cfgPath="$HOME/.kube/config.${env}"
  if [ ! -f "$cfgPath" ]; then
    echo "No config at $cfgPath"
    return 1
  fi

  # Swap any existing "(env) " prompt prefix for the new one, then point KUBECONFIG at the file
  PS1="$(echo "$PS1" | sed -e 's;^([^)]*) ;;' -e 's;^;('"$env"') ;')"
  export KUBECONFIG="$cfgPath"
}
In the early 1990s I ran a math department's 4 servers and 50 workstations and (with a few exceptions) only ever did administrative actions through scripts.
I've worked in lots of places since and the world's matured from scripts and rsync to ansible and puppet and similar.
Have we regressed to the point where we've turned big clusters of systems back into "oops I ran a command as superuser in the wrong directory" ?
Someone here showed me this cool technique with `fzf`:
#!/usr/bin/env bash
set -e
# list context names and pick one with fzf; note the preview switches the
# live context as you scroll, which is what makes this destructive
context=$(kubectl config get-contexts -o name | fzf --preview 'kubectl config use-context {} && kubectl get namespaces')
kubectl config use-context "$context"
You get a two-pane window with the context names on the left and the namespaces on the right. That's all I need to find what I'm looking at. It's destructive, though: the preview actually switches your current context as you scroll.
Have been burnt by this; I have to deal with close to 8 clusters, and it is very easy to make a mistake.
Would highly recommend kubie, it allows you to switch and shows you the name of the cluster in the prompt. It's probably a more visual way of solving the same problem.
It also solves a problem many of the other solutions here miss: the prompt is printed once and so it can easily be showing stale information if you change the current context in another shell.
With kubie entering a context copies the configuration to a new file and sets KUBECONFIG appropriately, so it is not affected by changes in another shell.
I toyed with the idea of having a kubeconfig per cluster some time ago, but I work with 10s of clusters on a daily basis (often with multiple terminals targeting the same cluster) and having to auth every single time would have been too much of a pain.
Instead I went with kubeswitch which still gives you a different kubeconfig per terminal but allows you to re-use existing sessions.
whether a reauth is necessary depends on your k8s setup
a lot of the cloud ones only configure kubeconfig to call an external command, which can share auth state between terminals
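for example (EKS, with a made-up cluster name), the cloud CLI writes a kubeconfig whose user entry just execs "aws eks get-token", so any terminal with the same AWS session gets credentials without re-authenticating:

aws eks update-kubeconfig --name my-cluster --kubeconfig ~/.kube/config.my-cluster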
I also have it in my zsh config, but that didn’t stop me from screwing up in the past. Having an active confirmation prompt for potentially risky commands is what works best for me
Hah! I accidentally deleted a production deployment the other day, because I thought I was mucking with my local Colima Kubernetes cluster. I forgot that I had my context set to one of my AWS clusters. I had been meaning to write a command to wrap helm and kubectl to prompt me with info before committing, so I will have to take a peek at this.
i added the following to my bashrc a few days ago for similar reasons; this forces me to be explicit about the cluster; now i mess up the wrong namespace instead :)
if [[ -e "/opt/homebrew/bin/kubectl" ]]; then
  # clear the current context so every kubectl call has to name one explicitly
  /opt/homebrew/bin/kubectl config unset current-context >/dev/null
fi
I am not trying to shit on this, sorry - but can't you achieve the same thing with rudimentary automation, and barring that, rudimentary scripting? This seems to just be adding y/n prompts to certain contexts. How's that different than a bash wrapper script that does something like this?
Thanks for the feedback John! You're right, that's pretty much it :)
I developed kubesafe because (1) I was tired of tinkering with shell aliases and scripts (especially when I wanted to define protected commands) and (2) I needed something that worked smoothly with all Kubernetes tools like kubectl, helm, kubecolor, etc.
Kubesafe is just a convenient way to manage protected commands and contexts. Nothing too fancy!
Thanks Robert! Yes, you can achieve this with ACLs in Kubernetes, but it requires setting up multiple Roles and contexts. Even then, you might accidentally switch to a higher-permission Role and run a risky command, thinking you're in a different cluster or using a low-permission user.
Kubesafe is just an extra safety net to prevent those kinds of accidents :)
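For the curious, a rough sketch of that RBAC route (user, cluster, and context names are made up): bind the built-in read-only "view" ClusterRole and keep a separate low-privilege context for day-to-day use:

# bind a user to the built-in read-only "view" ClusterRole
kubectl create clusterrolebinding alice-view --clusterrole=view --user=alice

# keep a separate low-privilege context for day-to-day work
kubectl config set-context prod-readonly --cluster=prod --user=alice
kubectl config use-context prod-readonly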
I think it’s a tradeoff between safety and speed. Having only the CI/CD with production access can significantly slow you down, especially in the early stages when you’re focused on the product and still building out your tooling/infrastructure.