I know a bio scientist who spent two months working on containers and Docker and whatnot, for what are essentially independent shell scripts best run as batch jobs. I spoke with him at length, and he realized at the end of the day that what he really needed was a better understanding of standard *nix processes, not containers...
- automated SSL
- centralized logging
- super easy scaling up and down (going from 1 instance to 2 instances is a headache manually)
- centralized auth (have a service which doesn't have built-in auth?)
- super easy recovery (containers recreated, volumes attached automatically; apps just work, unless you have a DB-type application which shouldn't just be restarted, which is rare)
- smooth CI/CD (Gogs/Jenkins/private Docker image repository)
As for the "execute query" example, why is it such a headache? I just `kubectl exec -it <pod> -- sh` and I'm in.
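For the record, the built-in way to get a shell in a running pod is `kubectl exec` (there's no `ssh` subcommand in stock kubectl). A sketch, with pod and command names as placeholders:

```shell
# Open an interactive shell in a running pod ("my-app-5f7d9" is illustrative)
kubectl exec -it my-app-5f7d9 -- /bin/sh

# Or run a one-off command without an interactive session
kubectl exec my-db-0 -- psql -U app -c 'SELECT count(*) FROM jobs;'
```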
> I know a bio scientist who spent
Super obscure example. Kubernetes is definitely not for non-tech people. And I didn't pick k8s overnight. Spent a year doing playful POCs before deploying in a production environment.
If the only thing that's stopping you from using k8s is the learning curve, I suggest you go ahead and learn it. It's a huge boon.
I question whether it is necessary to hide my operational needs behind a behemoth of complexity like Kubernetes. The list of conveniences you mentioned sounds like magic you get for free from Kubernetes. What if there is a problem with any of them?
Auth failures? Or worse, failure of the auth system itself?
Easy recovery? What if there are failures to checkpoint/back up/snapshot containers?
CI/CD is good regardless of whether you use Kubernetes or not.
EDIT: The question is, if you have any of these problems, why is it better to get your head into how Kubernetes deals with those operations and tools rather than dealing with well-defined Unix tools that specialise in doing these jobs? syslog instead of however Kubernetes gathers logs; reading the FreeIPA docs instead of the Kubernetes auth system's docs?
My point is that, to deal with all of the conveniences you mentioned, you need to know their details anyway. Why rely on the Kubernetes abstraction if there is no such need?
(I'm not trying to be snarky. I'm genuinely curious why you think it is a good idea. If you convince me otherwise, perhaps I'll start adopting Kubernetes as well.)
I run my cluster (I'm a sysadmin) with:
- a couple of OpenBSD servers that run redundant DNS and DHCP
- a CentOS 7 box that runs FreeIPA as central auth
- an OpenBSD server that acts as a public-facing SSH server
- about 20 nodes, all provisioned using kickstart files, and then configured using Ansible. They run databases, web servers, batch jobs, Git, etc.
- a single server that runs an ELK stack for log analysis
- a single server that runs Prometheus for quick aggregated monitoring
Do you think I should switch over to Kubernetes for any benefits?
I'm a sysadmin who manages thousands of bare metal machines (a touch less than 10,000 Linux boxes). We have gotten to a point in some areas where you can't linearly scale out the app operations teams by hiring more employees, so we started looking at container orchestration systems some time ago (I started working on Mesos about 3 years ago, before Kubernetes was any good). As a result, I got to understand the ecosystem and set of tools/methodology fairly well. Kelsey Hightower convinced me to switch from Mesos to Kubernetes in the hallway at the Monitorama conference a few years back. Best possible decision I could have made, in hindsight.
Kubernetes can't run all of our applications, but it solves a huge class of problems we were having in other areas. Simply moving from a large set of statically provisioned services to simple service discovery is life-changing for a lot of teams. Especially when they're struggling to accurately change 200 configs when a node a critical service was running on has a CPU fault and panics + reboots. Could we have done this without Kubernetes? Sure, but we wanted to get the teams thinking about better ways to solve their problems that involved automation vs. more manual messing around. Centralized logging? Already have that. Failure of an auth system? No different than without Kubernetes: you can use sssd to cache LDAP/Kerberos locally. Missing logs? No different than without Kubernetes, etc. For us, Kubernetes solves a LOT of our headaches. We can come up with a nice templated "pattern" for continuous delivery of a service and give that template to less technical teams, who find it wonderful. Oh, and we run it bare metal, on premise. It wasn't a decision we took lightly, but having used k8s in production for about 9 months, it was the right one for us.
> Kubernetes Is a Surprisingly Affordable Platform for Personal Projects
with a counter that
> They're Google wannabes who think in Google's scale but forget that it is utterly unnecessary in their case.
I would posit that at the point you have over a hundred (or a couple hundred) servers, "Google wannabes" applies much less and you have reason to use the Kubernetes features. But I wouldn't expect most personal projects to get anywhere near that threshold.
Hell, I bet the vast majority of personal projects happily sit on one server that meets all their needs, with room to grow on that server or a larger instance of it. Possibly a second one spun up occasionally for specialized processing needs until it's destroyed.
I run my personal web stuff in a docker container per application (running as a dedicated user per app via docker's --user=username:group), with iptables rules per user. Kubernetes would work, but is overkill for the 5 vhosts I run via a pretty stripped down Apache config.
This terminology is confusing to me as someone who's worked in the embedded space. In that field bare-metal implies not running an operating system. So does bare-metal Linux box mean you rent a box and stick a distro on it yourself? I feel like there could be more precise terminology used in that case...
Thousands of physical servers our company manages that aren't rented, but owned by us. Does that help?
Do you need to do rolling deployments? Boom. Kubernetes pays for itself right there. And that's just the tip of the iceberg. Do you need someone else to join you on ops duties? Boom. If they know Kubernetes, they can easily find their way around your system. Your bespoke system? They're stuck trying to grok your lack of documentation.
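To make the rolling-deployment point concrete, here's roughly what it looks like once a Deployment exists (deployment and image names below are made up):

```shell
# Roll out a new image; Kubernetes replaces pods gradually,
# respecting readiness probes, so traffic keeps flowing
kubectl set image deployment/webapp webapp=registry.example.com/webapp:v2

# Watch the rollout, and roll back if it goes sideways
kubectl rollout status deployment/webapp
kubectl rollout undo deployment/webapp
```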
It's a latency/bandwidth thing. The learning curve is the latency. The advantages it brings to the table is the bandwidth. As such, I think it's a great investment even for small businesses/teams, but it's not a good investment for a solo developer who doesn't know it, unless they want to be learning Kubernetes rather than building their project.
> Do you need to do rolling deployments? Boom. Kubernetes pays for itself right there.
No, you don’t need to do rolling deployments.
> Do you need someone else to join you on ops duties? Boom. If they know Kubernetes, they can easily find their way around your system.
Easier than ssh’ing in and running ps?
Don’t get me wrong, k8s is great if you need to manage a complex system. It’s just that most of the time you can choose not to build a complex system in the first place.
Have you ever been thrown into an unknown environment? SSH'ing where exactly? Oh but this runs at hosting partner X? That runs at hosting partner Y? Oh but this service is actually behind a reverse proxy running on box Z. Oh you need to RDP to a jump-server to SSH to those machines? Documentation? You mean these outdated Word documents and Excel sheets? And down the rabbit-hole you go. Fun!
And don't say that doesn't happen, I'm a freelance sysadmin - and am exactly in an environment like that right now, it's not the first, and won't be the last. To get a complete picture here, I needed 2 whole months for everything outside the Openshift clusters.
The stuff running on OpenShift was pretty simple: there is the master, here are your credentials - and that's it. The rest is pretty easy to figure out if you've worked with k8s/OpenShift. The biggest problem was knowing whether the app in question ran on OpenShift or somewhere else.
I’m not arguing that k8s is the wrong tool to manage that kind of complexity. I’m arguing that, in almost all cases, that kind of complexity is completely unwarranted.
In a way, it turns all sorts of things into simple algebra - ax + b. So time = scope / bandwidth + latency. If latency dominates, it takes longer. If bandwidth dominates, goes faster.
A unified interface for things like "find where this service is running". One standardised way to do it - even if Kubernetes were nothing more than "install log system x and auth service y and dns server z", there would be a lot of value in that. You talk about "well-defined unix tools", but IME the unix tools are a lot less well-defined than container tooling. For example, there are several different log systems that describe themselves as "syslog-compatible", but it's not at all clear what that means, and not every application that claims to work with "syslog" will work with every service that claims to be syslog-compatible.
> about 20 nodes, all provisioned using kickstart files, and then configured using Ansible. They run databases, web servers, batch jobs, Git, etc.
Did you face the same questions from other people when you adopted Kickstart? What were your answers then?
The setup you've described is probably 80% of the way to Kubernetes compared to a traditional sysadmin approach. Kubernetes will save you a bit of work by eliminating your "script for infrastructure changes like DNS & DHCP" (which I would suspect is one of the more error-prone parts of your setup? It tends to be, IME) and the need to manually allocate services to classes of hosts (Ansible playbooks tend to end up as "we want 3 hosts in this class that run x, z and w, and 2 hosts in this class that run v and y", whereas Kubernetes is more "here's a pool of hardware; service x requires 2GB of RAM and needs to run on 3 hosts, you sort it out"). Whether that's worth the cost of migrating what sounds like a fairly recently updated infrastructure setup is another question, though.
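That "pool of hardware, you sort it out" contract is expressed declaratively as resource requests on a Deployment. A minimal sketch (all names and numbers are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-x
spec:
  replicas: 3                # run 3 copies; the scheduler picks the nodes
  selector:
    matchLabels:
      app: service-x
  template:
    metadata:
      labels:
        app: service-x
    spec:
      containers:
      - name: service-x
        image: registry.example.com/service-x:1.0
        resources:
          requests:
            memory: "2Gi"    # "service x requires 2GB of RAM"
            cpu: "500m"
```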
Now personally, I'd rather kubectl get pods --all-namespaces, figure out where the log collector is, what's wrong with it, and fix it; instead, I'll probably still be reading your docs and trying to figure out where these things live in the time it would have taken me to fix it on a kube cluster.
> It works for you, but it's going to include a lot of toil when someone replaces you.
I was thinking that Ansible + FreeIPA (a Red Hat enterprise product) + an Elastic logging setup (ELK) + Prometheus would be easier for my successor to deal with than figuring out my bespoke setup of Kubernetes (which keeps adding new features every so often). Even if I did not create proper docs (I do my best to document whatever I do), my successor would be better off relying on Red Hat's specific documentation rather than guessing what I did to a Kubernetes version from 6 months ago...
If something breaks in FreeIPA or Unbound (DNS) or Ansible, it is much easier to ask targeted questions on StackOverflow or look up their appropriate manuals. They don't change as often as Kubernetes does. Don't you think?
Alternatively, if something breaks on Kubernetes, you'd have to start digging into Kubernetes' implementation of whatever feature it is, and hope that the main project hasn't moved on to the next release.
Is it not the case? Is Kubernetes standard enough that its manuals are Red Hat quality, and is there always a direct way of figuring out what is wrong or what the configuration options are?
Here I was thinking that my successor would hate me if I built a bespoke Kubernetes cluster rather than using standard enterprise components such as the ones I listed above.
No worries. The fact you asked the question is a positive, even if we end up agreeing to disagree.
> I was thinking that Ansible + FreeIPA (a Red Hat enterprise product) + an Elastic logging setup (ELK) + Prometheus would be easier for my successor to deal with than figuring out my bespoke setup of Kubernetes (which keeps adding new features every so often). Even if I did not create proper docs (I do my best to document whatever I do), my successor would be better off relying on Red Hat's specific documentation rather than guessing what I did to a Kubernetes version from 6 months ago...
So both FreeIPA and ELK would be things we would install onto a kube cluster, which is rather what I was commenting on. When either piece of software has issues on a Kubernetes cluster, I can trivially use Kubernetes to find, exec into and repair them. I know how they run (they're Kubernetes pods), and I can see the spec of how they run from the Kubernetes manifest and Dockerfile. I know where to look for these things, because everything runs the same way in Kubernetes. If you've used an upstream chart, as in the case of Prometheus, even better.
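The find/exec/repair loop described here uses the same few commands regardless of what's actually inside the pod (pod and namespace names below are placeholders):

```shell
# Find the misbehaving pod, wherever the scheduler put it
kubectl get pods --all-namespaces -o wide | grep elastic

# Inspect why it's unhappy
kubectl describe pod elasticsearch-0 -n logging
kubectl logs elasticsearch-0 -n logging --previous

# Get a shell inside it to poke around
kubectl exec -it elasticsearch-0 -n logging -- bash
```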
For things that aren't trivial, we still need to learn how the software works. All Kubernetes solves is me figuring out how you've hosted these things, which can be done either well or poorly, and documented either well or poorly; with kube, it's largely just an API object you can look at.
> Is it not the case? Is Kubernetes standard enough that their manuals are RedHat quality and is there always a direct way of figuring out what is wrong or what the configuration options are?
Redhat sells kubernetes. They call it openshift. The docs are well written in my opinion.
The bigger picture is: if you're running a Kubernetes cluster, you run the Kubernetes cluster. You should be an expert in this, much the same way you need to be an expert in Chef and Puppet. This isn't the useful part of the stack; the useful part is running apps, and that is where Kubernetes makes things easier. Assembling your own bespoke Kubernetes is a different thing. Use a managed service if you're a small org, and a popular/standard solution if you're building it yourself.
Reading through your response already showed that at least some of my understanding of Kubernetes was wrong and that I need to look into it further. I was assuming that Kubernetes would encompass the auth provider, logging provider and such. Honestly, it drew a parallel to systemd in my mind, trying to be this "I do everything" mess. The one-day workshop I attended at Google gave me that impression, as it involved setting up an API server, ingress controller, logging container (something to do with StackDriver, which Google had internally), and more, just for running a hello-world application. That formed my opinion that it has more moving parts than necessary.
If there is a minimal abstraction of Kubernetes, that just orchestrates the operation of my standard components(FreeIPA, nginx, Postgres, Git, batch compute nodes), then it is different than what I saw it to be.
> if you're running a kubernetes cluster, you run the kubernetes cluster. You should be an expert in this, much the same way you need to be an expert in chef and puppet.
I think that is the key. At the end of the day, it becomes a value proposition. If I run all my components manually, I need to babysit them in operation. Kubernetes could take care of some of the babysitting, but the rest of the time, I need to be a Kubernetes expert to babysit Kubernetes itself. I need to decide whether the convenience of running everything as containers from yaml files is worth the complexity of becoming an expert at Kubernetes, and the added moving parts (API server, etcd, etc.).
I will play with Kubernetes in my spare time on spare machines to make such a call myself. Thanks for sharing your experience.
Ingress controllers and log shippers are pluggable. I'd say most stacks are going to want both, so it makes sense for an MSP to provide these, but you can roll your own; if you do, you install the upstream components the same way you run anything else on the cluster. Basically, you're dogfooding everything once you get past the basic distributed scheduler components.
> I think that is the key. End of the day, it becomes a value proposition. If I run all my components manually, I need to babysit them in operation. Kubernetes could take care of some of the babysitting, but the rest of times, I need to be a Kubernetes expert to babysit Kubernetes itself.
So it depends on what you do. Most of us have an end goal of delivering some sort of product, and thus we don't really need to run the clusters ourselves, much the same way we don't run the underlying AWS components.
Personally, I run them and find them quite reliable, so I get babysitting at a low cost, but I also know how to debug a cluster and how to use one as a developer. I don't think running the clusters is for everyone, but if running apps is what you do for a living, a solution like this has value at all levels of the stack. Once you have the problem of running all sorts of apps for various dev teams, a platform will really shine, and once you have the knowledge, you probably won't go back to gluing solutions together.
Openshift is something Redhat is pushing heavily, and the documentation is extremely good. Also, several of the openshift devs frequently comment here on HN.
Kube was used to enforce standards, and make things supportable by a wider number of people, which is a weird thing to say since barely anyone used it at the time.
This was a large success though, as we only really needed to train people to use kubernetes once, and then it was over to worrying about the actual applications.
Your average guy with unix skills who works on small deploys probably doesn't want to learn Kubernetes; once they work on larger deployments, they tend to convert, in my experience.
If your entry point is knowing Kubernetes really well but not knowing traditional stacks as well, the Kubernetes stack will probably make a lot more sense to you. Naturally, if you've never touched k8s, it will be a lot more confusing to troubleshoot a multi-tier app cluster on it than to troubleshoot on a traditional setup, even if you have to read the docs on the former first.
Obviously the abstraction has a lot of advantages; there are reasons people are moving towards it, but as in the case of this blogpost, I think a lot of people are using it as an entry point/shortcut for 'learning' systems that are more complex than they realize. That's not a bad thing; again, abstraction is great for many things. Hopefully, with time, they'll learn to troubleshoot the underpinnings as they get deeper and deeper into the tech.
Personally, I've used both, but many of the traditional stacks are manually configured blobs with abstractions in the wrong place, and I'm happy that your average sysadmin no longer needs to build these.
As a former maintainer of the saltstack config management software (and still occasional contributor): https://github.com/saltstack/salt/commits?author=SEJeff
I find that kubernetes is a better fit for a lot of the applications which previously had dedicated hardware that ran at 0.00% utilization all day. It is also a much better user experience for developers as it was previously along the lines of:
1. User contacts linux team and asks for a server to run say a webapp but it needs redis
2. Linux team sees the ticket and realizes there is no hardware to run dedicated app, so they ask the Datacenter team to get a server.
3. There are no unused servers in the warehouse, so the Datacenter team gets purchasing to send a PO to the server vendor, and in a week we have a new server.
4. The Datacenter team racks the server, and goes back and forth with the Linux team until the network config is correct from a physical standpoint.
5. The Linux team builds out the box and configures the services the user requests, with config management and some fancy tunables for that user.
6. The user gets the server, only to find a few things are misconfigured that they don't have access to change, and we go back to step 5.
7. The app works, but the user has to ask the Linux team to update internal DNS for their new service (adding a CNAME generally).
The process now involves:
1. User creates container images of their apps and pushes to our internal registry.
2. User looks at the copious example application templates and makes their app run in kubernetes.
3. Their app runs, and if they have questions, the Linux team helps them with specific requirements. They get automatic DNS via CoreDNS, which runs in every Kubernetes cluster and creates DNS records for each service that runs.
4. They spend more time prototyping new software than they do worrying about deployment.
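The "automatic DNS via CoreDNS" mentioned above follows the standard cluster DNS naming scheme, so a service gets a predictable name without anyone filing a ticket (service and namespace names here are hypothetical):

```
# A Service named "redis" in namespace "team-a" is resolvable cluster-wide as:
#   redis.team-a.svc.cluster.local
# so an app's config can simply point at it:
REDIS_HOST=redis.team-a.svc.cluster.local
```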
K8s definitely helps with better resource usage, and it can make deployment much more straightforward, but it doesn't abstract away the need to think about capacity and provision hardware.
I'm not pretending I'm not conflating things in this example, but for us, it was a way to solve a lot of "legacy problems" by forcing a move to a new paradigm. So far, everyone loves it (with a few exceptions!)
> kubectl get pods --all-namespaces
so familiarity is not a good argument here
Even that question is too broadly posed. The question is, is learning it in addition to the previously-standard tools useful?
 In the context of the article, this might be "instead of", which presents a much lower bar.
Neither one is obvious to a sysadmin who hasn't worked in either world.
Kubernetes allows me to express myself concisely and effectively on the level of complexity I will eventually always encounter in the job, without the particularities of certain tools or manual execution of certain tasks, so for me the complexity is less with Kubernetes.
The author answers this: learning how to stitch together a bunch of disparate unix tools into a smoothly operating system requires a whole lot of professional experience. The author claims that k8s requires the same amount of experience but scales; personally, I think he's being too charitable (or, to take a different perspective, he undervalues the difficulty of being a traditional sysadmin). Perhaps he meant "for a single machine". In my opinion, once you extend beyond one machine, k8s is quite a lot simpler.
Can you go touch your boxes? Who can add a new one, and how long does it take?
As I understand it, k8s is kind of designed for renting a bunch of AWS boxes and just having "my" cluster look like it operates separately (the Traefik reverse proxy comes to mind as something k8s should do).
I speak as a complete K8s novice.
> Can you go touch your boxes? Who can add a new one and how long.
This is surprisingly not that long. I can order new machines and they are delivered for me to use in just 2 weeks. With a little bit of planning, I can squeeze a lot of performance out of them. For example, I can avoid noisy neighbours because I can control what is deployed where physically.
Another benefit of non-cloud is that I can customise the machines for their purpose. I recently built a FreeBSD/ZFS-based server and chose a cheaper CPU with a lot of fast RAM instead. For DNS servers, queue servers and such, I chose CPUs with higher single-threaded performance at higher clock rates over ones with more cores and slower clock rates.
I know which price is more expensive :-)
Our workload is not strictly attached to physical resources. So we have our updates scheduled for every week, with the condition that updates run when there are no user tasks scheduled on the nodes.
The other nodes that serve a specific purpose - like database servers, app servers etc, get updated regularly for security updates as & when they are available and necessary, and checked for version upgrades (with scheduled downtime) once every 4 months.
It's slow as hell.
> I had an excellent time working with kubernetes
I'm happy for you, seriously, and I'm not claiming general validity of my experience. What I am disputing is the blog post (which does claim general validity).
Source: I am also using Kubernetes in production, migrating off it soon.
The largest reason is cost. My small deployment (3 nodes) runs around $100/mo on AWS (that's my app, nginx, Redis, and Postgres).
It doesn't even need 3 nodes; I don't recall whether 3 is the minimum, but realistically I only need one (for now). For larger projects this is probably a non-issue.
Second largest reason, is that really I have no idea what is going on on these nodes, and I probably never will. Magically my services run when I have the correct configurations. Not to say that's always a bad thing, but I've found it difficult to determine the default level of security as a result of this.
A third reason is the learning curve. This is less of an issue because I've invested the time to learn already. But like, the first time I tried to get traffic ingressed was painful.
As to what I'm moving to, I migrated one of my websites to a simple rsync + docker-compose setup and am pretty happy with it. In the past, I ran ansible on a 50 node cluster and it worked really well.
I suspect we're going to continue to see layers built on top of Kubernetes, such as GitLab's Auto DevOps, where for the simple things you don't need to write any of that, but you can still inspect what they create and jump off the rails for things that are a bit more complex.
> Spent a year
> go ahead and learn it. Its a huge boon.
For a greenfield deployment with a 1-year R&D budget for devops, sure, K8s is perhaps a great choice. But for the rest of us (even in tech)?
I mean, sure, kubernetes _is_ complex. But it's complex in the way that using Rails is more complex than pure Ruby. It's fine when you're playing, or you have a weird subset of problems where you don't care whether something runs or not; but as soon as you deal with the problems of HA, DR, live deployments, secret management, sharing servers, working in a team, etc., then if you're solving those issues with Ansible and Terraform, you're probably just inventing something as complex, and worse, bespoke to your problem.
At the end of the day, once you've learned to use kube, it's probably no worse than any other configuration-as-code system, and the question is which of us should be learning it and when, and which of us should just be using higher levels of abstraction like GitLab's Auto DevOps.
Now, indeed, if you're just hacking and you have no need for a CI and CD pipeline, or reproducibility, etc, then sure, it's probably not the time to learn a hosting framework, but once you do know it, you're free to make an informed opinion on your problem.
Personally, I sell kube a lot, but I tend to work for orgs that have several development teams and several apps, and bringing them to a platform seems like a good idea. The question is, should I also put a startup on GKE, or should I do something bespoke where they can't hire another contractor to take over as easily? Personally, I'd go GKE.
GKE takes a lot of the load off by making it Google's problem. The possible downside (which may or may not matter depending on use case) is that you're tied to what Google's control plane provides.
So if you want a shiny admission controller, you're out of luck unless they support it.
But for basic workloads, it's likely to work well and their management solves a lot of vanilla k8s pain points.
The number of hours you have to spend learning how k8s works just to set it up properly from scratch will far exceed the R&D and operational cost of a less sophisticated small project set up from scratch.
Pay for managed services, or do it as simple as possible.
One of its main functions is to eliminate the complexity of getting started with automated software delivery, by automatically setting up the pipeline and necessary integrations and freeing you up to focus on the culture part. That means everyone can skip the manual configuration work and focus on the creative and human aspects of software creation.
Here's the doc with more info about it: https://docs.gitlab.com/ee/topics/autodevops/
Jobs actually run in their own Kubernetes pods, and since this is separate from the master, it goes a long way toward something that resembles a secure SCM, CI and CD.
I don't work for GitLab, just someone who steals their hard work and sells it to clients. :)
If you have one customer who needs it once a week, you add this find-grep-awk script to xinetd and set up a PHP page with a couple of fields that set the arguments for the request.
If you have a million customers per hour, you set up a bunch of terabyte-RAM servers with a realtime Scala pipeline, and hire a team to tweak it day and night.
Because those are two very different problems. And the worst thing to do is to try to solve problem X with tools for problem Y.
If you want to batch on more than lines, use sponge, or write another few lines of mawk/perl/whatever.
Those are limited examples, and may not always be The Right Way (tm), but there are certainly easy, old, simple ways to take shell pipelines and make the data available quickly in some other store.
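As one hedged illustration of those "easy, old, simple ways": a few lines of awk turn a raw access log into something queryable, no orchestrator involved (the log format and file names here are made up):

```shell
# Create a tiny sample access log (in practice this is your real log file)
printf '%s\n' \
  '1.2.3.4 - - [01/Jan/2020] "GET /a HTTP/1.1" 200 10' \
  '1.2.3.5 - - [01/Jan/2020] "GET /b HTTP/1.1" 200 10' \
  '1.2.3.6 - - [01/Jan/2020] "GET /a HTTP/1.1" 404 12' \
  > access.log

# Count hits per request path ($6 is the path field) and sort, busiest first
awk '{ count[$6]++ } END { for (p in count) print count[p], p }' access.log \
  | sort -rn > top_paths.txt

head -n 1 top_paths.txt   # 2 /a
```

From there, feeding `top_paths.txt` into a database, a cron-emailed report, or an xinetd-backed endpoint is a one-liner each.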
Over the years you easily end up with a pile of these projects, which you want to keep running mostly for fun, and therefore want to minimize the associated fixed costs. Using containers may help in keeping the herd manageable so that you can, for example, move everything to some other hosting provider.
Interestingly, while the author compares k8s to an SQL database, they actually do not deploy a DB. It's all fun and giggles until you deploy a DB.
Or, at worst, the shortcut to understanding the underlying tech. But isn't that the point of abstraction?
and the people who, when they show up at their Google interview, nail the part about scaling. Not that Google ever asked me how to scale a project, but it doesn't hurt learning things just for the sake of learning.
Meaning they _unnecessarily_ think in Google's scale when there is no need for it.
Having interviewed many tens of engineers at Amazon - no, not really.
The 5-15 issue is a self-inflicted wound k8s brings to the party that is quite interesting to work around.
Most real scalability issues exist on lower levels and require more theoretical thinking than experimenting with tools.
This statement strikes me as likely to be true, at least for database non-experts.
It was certainly true for me. I didn't have anything approaching a decent understanding of query performance until I discovered PostgreSQL's EXPLAIN and, perhaps more importantly, EXPLAIN ANALYZE.
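For anyone who hasn't run into them: EXPLAIN prints the planner's chosen plan without executing the query, while EXPLAIN ANALYZE actually runs it and reports real timings and row counts per plan node. A sketch with a made-up table:

```sql
-- Plan only: no execution, safe to run against production
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Executes the query and reports actual time and rows for each plan node
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- A Seq Scan on a large table here is the classic hint that an index is missing:
-- CREATE INDEX ON orders (customer_id);
```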