This. Time and again. Far too many people adopt complicated stuff like Kubernetes for what is essentially a couple of web servers and a database. They're Google wannabes who think at Google's scale but forget that it is utterly unnecessary in their case.
I know a bio scientist who spent two months working on containers and Docker and whatnot, for what are essentially independent shell scripts that are best run as batch jobs. I spoke with him at length, and he realized at the end of the day that what he really needed was a better understanding of standard *nix processes and not containers...
I had an excellent time working with kubernetes, and I am practically a one-person company. Kubernetes frees my mind from so many things that I now hate working without it. A few of those things:
- automated ssl
- centralized logging
- super easy scaling up and down (going from 1 instance to 2 instances is a headache manually)
- centralized auth (have a service which doesn't have built-in auth?)
- super easy recovery (containers recreated/volumes attached automatically/apps just work unless you have db type of application which shouldn't be just restarted, which is rare)
As for the "execute query" example, why is it such a headache? I just "kubectl exec -it <pod> -- sh" and I am in.
> I know a bio scientist who spent
Super obscure example. Kubernetes is definitely not for non-tech people. And I didn't pick up k8s overnight. I spent a year doing playful POCs before deploying in a production environment.
If the only thing that's stopping you from using k8s is the learning curve, I suggest you go ahead and learn it. It's a huge boon.
I ask myself whether it is necessary to hide my operational needs behind a behemoth of complexity like Kubernetes. The list of conveniences you mentioned sounds like magic you get from Kubernetes. What if there is a problem with any of them?
Missing logs?
Inappropriate scaling?
Auth failures? Or worse, failure of the auth system?
Easy recovery? What if there were failures to checkpoint/backup/snapshot containers?
CI/CD is good regardless of whether you use Kubernetes or not.
EDIT: The question is, if you have any of these problems, why is it better to get your head into how Kubernetes deals with those operations & tools rather than dealing with well-defined unix tools that specialise in doing these jobs. syslog instead of however Kubernetes gathers the logs, reading FreeIPA docs instead of Kubernetes auth system logs?
My point is that to deal with all of the conveniences you mentioned, you need to know their details anyway. Why rely on the Kubernetes abstraction if there is no such need?
(I'm not trying to be snarky. I'm genuinely curious why you think it is a good idea. If you convince me otherwise, perhaps I would start adopting Kubernetes as well.)
I run my cluster (I'm a sysadmin) with:
a couple of OpenBSD servers that run redundant DNS and DHCP.
a CentOS 7 box that runs FreeIPA as central Auth.
an OpenBSD server that acts as public facing SSH server.
about 20 nodes, all provisioned using kickstart files, and then configured using Ansible. They run databases, web servers, batch jobs, Git, etc.
A single server that runs the ELK stack for log analysis.
A single server that runs Prometheus for quick aggregated monitoring.
Do you think I should switch over to Kubernetes for any benefits?
Sounds like what you have works, and Kubernetes might well not benefit you. With roughly 20 nodes, you have more or less 20 "pets" in devops speak and that sounds like an entirely sensible way to manage them. Contrasting with my problem...
I'm a sysadmin who manages thousands of bare metal machines (A touch less than 10,000 Linux boxes). We have gotten to a point in some areas where you can't linearly scale out the app operations teams by hiring more employees so we started looking at container orchestration systems some time ago (I started working on Mesos about 3 years ago before Kubernetes was any good). As a result, I got to understand the ecosystem and set of tools / methodology fairly well. Kelsey Hightower convinced me to switch from Mesos to Kubernetes in the hallway at the Monitorama conference a few years back. Best possible decision I could have made in hindsight.
Kubernetes can't run all of our applications, but it solves a huge class of problems we were having in other areas. Simply moving from a large set of statically provisioned services to simple service discovery is life-changing for a lot of teams, especially when they're struggling to accurately change 200 configs because a node that a critical service was running on has a CPU fault and panics + reboots. Could we have done this without Kubernetes? Sure, but we wanted to just get the teams thinking about better ways to solve their problems that involved automation vs more manual messing around. Centralized logging? Already have that. Failure of an auth system? No different than without Kubernetes; you can use sssd to cache LDAP / Kerberos locally. Missing logs? No different than without Kubernetes, etc. For us, Kubernetes solves a LOT of our headaches. We can come up with a nice templated "pattern" for continuous delivery of a service and give that template to less technical teams, who find it wonderful. Oh, and we run it on bare metal, on premises. It wasn't a decision we took lightly, but having used k8s in production for about 9 months, it was the right one for us.
> Kubernetes Is a Surprisingly Affordable Platform for Personal Projects
with a counter that
> They're Google wannabes who think at Google's scale but forget that it is utterly unnecessary in their case.
I would posit that at the point you have over a hundred (or a couple hundred) servers, "Google wannabes" applies much less and you have reason to use the Kubernetes features. But I wouldn't expect most personal projects to get anywhere near that threshold.
Hell, I bet the vast majority of personal projects happily sit on one server that meets all their needs, with room to grow on that server or on a larger instance of it. Possibly a second one is spun up occasionally for specialized processing needs until it's destroyed.
I won't use the term 'vast majority' to stay conservative, but many, many enterprise projects would happily work on one server (well, let's make it two identical servers, for redundancy and HA). You can get a 2U server with 1.5 TB of RAM, dozens of NVMe drives and tens of cores for really cheap nowadays.
I run my personal web stuff in a docker container per application (running as a dedicated user per app via docker's --user=username:group) with iptables rules per user. Kubernetes would work, but is overkill for the 5 vhosts I run via a pretty stripped down Apache config.
> thousands of bare metal machines (A touch less than 10,000 Linux boxes)
This terminology is confusing to me as someone who's worked in the embedded space. In that field bare-metal implies not running an operating system. So does bare-metal Linux box mean you rent a box and stick a distro on it yourself? I feel like there could be more precise terminology used in that case...
Bare-metal in this context means that you have physical hardware and you're responsible for making sure the system can boot and do the stuff you want, as opposed to what you'd have with a service like Amazon's EC2, where you're given a set of apps to configure and execute a virtual machine image. The distinction is made because the former scenario requires extra work for initial configuration (in terms of OS installation and physical networking and such) and you have the burden of setting up automation to handle scenarios where your OS installation is hosed, and much more.
As others have stated below, in this context you have bare metal and you have the cloud. You could also add in virtual machines, which run either on premises or in the cloud.
Thousands of physical servers our company manages that aren't rented, but owned by us. Does that help?
Yes. Thank you! If only industry marketing hadn't coined the terms, we might have: cloud -> provisioned server, bare-metal -> managed server or some other less context dependent terms.
Just as a curiosity, "managed server" in this area is already taken, and means that you get a sysadmin with the rented server (the server is managed for you, including installation and maintenance of the software you need on it). It is "higher level" than cloud servers, not "lower level" ;)
Do you need to do rolling deployments? Boom. Kubernetes pays for itself right there. And that's just the tip of the iceberg. Do you need someone else to join you on ops duties? Boom. If they know Kubernetes, they can easily find their way around your system. Your bespoke system? They're trying to grok your lack of documentation.
It's a latency/bandwidth thing. The learning curve is the latency. The advantages it brings to the table are the bandwidth. As such, I think it's a great investment even for small businesses/teams, but it's not a good investment for a solo developer who doesn't know it, unless they want to be learning Kubernetes rather than building their project.
> Do you need to do rolling deployments? Boom. Kubernetes pays for itself right there.
No, you don’t need to do rolling deployments.
> Do you need someone else to join you on ops duties? Boom. If they know Kubernetes, they can easily find their way around your system.
Easier than ssh’ing in and running ps?
Don’t get me wrong, k8s is great if you need to manage a complex system. It’s just that most of the time you can choose to not build a complex system in the first place.
Have you ever been thrown into an unknown environment? SSH'ing where exactly? Oh but this runs at hosting partner X? That runs at hosting partner Y? Oh but this service is actually behind a reverse proxy running on box Z. Oh you need to RDP to a jump-server to SSH to those machines? Documentation? You mean these outdated Word documents and Excel sheets? And down the rabbit-hole you go. Fun!
And don't say that doesn't happen, I'm a freelance sysadmin - and am exactly in an environment like that right now, it's not the first, and won't be the last. To get a complete picture here, I needed 2 whole months for everything outside the Openshift clusters.
The stuff running on openshift was pretty simple, there is the master, here are your credentials - and that's it. The rest is pretty easy to figure out if you've worked with k8s/openshift. The biggest problem was knowing if the app in question ran on Openshift or somewhere else.
Yes, you’re describing pretty much every environment I’ve ever worked in.
I’m not arguing that k8s is the wrong tool to manage that kind of complexity. I’m arguing that, in almost all cases, that kind of complexity is completely unwarranted.
Do it! I use the latency/bandwidth thing as an intellectual model for all sorts of surprising situations. It's deeply embedded in my frame of reference for looking at the world, especially for doing any sort of work.
In a way, it turns all sorts of things into simple algebra - ax + b. So time = scope / bandwidth + latency. If latency dominates, it takes longer. If bandwidth dominates, it goes faster.
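To make that formula concrete, here's a minimal sketch in Python. All the numbers are made up for illustration (arbitrary units of work, weeks of learning curve), and `total_time` / `break_even_scope` are just names I picked, not anything standard.

```python
def total_time(scope: float, bandwidth: float, latency: float) -> float:
    """time = scope / bandwidth + latency (the ax + b model)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return scope / bandwidth + latency


def break_even_scope(lat_a: float, bw_a: float, lat_b: float, bw_b: float) -> float:
    """Scope at which option A and option B take the same total time.

    Solves lat_a + s / bw_a == lat_b + s / bw_b for s.
    """
    slope_diff = 1 / bw_a - 1 / bw_b
    if slope_diff == 0:
        raise ValueError("equal bandwidths: the lines never cross")
    return (lat_b - lat_a) / slope_diff


if __name__ == "__main__":
    # Hypothetical: familiar tooling has no learning latency but half the bandwidth.
    familiar = dict(bandwidth=1.0, latency=0.0)
    new_tool = dict(bandwidth=2.0, latency=8.0)
    for scope in (10, 40):
        print(scope, total_time(scope, **familiar), total_time(scope, **new_tool))
    print("break-even scope:", break_even_scope(0.0, 1.0, 8.0, 2.0))
```

With these invented numbers the small project is faster on the familiar tooling and the large one is faster on the new tool, which is the whole argument in two lines of arithmetic.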
> The question is, if you have any of these problems, why is it better to get your head into how Kubernetes deals with those operations & tools rather than dealing with well-defined unix tools that specialise in doing these jobs. syslog instead of however Kubernetes gathers the logs, reading FreeIPA docs instead of Kubernetes auth system logs?
A unified interface for things like "find where this service is running". One standardised way to do it - even if Kubernetes were nothing more than an "install log system x and auth service y and dns server z" there would be a lot of value in that. You talk about "well-defined unix tools" but IME the unix tools are a lot less well-defined than container tooling - e.g. there are several different log systems that describe themselves as "syslog-compatible" but it's not at all clear what that means, and not every application that claims to work with "syslog" will work with every service that claims to be syslog-compatible.
> about 20 nodes, all provisioned using kickstart files, and then configured using Ansible. They run databases, web servers, batch jobs, Git, etc.
Did you face the same questions from other people when you adopted Kickstart? What were your answers then?
The setup you've described is probably 80% of the way to Kubernetes compared to a traditional sysadmin approach. Kubernetes will save you a bit of work in terms of eliminating your "script for infrastructure changes like DNS & DHCP" (which I would suspect is one of the more error-prone parts of your setup? It tends to be, IME) and the need to manually allocate services to classes of hosts (ansible playbooks tend to end up as "we want 3 hosts in this class that run x, z and w, and 2 hosts in this class that run v and y", whereas kubernetes is more "here's a pool of hardware, service x requires 2GB of RAM and needs to run on 3 hosts, you sort it out"). Whether that's worth the cost of migrating what sounds like a fairly recently updated infrastructure setup is another question though.
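To illustrate the "here's a pool of hardware, you sort it out" idea, here's a toy sketch in Python. It is emphatically not how the real Kubernetes scheduler works; it's a greedy placement I made up, and `Node`, `place` and the node names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_ram_gb: float
    services: list = field(default_factory=list)


def place(nodes, service, ram_gb, replicas):
    """Put `replicas` copies of `service` on distinct nodes with enough free RAM.

    Greedy toy policy: prefer the nodes with the most free RAM.
    Raises RuntimeError (and changes nothing) if the pool cannot fit the request.
    """
    candidates = sorted(
        (n for n in nodes if n.free_ram_gb >= ram_gb and service not in n.services),
        key=lambda n: n.free_ram_gb,
        reverse=True,
    )
    if len(candidates) < replicas:
        raise RuntimeError(f"cannot place {replicas} x {service} ({ram_gb} GB each)")
    chosen = candidates[:replicas]
    for node in chosen:
        node.free_ram_gb -= ram_gb
        node.services.append(service)
    return [node.name for node in chosen]


if __name__ == "__main__":
    pool = [Node("a", 4), Node("b", 8), Node("c", 2), Node("d", 1)]
    # "service x requires 2GB of RAM and needs to run on 3 hosts"
    print(place(pool, "x", ram_gb=2, replicas=3))
```

The point is only the shape of the input: you state requirements per service, not a hand-maintained mapping of services to host classes.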
Cool. You have something bespoke. It works for you, but it's going to include a lot of toil when someone replaces you.
Now personally, I'd rather kubectl get pods --all-namespaces, figure out where the log collector is, what's wrong with it, and fix it, but instead I'm probably going to be reading your docs and trying to figure out where these things are by the time I've fixed it on a kube cluster.
I'm not sure I understand. Sorry, my exposure to Kubernetes is only a few days and is limited to an overview of all of its components and a workshop by Google.
> It works for you, but it's going to include a lot of toil when someone replaces you.
I was thinking that Ansible + FreeIPA(RedHat enterprise product) + Elastic logging setup(ELK) + Prometheus would be easier for my successor to deal with than figuring out my bespoke setup of Kubernetes(which keeps adding new features every so often). Even if I did not create proper docs(I do my best to have good doc of whatever I do), my successor would be better off relying on RedHat's specific documentation rather than guessing what I did to a Kubernetes version from 6 months ago...
If something breaks in FreeIPA or Unbound(DNS) or Ansible, it is very much easier to ask targeted questions on StackOverflow or look up their appropriate manuals. They don't change as often as Kubernetes does. Don't you think?
Alternatively, if something breaks on Kubernetes, you'd have to start digging into Kubernetes' implementation of whatever feature it is, and hope that the main product hasn't moved on to the next updated release.
Is it not the case? Is Kubernetes standard enough that their manuals are RedHat quality and is there always a direct way of figuring out what is wrong or what the configuration options are?
Here I was thinking that my successor would hate me if I built a bespoke Kubernetes cluster rather than standard enterprise components such as the ones I listed above.
> I'm not sure I understand. Sorry, my exposure to Kubernetes is only a few days and is limited to an overview of all of its components and a workshop by Google.
No worries. The fact you asked the question is a positive, even if we end up agreeing to disagree.
> I was thinking that Ansible + FreeIPA(RedHat enterprise product) + Elastic logging setup(ELK) + Prometheus would be easier for my successor to deal with than figuring out my bespoke setup of Kubernetes(which keeps adding new features every so often). Even if I did not create proper docs(I do my best to have good doc of whatever I do), my successor would be better off relying on RedHat's specific documentation rather than guessing what I did to a Kubernetes version from 6 months ago...
So both FreeIPA and ELK would be things we would install onto a kube cluster, which is rather what I was commenting about. When either piece of software has issues on a kubernetes cluster, I can trivially use kubernetes to find, exec and repair these. I know how they run (they're kubernetes pods) and I can see the spec of how they run based on the kubernetes manifest and Dockerfile. I know where to look for these things, because everything runs in the same way in kubernetes. If you've used an upstream chart, such as in the case of prometheus, even better.
For things that aren't trivial we still need to learn how the software works. All kubernetes is solving is me figuring out how you've hosted these things, which can be done either well or poorly, and documented either well or poorly, but with kube, it's largely just an API object you can look at.
> Is it not the case? Is Kubernetes standard enough that their manuals are RedHat quality and is there always a direct way of figuring out what is wrong or what the configuration options are?
Red Hat sells Kubernetes. They call it OpenShift. The docs are well written in my opinion.
Bigger picture is if you're running a kubernetes cluster, you run the kubernetes cluster. You should be an expert in this, much the same way you need to be an expert in chef and puppet. This isn't the useful part of the stack, the useful part is running apps. This is where kubernetes makes things easier. Assuming your bespoke kubernetes itself is a different thing. Use a Managed Service if you're a small org, and a popular/standard solution if you're building it yourself.
Reading through your response already showed that at least some of my understanding of Kubernetes was wrong and that I need to look into it further. I was assuming that Kubernetes would encompass the auth provider, logging provider and such. Honestly, it drew a parallel to systemd in my mind, trying to be this "I do everything" mess. The one day workshop I attended at Google gave me that impression as it involved setting up an api server, ingress controller, logging container (something to do with StackDriver that Google had internally), and more for running a hello-world application. That formed my opinion that it was too many moving parts than necessary.
If there is a minimal abstraction of Kubernetes that just orchestrates the operation of my standard components (FreeIPA, nginx, Postgres, Git, batch compute nodes), then it is different from what I saw it to be.
> if you're running a kubernetes cluster, you run the kubernetes cluster. You should be an expert in this, much the same way you need to be an expert in chef and puppet.
I think that is the key. End of the day, it becomes a value proposition. If I run all my components manually, I need to babysit them in operation. Kubernetes could take care of some of the babysitting, but the rest of times, I need to be a Kubernetes expert to babysit Kubernetes itself. I need to decide whether the convenience of running everything as containers from yaml files is worth the complexity of becoming an expert at Kubernetes, and the added moving parts(api server, etcd etc.).
I will play with Kubernetes in my spare time on spare machines to make such a call myself. Thanks for sharing your experience.
> That formed my opinion that it was too many moving parts than necessary.
Ingress controllers and log shippers are pluggable. I'd say most stacks are going to want both, so it makes sense for an MSP to provide these, but you can roll your own. You roll your own, and install the upstream components the same way you run anything on the cluster. Basically, you're dogfooding everything once you get past the basic distributed scheduler components.
> I think that is the key. End of the day, it becomes a value proposition. If I run all my components manually, I need to babysit them in operation. Kubernetes could take care of some of the babysitting, but the rest of times, I need to be a Kubernetes expert to babysit Kubernetes itself.
So it depends what you do. Most of us have an end goal of delivering some sort of product and thus we don't really need to run the clusters ourselves. Much the same way we don't run the underlying AWS components.
Personally, I run them and find them quite reliable, so I get babysitting at a low cost, but I also do know how to debug a cluster, and how to use one as a developer. I don't think running the clusters is for everyone, but if running apps is what you do for a living, a solution like this is a must for all levels of stacks. Once you have the problem of running all sorts of apps for various dev teams, a platform will really shine, and once you have the knowledge, you probably won't go back to gluing solutions together.
True. I cannot generalise like that. I was only thinking about RHEL and IDM (the RH version of FreeIPA) - that documentation is super thorough and very helpful IMO.
OpenShift is something Red Hat is pushing heavily, and the documentation is extremely good. Also, several of the OpenShift devs frequently comment here on HN.
Indeed, that's part of the point. I first started using kube at a UK Gov Org about 4 years ago. All the units of the org had different standards for hosting, they weren't written down, and the vendors would quote unreasonable sums of money knowing that their systems weren't really supportable by anyone without specific experience of the system.
Kube was used to enforce standards, and make things supportable by a wider number of people, which is a weird thing to say since barely anyone used it at the time.
This was a large success though, as we only really needed to train people to use kubernetes once, and then it was over to worrying about the actual applications.
That's not so bespoke; that's very much standard software, services and operating systems, which are very familiar to every UNIX sysadmin. It would be much easier to pass that system to a new sysadmin than a Kubernetes cluster. By far.
It's not that bespoke, but it is a bunch of building blocks glued together, rather than being a framework, and thus one needs to figure out which blocks you've glued together.
Your average guy with unix skills who works on small deploys probably doesn't want to learn kubernetes; once they work on larger deployments, they tend to convert in my experience.
I'm just learning Kubernetes now, but I've managed multi-thousand node VMware deployments, and a lot of these arguments seem to boil down to your entry point.
If your entry point is knowing Kubernetes really well but not knowing traditional stacks as well, the Kubernetes stack will probably make a lot more sense to you. Naturally, if you've never touched k8s, it will be a lot more confusing to troubleshoot a multi-tier app cluster on it than to troubleshoot on a traditional setup, even if you have to read the docs on the former first.
Obviously the abstraction has a lot of advantages; there are reasons people are moving towards it, but as in the case of this blogpost, I think a lot of people are using it as an entry point/shortcut for 'learning' systems that are more complex than they realize. That's not a bad thing; again, abstraction is great for many things. Hopefully, with time, they'll learn to troubleshoot the underpinnings as they get deeper and deeper into the tech.
The thing you know will always seem simpler than the thing you don't, but I think it's simply a convention versus configuration argument and where you want the abstractions.
Personally, I've used both, but many of the traditional stacks are manually configured blobs with abstractions in the wrong place and I'm happy that your average sysadmin no longer needs to build these.
I find that kubernetes is a better fit for a lot of the applications which previously had dedicated hardware that ran at 0.00% utilization all day. It is also a much better user experience for developers as it was previously along the lines of:
1. User contacts linux team and asks for a server to run say a webapp but it needs redis
2. Linux team sees the ticket and realizes there is no hardware to run a dedicated app, so they ask the Datacenter team to get a server.
3. There are no unused servers in the warehouse, so the Datacenter team gets purchasing to send a PO to the server vendor, and in a week we have a new server.
4. The Datacenter team racks the server, and goes back and forth with the Linux team until the network config is correct from a physical standpoint.
5. The Linux team builds out the box and configures the services the user requests, with config management and some fancy tunables for that user.
6. The user gets the server only to find a few things are misconfigured that they don't have access to change, and goes back to step 5.
7. The app works, but the user has to ask the Linux team to update internal DNS for their new service (adding a CNAME generally).
The process now involves:
1. User creates container images of their apps and pushes to our internal registry.
2. User looks at the copious example application templates and makes their app run in kubernetes.
4. Their app runs, but if they have questions, the Linux team will help them with specific requirements. They get automatic dns via coredns, which runs in every kubernetes cluster and creates dns for each service that runs.
5. They spend more time prototyping new software than they do worrying about deployment.
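A rough sketch of the naming convention behind that automatic DNS, in Python. It assumes the common default cluster domain `cluster.local` (a cluster can be configured with a different one); the helper name `service_fqdn` and the `redis` / `team-a` names are invented for illustration.

```python
def service_fqdn(service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS name for a Service.

    Follows the <service>.<namespace>.svc.<cluster-domain> convention.
    The default cluster_domain is an assumption, not a given for every cluster.
    """
    for label in (service, namespace):
        if not label or "." in label:
            raise ValueError(f"invalid DNS label: {label!r}")
    return f"{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # Hypothetical webapp looking up its redis Service in namespace "team-a"
    print(service_fqdn("redis", "team-a"))
```

So the webapp from the old step 1 would just point at that name for its redis, with nobody filing a ticket for a CNAME.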
Provisioning hardware still matters depending on requirements. I think you oversimplified a bit here. The user in your scenario probably needs to talk to the devops team and discuss expected resource needs unless it's really a tiny webapp that could have run on a shared server in the past anyway or in a VM.
K8s definitely helps with better resource usage and it can make deployment much more straightforward, but it doesn't abstract away the need to think about capacity and provision hardware.
We have LimitRange and ResourceQuotas set up per namespace in kubernetes. Each team talks with us about expected needs before we give them permission to log in to kubernetes and create a namespace for them to begin with. If a team needs more of either, we can up them. We have grafana dashboards of the cluster utilization and will proactively order extra servers to add capacity as needed. So far, so good!
I'm not pretending I'm not conflating things in this example, but for us, it was a way to solve a lot of "legacy problems" by forcing a move to a new paradigm. So far, everyone loves it (with a few exceptions!)
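For anyone who hasn't seen quotas in action, the idea can be sketched in a few lines of Python. This is a toy model of per-namespace admission, not the Kubernetes API; `NamespaceQuota` and the numbers are made up.

```python
class NamespaceQuota:
    """Toy per-namespace memory quota: new requests are rejected once the cap is hit."""

    def __init__(self, memory_limit_gb: float):
        self.memory_limit_gb = memory_limit_gb
        self.used_gb = 0.0

    def admit(self, request_gb: float) -> bool:
        """Return True and record the usage if the request fits, else False."""
        if request_gb < 0:
            raise ValueError("request must be non-negative")
        if self.used_gb + request_gb > self.memory_limit_gb:
            return False
        self.used_gb += request_gb
        return True


if __name__ == "__main__":
    team = NamespaceQuota(memory_limit_gb=8)  # hypothetical starting quota
    print(team.admit(4), team.admit(3), team.admit(2))
    team.memory_limit_gb = 16  # "if a team needs more, we can up them"
    print(team.admit(2))
```

The conversation about expected needs still happens; the quota just makes the agreed number enforceable.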
I never used Salt, but I've used Puppet extensively. Obviously you know virtualization and containerization existed before Docker and k8s, but you're simplifying and conflating a lot of stuff here.
But the situation could be the reverse: I replace someone who did a pile of k8s and didn't document anything, and while I could orchestrate the whole thing with libvirt recipes in my sleep, I wouldn't even know how to look up the command.
True. In order to know kubernetes, you need to know kubernetes. But the point is at least it's a framework rather than something you or I came up with and never wrote down. But if you don't know it, you don't know it. The question is, is learning it useful?
In my opinion, operations are always a behemoth of complexity.
Kubernetes allows me to express myself concisely and effectively on the level of complexity I will eventually always encounter in the job, without the particularities of certain tools or manual execution of certain tasks, so for me the complexity is less with Kubernetes.
> The question is, if you have any of these problems, why is it better to get your head into how Kubernetes deals with those operations & tools rather than dealing with well-defined unix tools that specialise in doing these jobs. syslog instead of however Kubernetes gathers the logs, reading FreeIPA docs instead of Kubernetes auth system logs?
The author answers this--learning how to stitch together a bunch of disparate unix tools into a smoothly-operating system requires a whole bunch of professional experience. The author claims that k8s requires the same amount of experience but scales; personally, I think he's being too charitable (or to take a different perspective, he undervalues the difficulty of being a traditional sysadmin). Perhaps he meant "for a single machine". In my opinion, once you extend beyond one machine, k8s is quite a lot simpler.
Can I ask if your nodes are "in the cloud" (ie AWS/rackspace) or in _your_ datacentre?
Can you go touch your boxes? Who can add a new one and how long.
As I understand it, k8s is kind of designed for renting a bunch of AWS boxes and just having "my" cluster look like it operates separately (the traefik router proxy comes to mind as something K8S should do)
Our infrastructure is in house - research group in a University setting. Public cloud would be an absurd option for our needs economically, considering that we run our servers practically 24/7 albeit with convenience of scheduled down times, and that our servers tend to live longer than even 4 years in service. For example, some of our compute nodes after 5 years of service are now servicing users as "console servers" for users doing scientific work in the command line. We run bare-metal servers for performance intensive workloads and databases, and KVM based virtual machines for other services.
> Can you go touch your boxes? Who can add a new one and how long.
This is surprisingly not that long. I can order new machines and they are delivered for me to use in just 2 weeks. With a little bit of planning, I can squeeze a lot of performance out of them. For example, I can avoid noisy neighbours because I can control what is deployed where physically.
Another benefit of non-cloud is that I can customise the machines for their purpose. I recently built a FreeBSD/ZFS based server and chose a cheaper CPU and a lot of faster RAM to deal with. For DNS servers, queue servers and such I chose a CPU that has a higher single threaded performance at higher clock rates than ones with more cores and slower clock rates.
I see - sadly for my personal projects I can either pay for cloud hosting, or I can pay for filling up our spare room with noisy servers, cat5 cabling and general chaos.
All nodes are bare-metal servers that run CentOS 7, and are configured strictly via Ansible. If a node experiences a hardware failure, we just pick another spare server and run our Ansible playbook on it + a script for infrastructure changes like DNS & DHCP.
Our workload is not strictly attached to physical resource. So, we have our updates scheduled for every week, with the condition that updates run when there are no user tasks scheduled on them.
The other nodes that serve a specific purpose - like database servers, app servers etc, get updated regularly for security updates as & when they are available and necessary, and checked for version upgrades (with scheduled downtime) once every 4 months.
> As for the "execute query" example, why is it such a headache? I just "kubectl exec -it <pod> -- sh" and I am in.
It's slow as hell.
> I had an excellent time working with kubernetes
I'm happy for you, seriously, and I'm not claiming general validity of my experience. What I am disputing is the blog post (which does claim general validity).
The largest reason is cost. My small deployment (3 nodes) is running around $100 / mo on AWS (That's my app, nginx, redis, and postgres).
It doesn't even need 3 nodes, I don't recall if 3 is the minimum, but realistically I only need one (for now). For larger projects this is probably a non-issue.
The second largest reason is that I really have no idea what is going on on these nodes, and I probably never will. Magically my services run when I have the correct configurations. Not to say that's always a bad thing, but I've found it difficult to determine the default level of security as a result of this.
A third reason is the learning curve. This is less of an issue because I've invested the time to learn already. But like, the first time I tried to get traffic ingressed was painful.
As to what I'm moving to, I migrated one of my websites to a simple rsync + docker-compose setup and am pretty happy with it. In the past, I ran ansible on a 50 node cluster and it worked really well.
I'm not really clear on how moving off Kubernetes saves you money in this scenario. It seems like the most likely source of the cost is AWS itself, which is the non-free part. I'm just learning k8s, so I feel you on the learning curve.
Kubernetes has a minimum node count.
Moving to one node cuts the cost by a factor of 3, not to mention all of the other resources it creates (load balancers).
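For what it's worth, here is the back-of-the-envelope version of that claim. This is only a sketch: it takes the ~$100/mo and 3 nodes from the comment above and assumes the bill splits evenly across nodes, which ignores load balancers and any other per-cluster extras.

```python
# Figures quoted in the thread: roughly $100/month for a 3-node deployment.
# Assumption (not from the thread): cost is spread evenly across the nodes.
monthly_cost_usd = 100.0
node_count = 3

cost_per_node = monthly_cost_usd / node_count
single_node_cost = cost_per_node  # running the same workload on one node
savings_factor = monthly_cost_usd / single_node_cost

print(f"per node:    ${cost_per_node:.2f}/mo")
print(f"single node: ${single_node_cost:.2f}/mo")
print(f"factor:      {savings_factor:.0f}x cheaper")
```

The real bill will differ depending on instance types and on whatever the cluster provisions on top of the nodes.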
To play devil's advocate, why not use AppEngine or Heroku, which do most of these things without you having to manually set up each app with long YAML configs involving resource constraints and restart policies?
I gather it's simply less popular because they're proprietary, opaque and opinionated.
I suspect we're going to continue to see layers built on top of kubernetes, such as GitLab's Auto DevOps, where for the simple things you don't need to write any of that, but you can still inspect what they create and jump off the rails for things that are a bit more complex.
Hi, GitLab PM here, thanks for the mention. We're indeed trying to make use of kubernetes inside GitLab as simple as possible. And just to add to your comment, you can try out kubernetes integration (https://docs.gitlab.com/ee/user/project/clusters/) independent of auto devops and vice-versa (https://docs.gitlab.com/ee/topics/autodevops/). Both of these are available in our core offering.
Would it be possible to elaborate on centralized auth with an example? I've done a small amount of playing around with k8s but I'd not heard of this specific use case.
I really think this "kubernetes is complex" meme needs to die.
I mean, sure, kubernetes _is_ complex. But it's complex in the way that using Rails is more complex than pure Ruby. It's fine when you're playing, or you have a weird subset of problems where you don't care if something runs or not, but as soon as you deal with the problems of HA, DR, live deployments, secret management, sharing servers, working in a team, etc., then if you're solving these issues with ansible and terraform, you're probably just inventing something as complex and, worse, bespoke to your problem.
At the end of the day, once you've learned to use kube, it's probably no worse than any other configuration-as-code based system, and the question is which of us should be learning it and when, and which of us should just be using higher levels of abstraction like GitLab's Auto DevOps.
Now, indeed, if you're just hacking and you have no need for a CI and CD pipeline, or reproducibility, etc, then sure, it's probably not the time to learn a hosting framework, but once you do know it, you're free to make an informed opinion on your problem.
Personally, I sell kube a lot, but I tend to work for orgs that have several development teams and several apps, and bringing them to a platform seems like a good idea. The question is, should I also put a startup on GKE, or should I do something bespoke where they can't hire another contractor to take over as easily? Personally, I'd go GKE.
Managed k8s (e.g. GKE) and unmanaged k8s (e.g. kops/kubeadm etc) aren't really the same thing in terms of complexity.
GKE takes a lot of the load off by making it Google's problem. The possible downside (which may or may not matter depending on use case) is that you're tied to what Google's control plane provides.
So if you want a shiny admission controller, you're out of luck unless they support it.
But for basic workloads, it's likely to work well and their management solves a lot of vanilla k8s pain points.
This. It's perfectly fine to use GKE or AWS. Trying to set up your own cloud provider for a small project? Crazy.
The number of hours you have to spend learning how k8s works just to set it up properly from scratch will far exceed the R&D and operational cost of a less sophisticated small project set up from scratch.
Pay for managed services, or keep it as simple as possible.
Hello, Community Advocate from GitLab here. Thanks for mentioning our Auto DevOps.
One of its main functions is to eliminate the complexity of getting started with automated software delivery by automatically setting up the pipeline and necessary integrations, freeing you up to focus on the culture part. That means everyone can skip the manual work of configuration, and focus on the creative and human aspects of software creation.
You also give us a kubernetes chart that makes it a 5 minute task to install your own gitlab.
Jobs actually run in their own kubernetes pods, and since these are separate from the master, it goes a long way toward having something that resembles a secure SCM, CI and CD.
I don't work for gitlab, I'm just someone who steals their hard work and sells it to clients. :)
> This find | xargs mawk | mawk pipeline gets us down to a runtime of about 12 seconds, or about 270MB/sec, which is around 235 times faster than the Hadoop implementation.
Pipelines of gunzip, find, grep, xargs, awk etc. on RAID disks... good memories. Analyzed terabytes of data with that. Hard to beat because of the zero setup time.
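To put the quoted numbers in perspective, here is a quick sanity check. It is a sketch only: it uses the 12 seconds, 270MB/sec and 235x from the quote, and assumes the "235 times faster" figure is a throughput ratio, which the quote itself doesn't spell out.

```python
# Numbers taken from the quote above.
runtime_s = 12           # runtime of the find | xargs mawk | mawk pipeline
throughput_mb_s = 270    # throughput of that pipeline
speedup = 235            # "around 235 times faster than the Hadoop implementation"

# Data volume implied by runtime and throughput.
data_mb = runtime_s * throughput_mb_s

# Assumption: the speedup compares throughput, so Hadoop's implied rate is:
hadoop_mb_s = throughput_mb_s / speedup

print(f"data processed: ~{data_mb} MB")
print(f"implied Hadoop throughput: ~{hadoop_mb_s:.2f} MB/s")
```

So the job is a few gigabytes of data, which is small enough that a single machine with decent disks is plausibly the right tool.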
If you have one customer who needs it once a week, you add this find-grep-awk script to xinetd and set up a PHP page with a couple of fields that set the arguments for the request.
If you have a million customers per hour, you set up a bunch of terabyte-RAM servers with a realtime Scala pipeline, and hire a team to tweak it day and night.
Because those are two very different problems. And the worst thing to do is to try to solve problem X with tools for problem Y.
Pipe it to a websocket, or curl to some update-account API. Or a mysql/psql/whatever CLI in CSV upload mode so you don't have to worry about injection.
If you want to batch on more than lines, use sponge, or write another few lines of mawk/perl/whatever.
Those are limited examples, and may not always be The Right Way (tm), but there are certainly easy, old, simple ways to take shell pipelines and make the data available quickly in some other store.
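As a rough illustration of the "batch on more than lines" idea, here is a minimal Python sketch of the glue step: it groups pipeline output into batches and renders each batch as CSV for a bulk loader. The helper names (`batched`, `to_csv`), the tab separator, and the sample rows are all made up for the example; they aren't from any tool mentioned above.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, List


def batched(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of up to `size` lines from any line iterator."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def to_csv(chunk: List[str], sep: str = "\t") -> str:
    """Render one batch as CSV text; csv.writer quotes commas and quotes in fields."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for line in chunk:
        writer.writerow(line.rstrip("\n").split(sep))
    return buf.getvalue()


# Hypothetical sample of tab-separated pipeline output.
sample = ["alice\t3\n", 'bob "the builder"\t5\n', "carol, jr\t7\n"]
for chunk in batched(sample, 2):
    print(to_csv(chunk), end="")
```

In a real pipeline you would pass `sys.stdin` instead of `sample` and hand each CSV batch to whatever bulk-load mode your database CLI offers, so the values travel as data rather than as SQL text.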
That's actually very simple, and there are many ways to do that. If the nature of the task you deal with allows this kind of workflow, it's really worth considering. These days I would use a more proper language like D as a wrapper rather than Bash itself, for greater flexibility.
IMHO the point for running Kubernetes for personal projects would be to share the same virtual/physical machines between multiple projects while still keeping things manageable.
Over the years you easily end up with a pile of these projects, which want to keep running mostly for fun and therefore want to minimize the associated fixed costs. Using containers may help in keeping the herd manageable so that you can, for example, move everything to some other hosting provider.
Precisely. I use Docker for my personal projects whenever possible, because it keeps the host tidy and avoids conflicts. If I want to play with something new, I can just throw it onto an existing server and start playing, rather than deploying a new VM specifically for it. When I get bored, it's a 2 minute job to trash the container.
Not that the author is right, but he does address exactly these criticisms. Besides, the fact that someone used Docker inappropriately isn't evidence that Docker is the wrong tool for personal projects (again, it may be true that Docker is the wrong tool for personal projects, but that is not evidenced by your bio scientist friend scenario).
As someone who's just starting to learn kubernetes (but has a strong background in virtualization and multi-thousand-node environments), I was inclined to agree at first but now see it as more of a modern "home lab" type person. I don't see it as doing it to be a fake-Googler, but more to learn the scalable way of doing things.
Or, at worst, the shortcut to understanding the underlying tech. But isn't that the point of abstraction?
Same here, you really have to look at your own needs. I was thinking about using something like that, but at the end of the day I would waste time and energy and complicate my setup a lot for no benefit. We are simply not at that scale and don't have those problems, and as much as I would like to play with new technology (and I do that privately), the cost/benefit calculation is not working out for us right now.
> They're Google wannabies that thinks in Google's scale
and the people who, when they show up at their Google interview, nail the part about scaling. Not that Google ever asked me how to scale a project, but it doesn't hurt learning things just for the sake of learning.
> people who use postgres understand query performance more than people who use mysql
This statement strikes me as likely to be true, at least for database non-experts.
It was certainly true for me. I didn't have anything approaching a decent understanding of query performance until I discovered PostgreSQL's EXPLAIN and, perhaps more importantly, EXPLAIN ANALYZE.
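For anyone who hasn't looked at a query plan before, here is a small self-contained sketch of the idea. Note the swap: it uses SQLite's `EXPLAIN QUERY PLAN` through Python's standard `sqlite3` module, not PostgreSQL's `EXPLAIN ANALYZE`, simply so it runs without a database server. The `users` table, the index name, and the `query_plan` helper are invented for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan lines SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the plan detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

sql = "SELECT * FROM users WHERE email = ?"
params = ("user42@example.com",)

plan_before = query_plan(conn, sql, params)   # no index yet: a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, sql, params)    # the planner can now use the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan lines varies between SQLite versions, and PostgreSQL's output is much richer (costs, row estimates, and with ANALYZE the actual timings), but the habit is the same: look at the plan before and after a change.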