Kubernetes Is a Surprisingly Affordable Platform for Personal Projects (doxsey.net)
652 points by cdoxsey on Oct 3, 2018 | 329 comments

The list of things mentioned in the article to do and learn for a simple, personal project with k8s is absolutely staggering, in my opinion.

Having used it, there's a sizeable amount of further work needed which the article doesn't mention (e.g. learning how to use the pretty confusing Google interface, finding the right logs and using their tools). So the overhead is really huge.

Furthermore, the whole system is slow. Want to run a SQL query against your postgres? You need to use a Google Cloud command that changes the firewall and sshes you into the machine... and this takes a couple of minutes, just enough to make me give up unless I _really_ need to run that query. Abysmal.
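For what it's worth, the firewall-then-ssh dance can sometimes be shortened. Assuming a Cloud SQL instance (all names below are made up), `gcloud sql connect` whitelists your IP for a few minutes and opens psql directly, and the Cloud SQL Proxy avoids firewall changes entirely:

```shell
# Temporarily whitelists your client IP, then opens a psql session.
# "my-instance" and the user are placeholders for your own setup.
gcloud sql connect my-instance --user=postgres

# Or run the Cloud SQL Auth Proxy locally and connect as if the DB were local:
cloud_sql_proxy -instances=my-project:us-central1:my-instance=tcp:5432 &
psql "host=127.0.0.1 port=5432 user=postgres"
```

Still not instant, but it beats reconfiguring the firewall by hand every time.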

Finally, and this is a pet peeve against many advocacy blog posts, they just show you the happy path! Sure, _in the best of cases_ you just edit a file. In a more realistic case, you'll be stuck with a remote management system which is incredibly rich but also has a really steep learning curve. Your setup is not performant? Good luck. Need to tweak or fine-tune? Again, best of luck.

We've tried to adopt k8s 3-4 times at work and every single time productivity dropped significantly without having significant benefits over normal provisioning of machines. {Edit: this does not mean k8s is bad, but rather that we are probably not the right use case for it!}

...which in turn is usually significantly slower than building your own home server (but that's another story!)

This. Time and again. The number of people who adopt complicated stuff like Kubernetes for what is essentially a couple of web servers and a database is too high. They're Google wannabes who think at Google's scale but forget that it is utterly unnecessary in their case.

I know a bio scientist who spent two months working on containers and Docker and whatnot, for what are essentially independent shell scripts that are best run as batch jobs. I spoke with him at length, and he realized at the end of the day that what he really needed was a better understanding of standard *nix processes, not containers...

I had an excellent time working with kubernetes and I am practically a one person company. Kubernetes frees my mind from so many things that now I hate to work without it. Couple of those things include:

- automated ssl

- centralized logging

- super easy scaling up and down (going from 1 instance to 2 instances is a headache manually)

- centralized auth (have a service which doesn't have built-in auth?)

- super easy recovery (containers recreated/volumes attached automatically/apps just work unless you have db type of application which shouldn't be just restarted, which is rare)

- Smooth CI/CD (gogs/jenkins/private docker image repository)
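To give a flavour of what "automated ssl" costs in practice once the cluster add-ons are in place: a sketch of an Ingress, assuming ingress-nginx and cert-manager are already installed (all names here are invented):

```yaml
# Hypothetical Ingress: cert-manager issues and renews the TLS certificate
# automatically based on the single annotation below.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts: [myapp.example.com]
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
```

Scaling is similarly one command, e.g. `kubectl scale deployment myapp --replicas=2`.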

As for the "execute query" example, why is it such a headache? I just "kubectl ssh <container>" and I'm in.
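One caveat: `kubectl ssh` is a plugin rather than a stock subcommand; the built-in route is `kubectl exec` (pod name below is hypothetical):

```shell
# Open an interactive shell in the pod:
kubectl exec -it postgres-0 -- /bin/sh

# Or run the query directly without an interactive session:
kubectl exec -it postgres-0 -- psql -U postgres -c 'SELECT count(*) FROM users;'

# Or port-forward and use a local client instead:
kubectl port-forward pod/postgres-0 5432:5432
```

Either way, no firewall changes and no multi-minute wait.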

> I know a bio scientist who spent

Super obscure example. Kubernetes is definitely not for non-tech people. And I didn't pick up k8s overnight. I spent a year doing playful POCs before deploying in a production environment.

If the only thing that's stopping you from using k8s is the learning curve, I suggest you go ahead and learn it. It's a huge boon.

Thanks for sharing your experience.

I question myself whether it is necessary to hide my operational needs behind a behemoth of complexity like Kubernetes. The list of conveniences you mentioned sounds like magic you get from Kubernetes. What if there is a problem with any of them?

Missing logs?

Inappropriate scaling?

Auth failures? or worse, failure of Auth system?

Easy recovery? what if there were failures to checkpoint/backup/snapshot containers?

CI/CD is good regardless of whether you use Kubernetes or not.

EDIT: The question is, if you have any of these problems, why is it better to get your head into how Kubernetes deals with those operations & tools rather than dealing with well-defined unix tools that specialise in doing these jobs. syslog instead of however Kubernetes gathers the logs, reading FreeIPA docs instead of Kubernetes auth system logs?

My point is that to deal with all of the conveniences you mentioned, you need to know their details anyway. Why rely on the Kubernetes abstraction if there is no such need? (I'm not trying to be snarky. I'm genuinely curious why you think it is a good idea. If you convince me otherwise, perhaps I would start adopting Kubernetes as well.)

I run my cluster (I'm a sysadmin) with

a couple of OpenBSD servers that run redundant DNS and DHCP.

a CentOS 7 box that runs FreeIPA as central Auth.

an OpenBSD server that acts as public facing SSH server.

about 20 nodes, all provisioned using kickstart files, and then configured using Ansible. They run databases, web servers, batch jobs, Git, etc.

A single server that runs ELK stack for log analysis.

A single server that runs Prometheus for quick aggregated monitoring.

Do you think I should switch over to Kubernetes for any benefits?

Sounds like what you have works, and Kubernetes might well not benefit you. With roughly 20 nodes, you have more or less 20 "pets" in devops speak and that sounds like an entirely sensible way to manage them. Contrasting with my problem...

I'm a sysadmin who manages thousands of bare metal machines (A touch less than 10,000 Linux boxes). We have gotten to a point in some areas where you can't linearly scale out the app operations teams by hiring more employees so we started looking at container orchestration systems some time ago (I started working on Mesos about 3 years ago before Kubernetes was any good). As a result, I got to understand the ecosystem and set of tools / methodology fairly well. Kelsey Hightower convinced me to switch from Mesos to Kubernetes in the hallway at the Monitorama conference a few years back. Best possible decision I could have made in hindsight.

Kubernetes can't run all of our applications, but it solves a huge class of problems we were having in other areas. Simply moving from a large set of statically provisioned services to simple service discovery is life changing for a lot of teams. Especially when they're struggling to accurately change 200 configs when the node a critical service was running on has a cpu fault and panics + reboots. Could we have done this without kubernetes? Sure, but we wanted to just get the teams thinking about better ways to solve their problems that involved automation vs more manual messing around. Centralized logging? Already have that. Failure of an Auth system? No different than without Kubernetes, you can use sssd to cache LDAP / Kerberos locally. Missing logs? No different than without kubernetes, etc. For us, Kubernetes solves a LOT of our headaches. We can come up with a nice templated "pattern" for continuous delivery of a service and give that template to less technical teams, who find it wonderful. Oh, and we run it bare metal on premise. It wasn't a decision we took lightly, but having used k8s in production for about 9 months, it was the right one for us.

Sure, but the context here is

> Kubernetes Is a Surprisingly Affordable Platform for Personal Projects

with a counter that

> They're Google wannabies that thinks in Google's scale but forget that it is utterly unnecessary in their case.

I would posit that at the point you have over a hundred (or a couple hundred) servers, that "Google wannabies" applies much less and you have reason to use the Kubernetes features. But I wouldn't expect most personal projects to get anywhere near that threshold.

Hell, I bet the vast majority of personal projects happily sit on one server that does all their needs and they have room to grow on that server, or a larger instance of the one server. Possibly a second one spun up occasionally for specialized processing needs until it's destroyed.

I won't use the term 'vast majority' to stay conservative, but many, many enterprise projects would happily work on one server (well, let's make it two identical servers, for redundancy and HA). You can get a 2U server with 1.5 TB of RAM, dozens of NVMe drives and tens of cores for really cheap nowadays.

And in this, we're entirely in agreement!

I run my personal web stuff in a docker container per application (running as a dedicated user per app via docker's --user=username:group) with iptables rules per user. Kubernetes would work, but is overkill for the 5 vhosts I run via a pretty stripped-down Apache config.

> thousands of bare metal machines (A touch less than 10,000 Linux boxes)

This terminology is confusing to me as someone who's worked in the embedded space. In that field bare-metal implies not running an operating system. So does bare-metal Linux box mean you rent a box and stick a distro on it yourself? I feel like there could be more precise terminology used in that case...

Bare-metal in this context means that you have physical hardware and you're responsible for making sure the system can boot and do the stuff you want, as opposed to what you'd have with a service like Amazon's EC2, where you're given a set of apps to configure and execute a virtual machine image. The distinction is made because the former scenario requires extra work for initial configuration (in terms of OS installation and physical networking and such) and you have the burden of setting up automation to handle scenarios where your OS installation is hosed, and much more.

Bare metal in this context typically means your servers are actual physical servers, not virtual machines.

As others have stated below, in this context you have bare metal and you have the cloud. You could also add in virtual machines, which run either on premise or in the cloud.

Thousands of physical servers our company manages that aren't rented, but owned by us. Does that help?

Yes. Thank you! If only industry marketing hadn't coined the terms, we might have: cloud -> provisioned server, bare-metal -> managed server or some other less context dependent terms.

Just as a point of curiosity, "managed server" in this area is already taken, and means that you get a sysadmin with the rented server (the server is managed for you, including installation and maintenance of the software you need on it). It is "higher level" than cloud servers, not "lower level" ;)

It depends on what your operational needs are.

Do you need to do rolling deployments? Boom. Kubernetes pays for itself right there. And that's just tip of the iceberg. Do you need someone else to join you on ops duties? Boom. If they know Kubernetes, they can easily find their way around your system. Your bespoke system? They're trying to grok your lack of documentation.

It's a latency/bandwidth thing. The learning curve is the latency. The advantages it brings to the table is the bandwidth. As such, I think it's a great investment even for small businesses/teams, but it's not a good investment for a solo developer who doesn't know it, unless they want to be learning Kubernetes rather than building their project.
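To make the rolling-deployment point concrete, here's a minimal Deployment sketch (image and names invented); Kubernetes replaces pods gradually according to the declared strategy:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod down during a rollout
      maxSurge: 1         # at most one extra pod above the desired count
  selector:
    matchLabels: {app: myapp}
  template:
    metadata:
      labels: {app: myapp}
    spec:
      containers:
        - name: myapp
          image: myapp:1.2.3
```

A deploy is then `kubectl set image deployment/myapp myapp=myapp:1.2.4`, watchable with `kubectl rollout status` and reversible with `kubectl rollout undo`.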

I think this attitude is part of the problem.

> Do you need to do rolling deployments? Boom. Kubernetes pays for itself right there.

No, you don’t need to do rolling deployments.

> Do you need someone else to join you on ops duties? Boom. If they know Kubernetes, they can easily find their way around your system.

Easier than ssh’ing in and running ps?

Don’t get me wrong, k8s is great if you need to manage a complex system. It’s just that most of the time you can choose to not build a complex system in the first place.

> Easier than ssh’ing in and running ps?

Have you ever been thrown into an unknown environment? SSH'ing where exactly? Oh but this runs at hosting partner X? That runs at hosting partner Y? Oh but this service is actually behind a reverse proxy running on box Z. Oh you need to RDP to a jump-server to SSH to those machines? Documentation? You mean these outdated Word documents and Excel sheets? And down the rabbit-hole you go. Fun!

And don't say that doesn't happen, I'm a freelance sysadmin - and am exactly in an environment like that right now, it's not the first, and won't be the last. To get a complete picture here, I needed 2 whole months for everything outside the Openshift clusters.

The stuff running on openshift was pretty simple, there is the master, here are your credentials - and that's it. The rest is pretty easy to figure out if you've worked with k8s/openshift. The biggest problem was knowing if the app in question ran on Openshift or somewhere else.

Yes, you’re describing pretty much every environment I’ve ever worked in.

I’m not arguing that k8s is the wrong tool to manage that kind of complexity. I’m arguing that, in almost all cases, that kind of complexity is completely unwarranted.

Love that latency bandwidth explanation. Going to steal it for other contexts.

Do it! I use the latency/bandwidth thing as an intellectual model for all sorts of surprising situations. It's deeply embedded in my frame of reference for looking at the world, especially for doing any sort of work.

In a way, it turns all sorts of things into simple algebra - ax + b. So time = scope / bandwidth + latency. If latency dominates, it takes longer. If bandwidth dominates, goes faster.
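The model really is just one line of arithmetic. A toy comparison (every number here is invented for illustration) of a high-latency/high-bandwidth tool versus a low-latency/low-bandwidth one:

```shell
# time = scope / bandwidth + latency  (units arbitrary, numbers invented)
# "k8s":     steep learning curve (latency 100) but fast ops after (bandwidth 50)
# "bespoke": quick start (latency 5) but slower day-to-day ops (bandwidth 10)
for scope in 1000 10000; do
  echo "scope=$scope k8s=$(( scope / 50 + 100 )) bespoke=$(( scope / 10 + 5 ))"
done
# prints:
# scope=1000 k8s=120 bespoke=105    <- small scope: latency dominates, bespoke wins
# scope=10000 k8s=300 bespoke=1005  <- large scope: bandwidth dominates, k8s wins
```

The crossover point is where the learning curve pays for itself.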

amazing analogy

> The question is, if you have any of these problems, why is it better to get your head into how Kubernetes deals with those operations & tools rather than dealing with well-defined unix tools that specialise in doing these jobs. syslog instead of however Kubernetes gathers the logs, reading FreeIPA docs instead of Kubernetes auth system logs?

A unified interface for things like "find where this service is running". One standardised way to do it - even if Kubernetes were nothing more than a "install log system x and auth service y and dns server z" there would be a lot of value in that. You talk about "well-defined unix tools" but IME the unix tools are a lot less well-defined than container tooling - e.g. there are several different log systems that describe themselves as "syslog-compatible" but it's not at all clear what that means and not every application that claims to work with "syslog" will work with every service that claims to be syslog-compatible.

> about 20 nodes, all provisioned using kickstart files, and then configured using Ansible. They run databases, web servers, batch jobs, Git, etc.

Did you face the same questions from other people when you adopted Kickstart? What were your answers then?

The setup you've described is probably 80% of the way to Kubernetes compared to a traditional sysadmin approach. Kubernetes will save you a bit of work in terms of eliminating your "script for infrastructure changes like DNS & DHCP" (which I would suspect is one of the more error-prone parts of your setup? It tends to be, IME) and the need to manually allocate services to classes of hosts (ansible playbooks tend to end up as "we want 3 hosts in this class that run x, z and w, and 2 hosts in this class that runs v and y", whereas kubernetes is more "here's a pool of hardware, service x requires 2GB of RAM and needs to run on 3 hosts, you sort it out"). Whether that's worth the cost of migrating what sounds like a fairly recently updated infrastructure setup is another question though.
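The "you sort it out" style of scheduling looks roughly like this (names invented); the scheduler places pods on whichever nodes can satisfy the requests:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-x
spec:
  replicas: 3                      # "needs to run on 3 hosts" becomes 3 pods;
  selector:                        # pod anti-affinity can force distinct nodes
    matchLabels: {app: service-x}
  template:
    metadata:
      labels: {app: service-x}
    spec:
      containers:
        - name: service-x
          image: service-x:1.0
          resources:
            requests:
              memory: 2Gi          # "requires 2GB of RAM"
              cpu: 500m
```

No per-host playbook needed; the bin-packing is the scheduler's problem.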

Cool. You have something bespoke. It works for you, but it's going to include a lot of toil when someone replaces you.

Now personally, I'd rather kubectl get pods --all-namespaces, figure out where the log collector is, what's wrong with it, and fix it, but instead I'm probably going to be reading your docs and trying to figure out where these things are, taking longer than I would have to fix it on a kube cluster.
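The triage loop alluded to here is the same on any conformant cluster, which is rather the point. A typical sequence (pod and namespace names hypothetical):

```shell
kubectl get pods --all-namespaces            # find the log collector
kubectl -n logging describe pod fluentd-abc  # events: crash loops, failed mounts
kubectl -n logging logs fluentd-abc --previous
kubectl -n logging exec -it fluentd-abc -- sh
```

The same four commands work whether the cluster runs FreeIPA, ELK, or anything else.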

I'm not sure I understand. Sorry, my exposure to Kubernetes is only a few days and is limited to an overview of all of its components and a workshop by Google.

> It works for you, but it's going to include a lot of toil when someone replaces you.

I was thinking that Ansible + FreeIPA (RedHat enterprise product) + Elastic logging setup (ELK) + Prometheus would be easier for my successor to deal with than figuring out my bespoke setup of Kubernetes (which keeps adding new features every so often). Even if I did not create proper docs (I do my best to have good docs of whatever I do), my successor would be better off relying on RedHat's specific documentation rather than guessing what I did to a Kubernetes version from 6 months ago...

If something breaks in FreeIPA or Unbound(DNS) or Ansible, it is very much easier to ask targeted questions on StackOverflow or lookup their appropriate manuals. They don't change as often as Kubernetes does. Don't you think?

Alternatively, if something breaks on Kubernetes, you'd have to start digging Kubernetes implementation of whatever feature it is, and hope that the main product hasn't moved on to the next updated release.

Is it not the case? Is Kubernetes standard enough that their manuals are RedHat quality and is there always a direct way of figuring out what is wrong or what the configuration options are?

Here I was thinking that my successor would hate me if I built a bespoke Kubernetes cluster rather than standard enterprise components such as the ones I listed above.

> I'm not sure I understand. Sorry, my exposure to Kubernetes is only a few days and is limited to an overview of all of its components and a workshop by Google.

No worries. The fact you asked the question is a positive, even if we end up agreeing to disagree.

> I was thinking that Ansible + FreeIPA(RedHat enterprise product) + Elastic logging setup(ELK) + Prometheus would be easier for my successor to deal with than figuring out my bespoke setup of Kubernetes(which keeps adding new features every so often). Even if I did not create proper docs(I do my best to have good doc of whatever I do), my successor would be better off relying on RedHat's specific documentation rather than guessing what I did to a Kubernetes version from 6 months ago...

So both FreeIPA and ELK would be things we would install onto a kube cluster, which is rather what I was commenting about. When either piece of software has issues on a kubernetes cluster, I can trivially use kubernetes to find, exec and repair these. I know how they run (they're kubernetes pods) and I can see the spec of how they run based on the kubernetes manifest and Dockerfile. I know where to look for these things, because everything runs in the same way in kubernetes. If you've used an upstream chart, such as in the case of prometheus, even better.

For things that aren't trivial we still need to learn how the software works. All kubernetes is solving is me figuring out how you've hosted these things, which can be done either well or poorly, and documented either well or poorly, but with kube, it's largely just an API object you can look at.

> Is it not the case? Is Kubernetes standard enough that their manuals are RedHat quality and is there always a direct way of figuring out what is wrong or what the configuration options are?

Redhat sells kubernetes. They call it openshift. The docs are well written in my opinion.

Bigger picture: if you're running a kubernetes cluster, you run the kubernetes cluster. You should be an expert in this, much the same way you need to be an expert in chef and puppet. This isn't the useful part of the stack; the useful part is running apps. This is where kubernetes makes things easier. Running your own bespoke kubernetes itself is a different thing. Use a managed service if you're a small org, and a popular/standard solution if you're building it yourself.

Thanks for the patient response.

Reading through your response already showed that at least some of my understanding of Kubernetes was wrong and that I need to look into it further. I was assuming that Kubernetes would encompass the auth provider, logging provider and such. Honestly, it drew a parallel to systemd in my mind, trying to be this "I do everything" mess. The one-day workshop I attended at Google gave me that impression, as it involved setting up an api server, ingress controller, logging container (something to do with StackDriver that Google had internally), and more for running a hello-world application. That formed my opinion that it was too many moving parts than necessary.

If there is a minimal abstraction of Kubernetes, that just orchestrates the operation of my standard components(FreeIPA, nginx, Postgres, Git, batch compute nodes), then it is different than what I saw it to be.

> if you're running a kubernetes cluster, you run the kubernetes cluster. You should be an expert in this, much the same way you need to be an expert in chef and puppet.

I think that is the key. End of the day, it becomes a value proposition. If I run all my components manually, I need to babysit them in operation. Kubernetes could take care of some of the babysitting, but the rest of times, I need to be a Kubernetes expert to babysit Kubernetes itself. I need to decide whether the convenience of running everything as containers from yaml files is worth the complexity of becoming an expert at Kubernetes, and the added moving parts(api server, etcd etc.).

I will play with Kubernetes in my spare time on spare machines to make such a call myself. Thanks for sharing your experience.

> That formed my opinion that it was too many moving parts than necessary.

Ingress controller and log shippers are pluggable. I'd say most stacks are going to want both, so it makes sense for an MSP to provide these, but you can roll your own. You roll your own, and install the upstream components the same way you run anything on the cluster. Basically, you're dogfooding everything once you get past the basic distributed scheduler components.

> I think that is the key. End of the day, it becomes a value proposition. If I run all my components manually, I need to babysit them in operation. Kubernetes could take care of some of the babysitting, but the rest of times, I need to be a Kubernetes expert to babysit Kubernetes itself.

So it depends what you do. Most of us have an end goal of delivering some sort of product and thus we don't really need to run the clusters ourselves. Much the same way we don't run the underlying AWS components.

Personally, I run them and find them quite reliable, so I get babysitting at a low cost, but I also do know how to debug a cluster, and how to use one as a developer. I don't think running the clusters is for everyone, but if running apps is what you do for a living, a solution like this is a must for all levels of stacks. Once you have the problem of running all sorts of apps for various dev teams, a platform will really shine, and once you have the knowledge, you probably won't go back to gluing solutions together.

>RedHat quality



True. I cannot generalise like that. I was only thinking about RHEL and IDM (the RH version of FreeIPA) - that documentation is super thorough and very helpful IMO.

https://access.redhat.com/documentation/en-us/openshift_cont... :)

Openshift is something Redhat is pushing heavily, and the documentation is extremely good. Also, several of the openshift devs frequently comment here on HN.

It’s very generous of you to assume the guy before you documented everything.

Indeed, that's part of the point. I first started using kube at a UK Gov Org about 4 years ago. All the units of the org had different standards for hosting, they weren't written down, and the vendors would quote unreasonable sums of money knowing that their systems weren't really supportable by anyone without specific experience of the system.

Kube was used to enforce standards, and make things supportable by a wider number of people, which is a weird thing to say since barely anyone used it at the time.

This was a large success though, as we only really needed to train people to use kubernetes once, and then it was over to worrying about the actual applications.

That's not so bespoke, that's very much standard software, services and operating systems, all familiar to every UNIX sysadmin. It would be much easier to hand that system to a new sysadmin than a Kubernetes cluster. By far.

It's not that bespoke, but it is a bunch of building blocks glued together, rather than being a framework, and thus one needs to figure out which blocks you've glued together.

Your average guy with unix skills who works on small deploys probably doesn't want to learn kubernetes; once they work on larger deployments, they tend to convert, in my experience.

I'm just learning Kubernetes now, but I've managed multi-thousand node VMware deployments, and a lot of these arguments seem to boil down to your entry point.

If your entry point is knowing Kubernetes really well but not knowing traditional stacks as well, the Kubernetes stack will probably make a lot more sense to you. Naturally, if you've never touched k8s, it will be a lot more confusing to troubleshoot a multi-tier app cluster on it than to troubleshoot on a traditional setup, even if you have to read the docs on the former first.

Obviously the abstraction has a lot of advantages; there are reasons people are moving towards it, but as in the case of this blogpost, I think a lot of people are using it as an entry point/shortcut for 'learning' systems that are more complex than they realize. That's not a bad thing; again, abstraction is great for many things. Hopefully, with time, they'll learn to troubleshoot the underpinnings as they get deeper and deeper into the tech.

The thing you know will always seem simpler than the thing you don't, but I think it's simply a convention versus configuration argument and where you want the abstractions.

Personally, I've used both, but many of the traditional stacks are manually configured blobs with abstractions in the wrong place, and I'm happy that your average sysadmin no longer needs to build these.

This ^

As a former maintainer of the saltstack config management software (and still occasional contributor): https://github.com/saltstack/salt/commits?author=SEJeff

I find that kubernetes is a better fit for a lot of the applications which previously had dedicated hardware that ran at 0.00% utilization all day. It is also a much better user experience for developers as it was previously along the lines of:

1. User contacts linux team and asks for a server to run say a webapp but it needs redis

2. Linux team sees the ticket and realizes there is no hardware to run dedicated app, so they ask the Datacenter team to get a server.

3. There are no unused servers in the warehouse, so the Datacenter team gets purchasing to send a PO to the server vendor, and in a week we have a new server.

4. The Datacenter team racks the server, and goes back and forth with the Linux team until the network config is correct from a physical standpoint.

5. The Linux team builds out the box and configures the services the user requests, with config management and some fancy tunables for that user.

6. The user gets the server only to find a few things are misconfigured they don't have access to change and go back to step 5.

7. The app works, but the user has to ask the Linux team to update internal DNS for their new service (adding a CNAME generally).

The process now involves:

1. User creates container images of their apps and pushes to our internal registry.

2. User looks at the copious example application templates and makes their app run in kubernetes.

4. Their app runs, but if they have questions, the Linux team will help them with specific requirements. They get automatic dns via coredns, which runs in every kubernetes cluster and creates dns for each service that runs.

5. They spend more time prototyping new software than they do worrying about deployment.
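A minimal version of such a self-service template might look like this (names invented); once the Service object exists, CoreDNS resolves it cluster-wide with no ticket to a DNS team:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 2
  selector:
    matchLabels: {app: webapp}
  template:
    metadata:
      labels: {app: webapp}
    spec:
      containers:
        - name: webapp
          image: registry.internal/webapp:1.0   # pushed in step 1
          ports: [{containerPort: 8080}]
---
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  selector: {app: webapp}
  ports: [{port: 80, targetPort: 8080}]
  # reachable as webapp.<namespace>.svc.cluster.local via CoreDNS
```

The old steps 2 through 7 collapse into `kubectl apply -f` on something like the above.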

Provisioning hardware still matters depending on requirements. I think you oversimplified a bit here. The user in your scenario probably needs to talk to the devops team and discuss expected resource needs unless it's really a tiny webapp that could have run on a shared server in the past anyway or in a VM.

K8s definitely helps with better resource usage and it can make deployment much more straightforward, but it doesn't abstract away the need to think about capacity and provision hardware.

We have LimitRange and ResourceQuotas setup per namespace in kubernetes. Each team talks with us about expected needs before we permission them with the ability to login to kubernetes and create a namespace for them to begin with. If a team needs more of either, we can up them. We have grafana dashboards of the cluster utilization and will proactively order extra servers to add capacity as needed. So far, so good!
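For reference, those per-namespace guard rails are small objects (all names and values invented):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"       # total CPU the namespace may request
    requests.memory: 64Gi
    pods: "100"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-defaults
  namespace: team-a
spec:
  limits:
    - type: Container
      default:               # applied when a pod omits limits
        cpu: 500m
        memory: 512Mi
      defaultRequest:        # applied when a pod omits requests
        cpu: 100m
        memory: 128Mi
```

Raising a team's quota is then a one-line change rather than a hardware conversation.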

I'm not pretending I'm not conflating things in this example, but for us, it was a way to solve a lot of "legacy problems" by forcing a move to a new paradigm. So far, everyone loves it (with a few exceptions!)

I never used Salt, but I've used Puppet extensively. Obviously you know virtualization and containerization existed before Docker and k8s, but you're simplifying and conflating a lot of stuff here.

The last process should be numbered 1-4 (you skipped 3)?

Take your damned upvote! You're right, but I can't edit it now to fix.

but the situation could be the reverse: I replace someone who did a pile of k8s and didn't document anything, and while I could orchestrate the whole with libvirt recipes in my sleep, I wouldn't even know how to look up the command

> kubectl get pods --all-namespaces

so familiarity is not a good argument here

True. In order to know kubernetes, you need to know kubernetes. But the point is at least it's a framework rather than something you or I came up with and never wrote down. But if you don't know it, you don't know it. The question is, is learning it useful?

> The question is, is learning it useful?

Even that question is too broadly posed. The question is, is learning it in addition to[1] the previously-standard tools useful?

[1] In the context of the article, this might be "instead of", which presents a much lower bar.

I don't see how libvirt is easier to learn than kubernetes.

Neither one is obvious to a sysadmin who hasn't worked in either world.

I meant that familiarity with a framework isn’t an argument. Libvirt isn’t doing exactly the same thing as k8s and it isn’t easier or harder

In my opinion, operations are always a behemoth of complexity.

Kubernetes allows me to express myself concisely and effectively on the level of complexity I will eventually always encounter in the job, without the particularities of certain tools or manual execution of certain tasks, so for me the complexity is less with Kubernetes.

> The question is, if you have any of these problems, why is it better to get your head into how Kubernetes deals with those operations & tools rather than dealing with well-defined unix tools that specialise in doing these jobs. syslog instead of however Kubernetes gathers the logs, reading FreeIPA docs instead of Kubernetes auth system logs?

The author answers this: learning how to stitch together a bunch of disparate unix tools into a smoothly-operating system requires a whole bunch of professional experience. The author claims that k8s requires the same amount of experience but scales; personally, I think he's being too charitable (or to take a different perspective, he undervalues the difficulty of being a traditional sysadmin). Perhaps he meant "for a single machine". In my opinion, once you extend beyond one machine, k8s is quite a lot simpler.

Can I ask if your nodes are "in the cloud" (ie AWS/rackspace) or in _your_ datacentre?

Can you go touch your boxes? Who can add a new one, and how long does that take?

As I understand it, k8s is kind of designed for renting a bunch of AWS boxes and just having "my" cluster look like it operates separately (the traefik router proxy comes to mind as something k8s should do)

I speak as a complete K8s novice.

Our infrastructure is in house - research group in a University setting. Public cloud would be an absurd option for our needs economically, considering that we run our servers practically 24/7, albeit with the convenience of scheduled downtimes, and that our servers tend to live longer than even 4 years in service. For example, some of our compute nodes, after 5 years of service, are now serving as "console servers" for users doing scientific work on the command line. We run bare-metal servers for performance-intensive workloads and databases, and KVM-based virtual machines for other services.

> Can you go touch your boxes? Who can add a new one and how long.

This is surprisingly not that long. I can order new machines and they are delivered for me to use in just 2 weeks. With a little bit of planning, I can squeeze a lot of performance out of them. For example, I can avoid noisy neighbours because I can control what is deployed where physically.

Another benefit of non-cloud is that I can customise the machines for their purpose. I recently built a FreeBSD/ZFS based server and chose a cheaper CPU and a lot of fast RAM for ZFS to work with. For DNS servers, queue servers and such, I chose a CPU with higher single-threaded performance at higher clock rates over ones with more cores and slower clock rates.

I see - sadly for my personal projects I can either pay for cloud hosting, or I can pay for filling up our spare room with noisy servers, cat5 cabling and general chaos.

I know which price is more expensive :-)

how often do you update your nodes?!

All nodes are bare-metal servers that run CentOS 7, and are configured strictly via Ansible. If a node experiences a hardware failure, we just pick another spare server and run our Ansible playbook on it + a script for infrastructure changes like DNS & DHCP.

Our workload is not strictly attached to physical resource. So, we have our updates scheduled for every week, with the condition that updates run when there are no user tasks scheduled on them.

The other nodes that serve a specific purpose - like database servers, app servers etc, get updated regularly for security updates as & when they are available and necessary, and checked for version upgrades (with scheduled downtime) once every 4 months.
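For the curious, that "pick a spare server and run the playbook" flow can be a tiny wrapper. This is just a sketch; the playbook, inventory, and DNS/DHCP script names are made-up placeholders, not the commenter's actual setup:

```shell
# Hypothetical re-provisioning helper: run the Ansible playbook against a
# replacement node, then apply infrastructure changes (DNS & DHCP).
# site.yml, inventory.ini and update-dns-dhcp.sh are placeholders.
reprovision() {
  host="$1"
  ansible-playbook -i inventory.ini site.yml --limit "$host" &&
    ./update-dns-dhcp.sh "$host"
}
# usage: reprovision spare-node-07
```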

It sounds like the prior commenter Ansiblized their environment; they likely update the 20+ machines often since it's not much of an effort.

> As for the "execute query" example, why is it such a headache ? I just "kubectl ssh <container>" and I am in.

It's slow as hell.

> I had an excellent time working with kubernetes

I'm happy for you, seriously, and I'm not claiming general validity of my experience. What I am disputing is the blog post (which does claim general validity).

The learning curve is sharp. The amount of things happening you don't have awareness of is also worrisome (to me).

Source: I am also using Kubernetes in production, migrating off it soon.

It would be interesting if you could briefly say your reasons for moving away from K8s and what you're moving to.


The largest reason is cost. My small deployment (3 nodes) is running around $100 / mo on AWS (That's my app, nginx, redis, and postgres).

It doesn't even need 3 nodes, I don't recall if 3 is the minimum, but realistically I only need one (for now). For larger projects this is probably a non-issue.

Second largest reason, is that really I have no idea what is going on on these nodes, and I probably never will. Magically my services run when I have the correct configurations. Not to say that's always a bad thing, but I've found it difficult to determine the default level of security as a result of this.

A third reason is the learning curve. This is less of an issue because I've invested the time to learn already. But like, the first time I tried to get traffic ingressed was painful.

As to what I'm moving to, I migrated one of my websites to a simple rsync + docker-compose setup and am pretty happy with it. In the past, I ran ansible on a 50 node cluster and it worked really well.
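For anyone wondering what an rsync + docker-compose deploy looks like in practice, a minimal sketch; the host, user, and paths are placeholders, not the commenter's actual setup:

```shell
# Sketch of a one-command deploy: sync the project to the box, then
# restart the compose stack there. deploy@example.com and /srv/myapp
# are made-up placeholders.
deploy() {
  rsync -az --delete --exclude .git ./ deploy@example.com:/srv/myapp/
  ssh deploy@example.com \
    'cd /srv/myapp && docker-compose pull && docker-compose up -d --build'
}
# usage: deploy   (run from the project root)
```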

I'm not really clear on how moving off Kubernetes saves you money in this scenario. Seems like the most likely source of the cost is the AWS, which is the non-free part. I'm just learning k8s, so I feel you on the learning curve.

Kubernetes has a minimum node count. Moving to one node saves cost by a factor of 3. Not to mention all of the other resources it creates (load balancers).

To play devil's advocate, why not use AppEngine or Heroku, which do most of these things so you don't have to manually set up each of the apps with long YAML configs involving resource constraints and restart policies?

I gather it's simply less popular because they're proprietary, opaque and opinionated.

I suspect we're going to continue to see layers built on top of kubernetes, such as gitlabs autodevops, where for the simple things, you don't need to write any of that, but you can still inspect what they create, and jump off the rails for things that are a bit more complex.

Hi, GitLab PM here, thanks for the mention. We're indeed trying to make use of kubernetes inside GitLab as simple as possible. And just to add to your comment, you can try out kubernetes integration (https://docs.gitlab.com/ee/user/project/clusters/) independent of auto devops and vice-versa (https://docs.gitlab.com/ee/topics/autodevops/). Both of these are available in our core offering.

Honestly, this is why so many people just use Heroku.

> I am practically a one person company

> Spent a year

> go ahead and learn it. Its a huge boon.

for green field deployment with a 1 year R&D budget for devops, sure K8s is great choice perhaps, but for the rest of us (even in tech)?

Would it be possible to elaborate on centralized auth with an example? I've done a small amount of playing around with k8s but I'd not heard of this specific use case.

I really think this kubernetes is complex meme needs to die.

I mean, sure, kubernetes _is_ complex. But it's complex in the way that using Rails is more complex than pure Ruby. It's fine when you're playing, or you have a weird subset of problems where you don't care if something runs or not, but as soon as you deal with the problems of HA, DR, live deployments, secret management, sharing servers, working in a team, etc., then if you're solving these issues with ansible and terraform, you're probably just inventing something as complex and, worse, bespoke to your problem.

At the end of the day, once you've learned to use kube, it's probably no worse than any other configuration as code based system, and the question is which of us should be learning it and when, and which of us should just be using higher levels of abstractions like gitlabs autodevops.

Now, indeed, if you're just hacking and you have no need for a CI and CD pipeline, or reproducibility, etc, then sure, it's probably not the time to learn a hosting framework, but once you do know it, you're free to make an informed opinion on your problem.

Personally, I sell kube a lot, but I tend to work for orgs that have several development teams and several apps, and bringing them to a platform seems like a good idea. The question is, should I also put a startup on GKE, or should I do something bespoke where they can't hire another contractor to take over as easily? Personally, I'd go GKE.

Managed k8s (e.g. GKE) and unmanaged k8s (e.g. kops/kubeadm etc) aren't really the same thing in terms of complexity.

GKE takes a lot of the load off by making it Google's problem. The possible downside (which may or may not matter depending on use case) is that you're tied to what Google's control plane provides.

So if you want a shiny admission controller, you're out of luck unless they support it.

But for basic workloads, it's likely to work well and their management solves a lot of vanilla k8s pain points.

This. It's perfectly fine to use GKE or AWS. Trying to set up your own cloud provider for a small project? Crazy.

The number of hours you have to spend learning how k8s works just to set it up properly from scratch will far exceed the R&D and operational cost of a less sophisticated small project set up from scratch.

Pay for managed services, or do it as simple as possible.

Hello, Community Advocate from GitLab here. Thanks for mentioning our Auto DevOps.

One of its main functionalities is to eliminate the complexities of getting going with automated software delivery by automatically setting up the pipeline and necessary integrations, freeing you up to focus on the culture part. That means everyone can skip the manual work of configuration and focus on the creative and human aspects of software creation.

Here's the doc with more info about it https://docs.gitlab.com/ee/topics/autodevops/

You also give us a kubernetes chart that makes it a 5 minute task to install your own gitlab.

Jobs actually run in their own kubernetes pods, and since this is separate from the master, it goes a long way toward actually having something that resembles a secure SCM, CI and CD.

I don't work for GitLab, just someone who steals their hard work and sells it to clients. :)

> This find | xargs mawk | mawk pipeline gets us down to a runtime of about 12 seconds, or about 270MB/sec, which is around 235 times faster than the Hadoop implementation.


Pipelines of gunzip, find, grep, xargs, awk etc. on RAID disks... good memories. Analyzed terabytes of data with that. Hard to beat because of the zero setup time.
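For anyone who hasn't seen the referenced post, the shape of such a pipeline as a toy that actually runs (plain awk here; mawk is a faster drop-in):

```shell
# Toy version of the find | xargs awk pattern: sum a numeric column across
# many files. Sample data is generated so the pipeline runs end-to-end;
# in real use, find points at your data directory instead.
dir=$(mktemp -d)
printf '1 a\n2 b\n' > "$dir/part-0.txt"
printf '3 c\n4 d\n' > "$dir/part-1.txt"
total=$(find "$dir" -name 'part-*.txt' -print0 |
  xargs -0 awk '{ sum += $1 } END { print sum }')
echo "$total"   # prints 10 (one awk invocation saw both files)
rm -r "$dir"
```

With huge inputs, xargs splits the file list into several awk invocations, and you reduce the per-batch sums with one more awk at the end - the same map/reduce shape as Hadoop, minus the setup time.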

And now your customers expect to see new data immediately reflected in their account. What then?

It depends.

If you have one customer who needs it once a week, you add this find-grep-awk script to xinetd and set up a PHP page with a couple of fields that set the arguments for the request.

If you have a million customers per hour, you set up a bunch of terabyte-RAM servers with a realtime Scala pipeline, and hire a team to tweak it day and night.

Because those are two very different problems. And the worst thing to do is to try to solve problem X with the tools for problem Y.

All those tools are stream processors, why wouldn't the customers see the new data immediately reflected in their account?

Spawn as many servers as you wish, spread your data between them, run your scripts, generate reports, show it to the customers.

Pipe it to a websocket, or curl to some update-account API. Or a mysql/psql/whatever CLI in CSV upload mode so you don't have to worry about injection.

If you want to batch on more than lines, use sponge, or write another few lines of mawk/perl/whatever.

Those are limited examples, and may not always be The Right Way (tm), but there are certainly easy, old, simple ways to take shell pipelines and make the data available quickly in some other store.
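A minimal sketch of the "pipe it to an update-account API" idea: awk reformats the pipeline output into JSON lines, and each line could then be POSTed. The endpoint is hypothetical, so the curl step is left commented out:

```shell
# Turn whitespace-separated records into JSON lines.
json=$(printf 'alice 42\nbob 7\n' |
  awk '{ printf "{\"user\":\"%s\",\"balance\":%s}\n", $1, $2 }')
echo "$json"
# Then stream each line to a (hypothetical) API:
# echo "$json" | while read -r line; do
#   curl -s -X POST -H 'Content-Type: application/json' \
#        -d "$line" https://api.example.com/update-account
# done
```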

That's actually very simple, and there are many ways to do that. If the nature of the task you deal with allows this kind of workflow, it's really worth considering. These days I would use a more proper language like D as a wrapper rather than Bash itself, for greater flexibility.

IMHO the point for running Kubernetes for personal projects would be to share the same virtual/physical machines between multiple projects while still keeping things manageable.

Over the years you easily end up with a pile of these projects, which you want to keep running mostly for fun and therefore want to minimize the associated fixed costs for. Using containers may help in keeping the herd manageable so that you can, for example, move everything to some other hosting provider.

Precisely. I use Docker for my personal projects whenever possible, because it keeps the host tidy and avoids conflicts. If I want to play with something new, I can just throw it onto an existing server and start playing, rather than deploying a new VM specifically for it. When I get bored, it's a 2 minute job to trash the container.

Not that the author is right, but he does address exactly these criticisms. Besides, that someone used Docker inappropriately isn't evidence that Docker is the wrong tool for personal projects (again, it may be true that Docker is the wrong tool for personal projects, but that is not evidenced by your bio scientist friend scenario).

> ... for what is essentially a couple of web servers and a database

Interestingly, while the author compares k8s to an SQL database, they actually do not deploy a DB. It's all fun and giggles until you deploy a DB.

As someone who's just starting to learn kubernetes (but has a strong background in virtualization and multi-thousand-node environments), I was inclined to agree at first but now see it as more of a modern "home lab" type person. I don't see it as doing it to be a fake-Googler, but more to learn the scalable way of doing things.

Or, at worst, a shortcut to understanding the underlying tech. But isn't that the point of abstraction?

Same here, you really have to look at your own needs. I was thinking about using something like that, but at the end of the day I would waste time and energy and complicate my setup a lot for no benefit. We are simply not at that scale, and don't have those problems, and as much as I would like to play with new technology (and I do that privately), the cost/benefit calculation for us is not working right now.

> They're Google wannabies that thinks in Google's scale

and the people who, when they show up at their Google interview, nail the part about scaling. Not that Google ever asked me how to scale a project, but it doesn't hurt learning things just for the sake of learning.

Sorry you quoted me in partial context. I said,

> They're Google wannabies that thinks in Google's scale but forget that it is utterly unnecessary in their case.

Meaning they _unnecessarily_ think in Google's scale when there is no need for it.

I see! My bad then :)

> the people who, when they show up at their Google interview, nail the part about scaling

Having interviewed many tens of engineers at Amazon - no, not really.

Isn't someone who played with e.g. k8s more likely to understand scaling than someone who never cared about it though?

I think the opposite. K8s hides so many things under layers of NAT that you really don't understand what does and doesn't work.

The 5-15 issue is a self-inflicted wound k8s brings to the party that is quite interesting to work around


k8s hides away some scalability problems and does not address many others.

Most real scalability issues exist on lower levels and require more theoretical thinking than experimenting with tools.

That's kinda like saying people who use postgres understand query performance more than people who use mysql.

> people who use postgres understand query performance more than people who use mysql

This statement strikes me as likely to be true, at least for database non-experts.

It was certainly true for me. I didn't have anything approaching a decent understanding of query performance until I discovered postgresql's EXPLAIN and, perhaps more importantly, EXPLAIN ANALYZE.

I think it's more like postgres vs csv.

When I got involved with it (and to be fair, most devops things) the value was more in what I was learning for myself, as opposed to what was suitable for the business. Because it's pretty exciting tech and everyone's talking about it, and it doesn't feel like a rabbit hole until you pull yourself back out and see the light again.

So, what happened almost every time is that the business unknowingly pivoted into innovating on PaaS, because Docker/Kubernetes was only the first step. After that, you had to figure out the best approach to CI, how to set up your deploy pipelines across multiple environments, how to replicate what you were used to seeing with Heroku, etc.

And of course the cost balloons at that stage because your cheap 3 node setup ends up exposing half a dozen load balancers that each cost money, so you start to think about hosting Traefik or nginx-ingress-controller so you can manage load balancing yourself and save some money, and because it's so easy to add more services once you know how, you start wanting to self-host all the things.

Meanwhile your startup hasn't even launched to the public yet and the sunk cost fallacy has fully embedded itself in the team mentality: they've just put months of time and effort into setting up the perfect K8S/Docker architecture and workflow, that now requires actual devops experience to maintain, and you can't push back on it because it's all about scaling when things go live, and self-hosting, and how convenient it all is.

Except, you know, that's 3-6 months of runway down the drain because leadership didn't shift the focus back to an MVP and let the fantasy continue. And it would be hard to justify anything like Kubernetes for pushing an MVP out of the door; that's what Heroku and cheap Digital Ocean boxes are for.

> And it would be hard to justify anything like Kubernetes for pushing an MVP out of the door; that's what Heroku and cheap Digital Ocean boxes are for.

Exactly. I was following along with the article until he mentioned that startups are doing this. Using k8s at a startup (especially a startup building an MVP) is just like using microservices. It is a disservice to the business by technologists.

Building the monolith on heroku will get you to market faster.

Yagni. (Until you do need it, and then by all means pay off that technical debt.)

Totally disagree. Our startup use Kubernetes and Google Cloud and shipped an MVP in a few months. No operations people only devs and devops. Kubernetes is not perfectly set up yet and the logging has failed sometimes but we ship code to production many times per day and are really happy about not having to care about infrastructure that much. Three nodes and Cloud SQL (PostgreSQL) goes a long way.

How much money do you save vs using Heroku?

Did you work at my former employer? Because that is exactly what happened there. No product launched, full k8s and CI/CD craziness.

It is sort of funny to hear people decrying architecting things properly from the start (for certain values of properly) vs the usual "do it half-assed and solve technical debt later" method.

I can recommend CaptainDuckDuck for simpler use cases: One-Click-Deployments (e.g. MySQL, PostgreSQL, MongoDB, ...), easy-to-use CLI tool, automatic HTTPS, Docker Swarm support (cluster and scaling), web GUI and open-source.



It made it so much easier to deploy new changes (just `captainduckduck deploy` and done). We also use minio.io (open-source S3) and are extremely productive with those tools.

It's perfect for web agencies. It's not sophisticated enough for k8s' use case, but it's extremely easy to use (you can just add nodes and it will restart containers automatically [1]).

[1]: https://captainduckduck.com/docs/app-scaling-and-cluster.htm...

From https://captainduckduck.com/docs/get-started.html

D) Disable Firewall: Some server providers have strict firewall settings. To disable the firewall on Ubuntu: ufw disable


Kasra from CaptainDuckDuck here.

Getting Started section is aimed for beginners. You surely don't have to disable firewalls entirely. There's a section in the docs outlining the ports that are being used by Captain.
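The "allow only the needed ports" approach could look roughly like this. Hedged sketch: 80/443 are the obvious web ports, while the other ports below are placeholders that should be replaced with the actual list from the CaptainDuckDuck docs:

```shell
# Instead of `ufw disable`, open only what the platform needs.
# 3000 here is a placeholder for the Captain dashboard port; check the
# docs for the full list (Swarm overlay/gossip ports etc.).
harden_firewall() {
  ufw allow 80/tcp
  ufw allow 443/tcp
  ufw allow 3000/tcp      # placeholder: Captain admin port per the docs
  ufw --force enable      # --force skips the interactive prompt
}
```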

This looks interesting. I hadn't heard of CaptainDuckDuck. Have you tried Dokku or Flynn (flynn.io)? I had looked at them and settled for Flynn because it had better support for multi-host clusters.

I looked at Dokku and Flynn for personal (single host) projects and find both to be too complicated for my needs. I generally prefer deploying Docker containers over Heroku buildpacks.

I ended up just running a single Digital Ocean droplet with Traefik as a loadbalancer/entry point, and then running each of my projects with docker-compose. (And building each project with Docker)

With Traefik I can set up reverse proxying for each project just by adding a few labels to its docker containers, and Traefik manages LetsEncrypt for me.
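Concretely, the label-based wiring looks roughly like this (Traefik 1.x-era label names, which is what was current at the time; image name and hostname are placeholders). With docker-compose the same labels go under the service's labels: key:

```shell
# Sketch: attach a container to Traefik by labels alone. Traefik watches
# the Docker socket and picks up the routing rule; no Traefik config edit
# needed. "web" is an assumed shared Docker network, myproject is made up.
run_project() {
  docker run -d --name myproject --network web \
    -l traefik.enable=true \
    -l 'traefik.frontend.rule=Host:myproject.example.com' \
    -l traefik.port=8080 \
    myproject:latest
}
```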

Flynn seems to have stalled, unfortunately. It was a pretty straightforward setup, and I got apps running on there rather quickly. There's still no real support for letsencrypt, which is kind of a killer.

Dokku has no support for clustering, but it's amazingly simple to set up and start deploying apps. (Digital Ocean has a pre-built and up-to-date image ready to go.) I like it for the heroku-style buildpack deployment, although it also has dockerfile-based deployment. Great little system for low-risk projects.

Tried Dokku in the past, but it was too barebones for me. Flynn also looked good, but it seemed too complicated (more features than we need) and we settled for CDD.

What functionality was Dokku missing for you? It has quite a few features now, and we're adding new functionality all the time.

Mainly a web interface. I never gave dokku-man a try, but it doesn't look as feature-rich as the web interface for CDD.

Is that something you'd be willing to pay for?

Do you mean theoretically? We already use CaptainDuckDuck. I could imagine giving a bounty for it, but I don't intend to switch anyway.

I mean theoretically, but makes sense.

Agreed 100x. I find it very hard to understand why some people talk themselves into thinking they need a distributed container orchestration and management platform, when they don’t have a container orchestration problem: they just want to run an app or two with a DB under it.

We need to go back and understand the problem we’re trying to solve, and for 99% of smallish companies and projects this is not container orchestration or hardware resource abstraction.

> an app or two with a DB under it

And most often, all that app or two is doing is converting JSON to SQL and SQL to JSON anyway. PostgREST would be pretty much enough for that.

You don't need to run the gcloud command to connect to a Cloud SQL instance. You can run the cloud_sql_proxy as a background process on your machine and then you can use mysql or psql to connect to it instantly.
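A sketch of that flow; the instance connection name and DB credentials below are placeholders (the real connection name is shown by `gcloud sql instances describe <instance>`):

```shell
# Run the Cloud SQL proxy in the background, then connect with plain psql.
# my-project:us-central1:my-db is a placeholder connection name.
sql_shell() {
  cloud_sql_proxy -instances=my-project:us-central1:my-db=tcp:5432 &
  proxy_pid=$!
  sleep 2   # give the proxy a moment to come up
  psql "host=127.0.0.1 port=5432 user=postgres dbname=mydb"
  kill "$proxy_pid"
}
```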

I guess your answer supports the parent's case: the happy path is a charm, but you need to learn a whole lot more to use it effectively. Paraphrasing the tag-line of Mesosphere, it's a new OS, one that is targeted at data centres, not small pet-server setups.

I've had good experiences using Rancher (https://rancher.com/) to operate k8s clusters. It provides a nice management interface, which handles common use-cases really well. Grabbing per-container logs or getting a shell into a container becomes trivially easy.

Admittedly, operation via Rancher comes with its own complexities: setting up security groups and IAM roles. These are documented of course, but as you say: minimum effort remains non-trivial.

Can't really speak to the cloudsql issue... that's not really kubernetes. Running SQL as a container you can just kubectl port-forward.... but it sounds like a specific issue you ran into?

I'll grant that there can be headaches that I don't really see anymore because I was deeply immersed in it for a while. But running a linux server by itself was hard till I learned that too.

Performance is a subtler point, not sure I follow what you're trying to say there. Container overhead? At least in my experience gke seemed responsive, and deployments settled pretty quick.

Tweaking is usually a strong point of k8s. Just reapply... so I'm probably misunderstanding what you were getting at here.

Kubernetes needs to improve its accessibility to newcomers, but the happy path isn't unheard of; for simple stateless apps it mostly just works. And when it doesn't, I can usually kubectl log and kubectl delete pod my way to a solution, and there are docs and Stack Overflow to help.

Not perfect but I was surprised it worked as well as it did.
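For reference, the port-forward route mentioned at the top of this comment looks roughly like this; the service name and credentials are placeholders:

```shell
# Forward a cluster-internal postgres service to localhost, then use a
# normal psql client against it. svc/postgres and the dbname are made up.
pg_shell() {
  kubectl port-forward svc/postgres 5432:5432 &
  pf_pid=$!
  sleep 2   # wait for the tunnel
  psql "host=127.0.0.1 port=5432 user=postgres dbname=mydb"
  kill "$pf_pid"
}
```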

> Can't really speak to the cloudsql issue... that's not really kubernetes. Running SQL as a container you can just kubectl port-forward

Best solution is Cloud SQL Proxy: https://cloud.google.com/sql/docs/postgres/sql-proxy

Where I work we adopted DCOS and it has been nothing but a pleasure to use and we have been updating the cluster quite often. Always very smooth and no pain at all.

We thought about k8s a couple of times but it always looked too over complicated and no gain compared to our use of DCOS

You are right in the large.

That said, once you learn all of these k8s concepts and mechanisms, and probably script nearly all of them, they all become like breathing and don't really take much time.

I frankly now am at the point where it is faster to spin up a small project on k8s than it would be to do so with the cloud provider's PaaS platform.

For context: I was extremely lucky to get to work with Caleb back in 2012-13. He introduced me to Go and my life has never been the same :D

Properly hosting custom apps on the web in a resilient way isn't easy. This being the case, it seems an error to interpret the thoroughness of the article as a signal that the proposed solution or situation is unreasonably complex.

I'll take detailed and exhaustive over sparse and hand-wavy any day!

Kubernetes certainly has a learning curve, just like most worthwhile things in life.

Hosting personal projects being the task at hand, super high availability isn't usually a headline requirement.

That being said, there are certainly existing solutions for doing that, for cheap or even free tiers with less effort - Heroku, Google App Engine, etc.


It can be rough to manage Kubernetes when you just want to focus on your app. I was having that same problem when building side projects and focused on building a platform to deploy dockerfiles on a managed Kubernetes instance.


Don't forget that K8s changes rapidly so whatever you learned a year ago might be completely useless now. So better keep those people dealing with K8s learning by all means or have fun with broken production.

That is not true, based on my experience. I started to learn and use kubernetes about a year ago (now we are running our production systems on kubernetes) and we are usually on the latest version. I cannot mention anything I learned about kubernetes that became useless because something changed.

It's less true now, but a lot of older blog posts and books (at the time, even the most recently published books) talked about managing via replica sets instead of deployments. That was a major source of confusion for me when I was getting started.

welcome to what it's like to work at Google

borg is easy compared to k8s and even a hosted service like gke.

tl;dr if you're a team of developers sitting around a table shouting at your monitor because K8s isn't working, hire someone to do the job right for you. I think you'll find it's pretty easy and utterly worth every penny invested in it.

> Edit: this does not mean k8s is bad, but rather that we are probably not the right use case for it!

Or it could mean you're not the right people for it?

Do you have a dedicated individual or team that represents infrastructure engineering? Are you a team of developers trying to play the Ops game? More often than not, people/companies that struggle to get K8s going or use it are a team of developers whose expertise is code and not infrastructure/automation.

K8s is hard when done from scratch (there's a book and a video course about that by THE K8s guy.) It's not hard when you use Kops, Tectonic (discontinued, I believe), EKS, or one of the other managed services. Writing a YAML file and launching it isn't hard. Also, you don't have to understand it inside out; you only have to understand how to use it. Can you tell me how every component in your microwave works, or do you just get on with cooking your food?

And to say you're not the use case for it - that's highly unlikely. I mean, it's possible, but it's unlikely. Your work loads and software solutions would have to be highly unique and or specialised because hey, it's not like Google's setup is simple, right?

But for personal projects, which is the context of TFA, you're not going to hire a k8s guru. As others have already said, as long as your k8s setup is working as expected, everything is fine (the happy path). But if there ever is a problem with it, then you're confronted with insane complexity, and said tools/setups on top of k8s won't fix it for you; they're even working against you as you won't have learnt the k8s foo for diagnosing wtf is the problem. And in my experience, problems occur rather sooner than later. For example, just the other week a customer of mine had their IPs exhausted on their Azure cloud. Azure and k8s will just silently fail with timeouts and all kinds of weird errors but won't tell you the root cause, and it took two experts, a fresh setup of the k8s cluster from scratch, and almost a week of downtime to fix it. It doesn't help that you only have limited diagnostics on other people's clouds either.

The content of the comment wasn't that of a personal project. It was the context of an organisation, so I'm addressing that concern directly.

> Do you have a dedicated individual or team that represents infrastructure engineering?

Great idea! We could call that team 'Operations' and they would do all that stuff so developers don't have to!

source? not finding anything for "the k8s guy" on google other than this post.

"Kubernetes the hard way author": https://github.com/kelseyhightower

thanks, I'm currently reading Kubernetes Up and Running.

> and every single time productivity dropped significantly

I'm skipping job postings which require k8s. It just tells me they're not competent, falling for the hype. For AWS, ECS is way simpler and is a fully featured orchestration tool. I had an interview today, where I tried to educate why simple is better, and why they don't need k8s. No luck :)

If you design your app properly you won't ever need to run SQL queries by hand. This is a bad practice and it is good that k8s discourages such behaviour. I agree with your other points though.

SQL is too low level?! I don't think you understand what is going on with the tools you use.

He's advocating using ORMs rather than writing raw SQL queries

Okay, now what does Kubernetes have to do with ORMs? Why does it even have a say in what applications we're running on top of it? Am I missing something?

In another tangent, I still use raw SQL queries in most cases over an ORM. That's just me. Maybe I'm a control freak, or I just don't know how to magically use the abstraction of an ORM and still get the most optimised results.

In a complex environment, that will not always work. We use an ORM at work for all the simple CRUD stuff, but when we want something more complex and performant, we end up writing SQL anyway.

(On a db with millions of daily transactions)

I am not. I am sorry if that sounded ambiguous. By "by hand" I meant SSHing into a container and then pasting a raw query into the SQL client. An ORM will not always help you achieve what you want, so writing queries is alright, but you need to have tests too. You can create a command for your app (like a Django command, for example) that performs the query, but you also create tests that prove it is doing what you think it is doing. Then you run this command in the cluster. This way you can replicate it to different environments etc.

I am happy to write SQL, but you need to write it in such a way that you have tests confirming your query does what you intend to. Then run a query as a short lived application on the cluster. You don't need to SSH even for low level stuff. I am sorry if I didn't make it clear.

That's a rather naive position. Over time one periodically needs to poke individual instances to see what's going on in them, even if just to ensure consistency between what the aggregation sees and what the instance is doing, before destroying the instance one has logged into.

You have no idea what you're talking about to make a blanket statement like that.

I think you are confusing the experience of a user of the app (who shouldn't ever really touch SQL) with the experience of a developer (who should know SQL inside out and use it in their day-to-day work).

"If you design your app properly" applies if you're talking about using k8s to deploy an app.

If you're doing individual development, however, it's quite likely that you're also doing development of the database, and need to do ad-hoc data analysis before you include those queries in that app.

Never needed to use SQL for debugging, or just generally checking that what you have in the database is what you expect?

"I have a problem doing X"

"Then don't do it. It only shows you're an idiot anyway."

Not a good answer.

I'm beginning to think that what Kubernetes needs most, by far, is a better development environment story.

1. Something that makes it easy to run the application(s) you're working on locally...

2. ...but have it interact with services in a local kubernetes cluster easily.

3. Something that makes your entire development environment effectively immutable, read-from-files the same way as the immutable infrastructure we're talking so much about.

4. Something that gives you a good start to running in production, so the first day of bringing your service up isn't a mess of trying to find/remember what environment variables/config options to set.

5. Something that's one tool, instead of a mixture of `minikube`, `kubectl`, `docker`, `draft`, and `telepresence`.

6. Something that's opinionated, and follows "convention over configuration", with conventions that make for the best future designs.

Basically, we need something with a Rails-like impact but in the Kubernetes ecosystem.

Docker ships with swarm, and swarm + a compose file does everything you just described.

Swarm is such a pleasant and easy alternative that comes out of the box with Docker, does 90% of what Kubernetes does, and does 100% of what anyone with less than 50 servers needs. The learning curve is also an order of magnitude easier.
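To sketch that workflow: a single compose file can drive both local development and a swarm deploy, since plain docker-compose simply ignores the swarm-only `deploy:` keys. Image names and ports below are made up.

```shell
# One compose file for both local dev and a swarm cluster.
# Image names and ports are illustrative.
cat > docker-compose.yml <<'EOF'
version: "3.7"
services:
  web:
    image: registry.example.com/myapp:latest
    ports:
      - "80:8000"
    deploy:
      replicas: 2            # swarm-only; ignored by plain docker-compose
      update_config:
        order: start-first   # start new containers before stopping old ones
  db:
    image: postgres:10
    volumes:
      - dbdata:/var/lib/postgresql/data
volumes:
  dbdata:
EOF
# Local development:       docker-compose up
# Production (swarm mode): docker swarm init
#                          docker stack deploy -c docker-compose.yml myapp
```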

Docker started supporting local k8s as a swarm alternative, I think from last month's release.

I'm reading that it's only for Docker EE or Docker Desktop (Mac/Windows).



I just installed Docker via package manager so I guess I never really learned about the organization's offerings in total and didn't know which product I had (CE for Linux). Is there a more up-to-date/better practice for using Docker CE with k8s than this guide dated March[0] using Minikube[1]?

[0]: https://blog.sourcerer.io/a-kubernetes-quick-start-for-peopl... [1]: https://github.com/kubernetes/minikube

Totally agree. Docker Compose is excellent for local development, and Docker Swarm mode uses the same file and is almost the same thing. With some minor additions in extra docker-compose files it's ready for production in a cluster. I don't get why Swarm is so underrated. For the same $5/month for a full Linux VPS with Linode or DigitalOcean (or even the free tier in Google Cloud) you can get a full cluster, starting with a single node, including automatic HTTPS certificates, etc. Here's my quick guide, taking you from a fresh Linux install to a prod server in about 20 min, including a full-stack app: https://github.com/tiangolo/full-stack/blob/master/docker-sw...

Fully agree. K8s is mostly beneficial for larger companies, because of namespace support and devops authentication. For smaller teams Swarm is just way more practical and versatile.

Docker (at least on my mac) now comes with kubernetes built in, so you can just click a button and instantly have a kubernetes cluster running.

As for an app deployment tool I've had good success with Helm. It allows you to define all the services, app, variables, etc. that your app needs. Though the templates do seem to be a bit convoluted and hard to read at times if you aren't careful.

+1 for Docker for Mac, -1 for Helm. Helm's templating is just too simple for the actual complexity of k8s, but it can make a decent starting point too.

I fully agree. I always found/find Helm underwhelming. Maybe I'm just using it wrong.

What I want is a Maven for Kubernetes. With transitive dependencies, strong conventions, injection of parameters etc.

The templating stuff feels very...unstructured to me. Every Chart is slightly different.

I share the same sentiment. Still looking for the holy grail, but the best lead so far is the combination of Bazel (excellent dependency support), its rules for wrapping good templating solutions (jsonnet, ksonnet, etc.), and some backend that integrates with infrastructure tooling to generate configurations to fetch into Bazel.

Helm + Kustomize [0] to add in whatever wasn't in the helm template works pretty well IMO. Kustomize is pretty new and not the most user friendly though.

0: https://github.com/kubernetes-sigs/kustomize

1. This is something I've found a bit painful, too -- this looks helpful: https://github.com/GoogleContainerTools/skaffold

2. Haven't tried this, but I'm aware of it - does this address your point here? https://www.telepresence.io/

3. Something like this? https://github.com/rycus86/docker-pycharm I agree that this could be very cool if further developed.

I agree that we'll start to see deeper integration across the development environment though, as more dev-hours are spent on building up the toolchain.

I migrated all of my services to k8s in the last ~6 months. The biggest hurdle was the development environment (testing and deployment pipelines). I ended up with a homebrewed strategy which happens to work really well.

# Local development / testing

I use "minikube" for developing and testing each service locally. I use a micro service architecture in which each service can be tested in isolation. If all tests pass, I create a Helm Chart with a new version for the service and push it to a private Helm Repo. This allows for fast dev/test cycles.

These are the tasks that I run from a "build" script:

* install: Install the service in your minikube cluster

* delete: Delete the service from your minikube cluster

* build: Build all artifacts, docker images, helm charts, etc.

* test: Restart pods in minikube cluster and run tests.

* deploy: Push Helm Chart / Docker images to private registry.

This fits in a 200 LOC Python script. The script relies on a library though, which contains most of the code that does the heavy lifting. I use that lib for all micro-services which I deploy to k8s.
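The script itself is Python, but the shape of the task dispatch can be sketched in a few lines of shell. Everything below is a dry run: the commands are only echoed, and the registry, chart paths, and plugin usage are all hypothetical.

```shell
# Dry-run sketch of the build/install/delete/test/deploy dispatcher described
# above (the real one is a Python script). Commands are echoed, not executed;
# names are placeholders.
SERVICE=myservice
VERSION=${VERSION:-0.1.0}

run_task() {
  case "$1" in
    build)
      echo "docker build -t registry.example.com/$SERVICE:$VERSION ."
      echo "helm package charts/$SERVICE --version $VERSION"
      ;;
    install)
      echo "helm install --kube-context minikube --name $SERVICE charts/$SERVICE"
      ;;
    delete)
      echo "helm delete --purge $SERVICE"
      ;;
    test)
      echo "kubectl --context minikube delete pod -l app=$SERVICE"
      echo "pytest tests/"
      ;;
    deploy)
      echo "docker push registry.example.com/$SERVICE:$VERSION"
      echo "helm s3 push $SERVICE-$VERSION.tgz private-repo"  # assumes the helm-s3 plugin
      ;;
    *)
      echo "usage: build|install|delete|test|deploy" >&2
      return 1
      ;;
  esac
}

run_task build
```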

# Testing in a dev cluster

If local testing succeeds, I proceed testing the service in a dev cluster. The dev cluster is a (temporary) clone of the production cluster, running services with a domain-prefix (e.g. dev123.foo.com, dev-peter.foo.com). You can clone data from the production cluster via volume snapshots if you need. If you have multiple people in your org, each person could spawn their own dev clusters e.g. dev-peter.foo.com, dev-sarah.foo.com.

I install the new version of the micro-service in the dev-cluster via `helm install` and start testing.

These are the steps that need automation for cloning the prod cluster:

* Register nodes and spawn clean k8s cluster.

* Create prefixed subdomains and link them to the k8s master.

* Create new storage volumes or clone those from the production cluster or somewhere else.

* Update the domains and the volume IDs and run all cluster configs.

I haven't automated all of these steps, since I don't need to spawn new dev clusters too often. It takes about 20 minutes to clone an entire cluster, including 10 minutes of waiting for the nodes to come up. I'm going to automate most of this soon.

# Deploy in prod cluster

If the above tests pass I run `helm upgrade` for the service in the production cluster.

This works really well.

thanks for the details, and sorry for the perhaps distracting question: how do you handle DNS for the servers on your foo.com? Did you have to provide nameservers to your registrar so K8s manages the DNS? This is something I don't see addressed so often in k8s tutorials, which usually assume minikube or GKE.

> thanks for the details, and sorry for the perhaps distracting question: how do you handle DNS for the servers on your foo.com?

No worries :) I use digitalocean for the nodes, storage volumes, etc. and namecheap to register the domains.

At namecheap I simply use the "Custom DNS" option for each domain to apply the digitalocean nameservers. I only have to do this once for every domain.

After that is done, I can use the digitalocean API, the "doctl" tool or the web-interface to handle domain records, subdomains, etc.

In order to connect a domain/subdomain to a k8s cluster, I point the domain/subdomain to the master node of the cluster. In each k8s cluster I use an "ingress" (nginx) [0] to handle the incoming traffic, directing each request to the right service/pod. You can think of it as a load balancer that runs within your cluster.

This strategy would work on other cloud providers (Azure, gcloud, etc) as well, I guess.

0: https://github.com/kubernetes/ingress-nginx
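To make the two halves of that concrete: an A record pointing the subdomain at the master node, plus an Ingress rule routing the host to a Service. The domain, IP, service name, and port are all hypothetical, and the doctl call is left commented since it needs real credentials.

```shell
# 1) Point the subdomain at the cluster's master node (hypothetical domain/IP):
# doctl compute domain records create foo.com --record-type A \
#   --record-name dev123 --record-data 203.0.113.10

# 2) Route the hostname to an in-cluster Service via the nginx ingress.
#    Service name and port are placeholders.
cat > ingress.yaml <<'EOF'
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: myapp
spec:
  rules:
  - host: dev123.foo.com
    http:
      paths:
      - path: /
        backend:
          serviceName: myapp     # Service the traffic is directed to
          servicePort: 8000
EOF
# kubectl apply -f ingress.yaml
```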

That was really helpful, thanks!

Yes, this! Not having a clearly recommended approach to development against k8 clusters is a huge barrier to entry!

I was really hyped about Azure Dev Spaces (https://docs.microsoft.com/en-us/azure/dev-spaces/), but it's Azure only, even though it sounds like something that should work on any k8 cluster..

Check out BOSH from pivotal labs.

Vagrant seems to be dead stupid simple: you download / git clone a folder, run vagrant up, follow up with vagrant ssh, and look, you're now on a Linux box where you can do whatever you gotta do. Using VirtualBox / Vagrant has worked perfectly for me so far. I'd love to give containers a shot, but if it's too much of a hassle I try to avoid it.

Vagrant doesn't have to compete with containers. In fact, my best experience with Vagrant has been to standardize a development environment that then uses containers inside.

One vagrant VM allows you to make assumptions about what software is installed on a dev desktop which is especially useful if you're writing documentation. Getting into the Vagrant VM from various OSes is the main problem but once you're in... Then you can install container software and teach others how to work in docker given one assumed OS.

Since Vagrant and containers encourage some form of infrastructure/documentation as code, then more experienced developers can read through what goes into the Vagrant set up and replicate on their native OS if they prefer. As someone writing documentation, I can forget about those more experienced devs and just write to one OS.

Once you have documentation that you can guarantee works when using container software, much of the headache can be swept under the rug for more junior engineers, which I think is the crux of the issue for others when using k8s.


You've hit the nail on the head here. Kubernetes is like the Java of cloud infrastructure, if interfacing directly with every vendor could be seen as C.

I'd very much like a Ruby/Python/Rust of cloud infrastructure to come along and clean up this mess.

Isn't that a PaaS such as Google App Engine?

That's more like Excel. I'm thinking something more like Heroku, but standardized/OSS.

Have you seen Dokku? https://github.com/dokku/dokku

Might be something like what you're after.

Dokku is great, I have been using it for years. The only downside is a lack of scaling/clustering.

Docker swarm could be the step after dokku for when you'd ever need scaling and the like rather than just a bigger VPS for when the first one starts to feel small.

I think you are confusing App Engine with Apps Script. Doesn't help the names are so similar... App Engine is like Heroku.

(I work for GCP)

Openshift is this. https://www.okd.io/

I don't think you're being fair with your comparisons to DO when it comes to price. The f1 micro $5/month Google servers are SO much worse.

Those "always free" f1 micros have 600mb of RAM, do not have SSDs and according to some tests[0] you only get 20% of 1 vCPU. There's also some pretty severe write limitations.

That's a really big difference vs DO's 1gb of RAM, SSD and 100% of 1 vCPU. The performance difference alone with SSD vs no-SSD is massive, especially on a database and while most web apps aren't CPU bound, 20% of 1 vCPU is bound to run into problems beyond a simple hello world that you set up in your example.

Also what happens if your app doesn't even fit in that 600mb? On DO you could get a 600mb app + 100mb OS + 150mb DB + whatever else running on that $5/month server since it has 1gb of ram.

In a real world application I wouldn't be surprised if that cluster performs way worse than a single DO server in the $5/month range. In some cases your app wouldn't even be able to run.

[0]: https://www.opsdash.com/blog/google-cloud-f1-micro.html

This is part of the reason why I use a similar setup as OP, but with a single n1-standard-1 instance (1 vCPU/3.75 GB RAM) instead ($7.30/mo w/ preemptible). The 3-node requirement of GKE is not a hard requirement apparently.

Of course, that's mostly for running Jobs; a single node + preemptible with a production app is not a great idea.

Great article!

I'm running a full-featured three-node cluster for less than $10/month on Hetzner Cloud. This kind of setup requires a little more effort, but it can be fully automated and you learn a couple of things about Kubernetes along the way. I published and maintain this guide[1], including provisioning using Terraform and a bunch of example manifests to get you started.

[1] https://github.com/hobby-kube/guide

One thing you might want to look at based on a quick scan over your guide is that, at the moment, it looks like you're running your etcd cluster with no authentication.

Any attacker who was able to get on to the pod network (e.g. if there's a vulnerability in one of the apps running on the cluster) could hit the etcd endpoint and dump the contents of the database, which generally includes sensitive information like k8s secrets.

Indeed, see https://github.com/hobby-kube/guide/issues/6

Edit: I think this was you.

ah yeah sorry I'd forgotten I mentioned it before.

How long did it take you to migrate from your previous infrastructure (including time spent learning)?

Although this is a valid question, it doesn't apply here. See, I try to teach myself something new every year, whether it's a new programming language, platform or whatever else.

It was curiosity that turned my non-existent infrastructure into having a cheap, small-scale Kubernetes cluster ready for my shenanigans.

My tiny cluster gives me great freedom. Setup is fully automated. When a new k8s release comes around, I simply tear down my old cluster and ramp up a new one in mere minutes. If I want to deploy a service or application, all I need to do is to write a couple of manifests and roll all the moving parts out. Hassle-free - no SSH, no additional costs, no accounts, no 3rd-party services, persistent storage and TLS certificates out of the box.

Currently, a Unifi controller is the only long-term resident on my cluster; other services and projects come and go. Besides that, the following agents are deployed as Kubernetes pods: Datadog (free tier) for basic monitoring & alerting and Papertrail ($5-10/month) which acts as a centralized log sink for all my hosts, NAS and networking gear.

Thank you for your answer. All this makes sense. I am trying to weigh how much of a migration would be cool-driven and how much of it utility-driven, for a small company. All in all, as of now, my conclusion would be not to migrate infrastructure to k8s without at least one devops engineer allocated to it.

This guide is great. Thank you!

Nice guide. The only problem I see with Hetzner Cloud is that they don't let you run an operating system like CoreOS (Container Linux), its fork Flatcar, or RancherOS.

>> You don't have to learn systemd; you don't have to know what runlevels are or whether it was groupadd or addgroup; you don't have to format a disk, or learn how to use ps, or, God help you, vim

That's a quote from the article. The thought that someone with that mindset is responsible for anything more than a network-connected waffle iron is terrifying. This article advocates for willful ignorance of Unix/Linux, because you don't need any of those things if you know k8s.

That said, k8s is nice for some things. I admin a large deployment at work, and it's relatively painless for developers to be productive with. One of the reasons for that stability isn't k8s itself though - it's the legion of seasoned Unix/Linux pros working alongside me. The reason "it just works" is because under the covers, there are a whole lot of us doing all the boring stuff to hundreds of physical hosts.

I'm sympathetic to the author's argument, UNIX administration is a full time job on top of whatever system you're using to do deployments. You leverage your ops team at work; is ops-as-a-service really all that different?

If one of my devs wants to deploy a rails app I don't expect them to suddenly be a sysadmin or network engineer or infosec expert. I expect their application to be well written but the rest of the stack is up to ops.

True, deploying stuff in an actual Unix environment while ignoring the need for these tools is the perfect recipe for unstable setups.

Although I guess for personal projects it could actually work as long as the load is low enough and when there is no data involved that needs to be protected. ;)

This article, and a lot of others, seem to be using an overloaded meaning of "personal".

All the arguments stem from the assumption that personal projects have the same uptime, scaling, and deployment requirements as some large commercial site. Some personal projects might but I'd argue most of those are really commercial in the sense that you're developing them for a portfolio or the like. They're not personal.

It makes sense to use business tools like kubernetes when you're making a demo business site. It doesn't make it a good platform for personal projects.

The best platform for those is hosting from home on a single computer, for free, with total control. ie, personal.

I disagree. Even for personal projects I want to be able to access them from my phone while out and about. Something running on my home computer (which is turned off when I'm not there!) doesn't help. Nevermind that my home internet isn't fast or reliable.

If you include running my own blog/mail/XMPP/matrix servers in "personal projects", then I would love to have perfect uptime.

Additionally, being able to reuse high quality helm charts here just makes things simple instead of having one set of instructions for "personal" use and one set of instructions for everyone else.

Is your ISP and service really that bad? If it is I think you have bigger problems you should address before working on personal projects.

I've been running my personal website from home for almost 2 decades on Comcast and I've never had a problem. I can always access my sites when I'm not home. And additionally I can access all my other media. Even when I only had a megabit or so of upload it was more than enough. Downtimes have been very rare and usually only for a couple hours once per year in the middle of local night.

That said, of course you can't run a mailserver from home. For that, because of residential IP blacklists, I run from a rented VPS. In fact I run webservers on them too, but only for hosting large files. It's easier to just cp whatever to ~/www than to always have to upload things.

> Is your ISP and service really that bad?

What I can get commercially? Yes. The best available to me is 14/1 ADSL, and I'm in a dense upscale urban area. As far as I understand it, this is a relatively common situation worldwide.

> If it is I think you have bigger problems you should address before working on personal projects.

What I have done is set up my own wireless link using Ubiquiti dishes to a nearby location with a 100mbps net connection. But this is hugely unusual: both to have building access to add a dish to my apartment building roof, and to have a Line-of-sight location to connect to, that also lets me set up a dish on their roof, and connect it to their network.

> I can always access my sites when I'm not home.

I used to try and do that (subject to internet quality), but then I realised I was spending $50 a month on power bills for leaving my desktop powered on 24/7. The cost of a $5 dollar VPS is far less.

Why would you want to be able to access your personal projects from your phone? What context requires this over waiting and solving the problem at home?

What about work life balance? (yes, it's a personal project, but still)

A personal project might be something that is useful in your day to day life (e.g., a notes app, todo list, or calendar), or something that you want to be able to show to friends/family when you see them in person (e.g. a photo album, or something related to a mutual hobby).

Sometimes the personal project is an app: it has an API it talks to.

Sometimes the personal project is a website: it has progress tracking that I want to check on.

Sometimes the personal project is for home automation, I want to check the status of it or tell my heater to turn on remotely.

i have a paperless filing system that i frequently access from the office for things like scans of bills that aren’t yet electronic, etc. it’s fantastic to be able to access it all (via VPN; never direct) wherever i like

Often personal projects are for learning new technologies and applying them in a meaningful way. E.g. many people want to run their personal projects on Kubernetes to play around with Kubernetes and understand how apps can be operated with it.

I'll admit that Kubernetes Is a Surprisingly Affordable Platform for Personal Projects when your personal project is to learn how to use Kubernetes.

Well yeah, I'm sure the time/money will pay for itself when he's looking for a job. I know I look favorably at homelabbers during technical interviews and DIY is often the only time you can get exposure to $new_hotness at some jobs.

> The best platform for those is hosting from home on a single computer, for free, with total control. ie, personal.

Or just get a cheap DO box. You can run quite a lot one, and have it always accessible, on the internet, etc...

how do you get a 'cheap DO box'?

I believe he is talking about digitalocean.com. Their smallest offering is $5/mo for 1vCPU, 1GB ram, 25GB ssd, and 1TB transfer.

Huh. This competes with a raspberry pi plugged into a home network (and is likely more reliable).

It really depends on your personal project.

Kubernetes is a complicated, complex beast. Even on managed K8s on a cloud provider, you still need to learn a lot of stuff that has absolutely nothing to do with what you intend to accomplish (unless your personal project is learning to use K8s).

Most of my personal projects get deployed on Google's App Engine. It's easy, simple and unbelievably cheap.

There was a fantastic presentation at the last Google I/O about riding as much as you can on free tiers:


Sure, if you put enough effort in you can make anything usable and cheap. But why put effort into that instead of investing it in building a system that naturally fulfills such requirements?

Kubernetes was never intended as a system for hobbyists, but for bringing more users onto google cloud. At that it is incredibly successful. But in exchange a lot of other stuff it does really badly.

It is much better to start with a single-node system in mind that expects getting more nodes at some point and develop from there. This will bring in a lot of hobbyists and really carves out how to solve the developer story. Then this will grow automatically with the size of popular projects building on it.

That said, of course it's nice if people invest energy into finding other usecases for k8s as well. But it shouldn't be the community's main effort.

It looks like it isn't surprisingly affordable on Amazon. If I'm reading this right, EKS is 20c an hour per cluster, or about $150/month, and that's before EC2 costs for the actual machines in the cluster.


You can run Kubernetes on AWS without EKS, and if you use t2.micro instances it would be around $28 per month using the load balancing technique here (DNS-based rather than an actual LB). $21 a month if you account for the free tier (and this article uses Google's free tier).

If you further use spot instances (the equivalent here to "preemptible") you can save a ton more.

The real savings (about $7 per month) is that on GKE you won't need to run your own control plane node. Google does that for you for free.

Edit: another issue here (and I have no idea if this is true for GCE like it is for AWS): t2 instances will throttle you if you use too much CPU, which can be a problem when running a lot of containers.

Sure but the context is personal projects or portfolio/small business sites. A VPS on a discount provider such as Hetzner or OVH costs around 3-4 USD/EUR per month. A redundant VPS (two VPS and Ceph) costs 10 EUR on OVH. Not to speak about the complexity of setting up and running k8s. You'll be needing a second k8s development/testing environment for staging as well. So, no, I don't think k8s makes sense economically.

You don't need EKS. Just use kops to build your cluster on plain EC2 instances.

EKS is indeed expensive. I started creating https://github.com/cloudboss/keights for automating Kubernetes stacks before EKS was a thing, but I still prefer it over EKS. You can use it to set up a cluster using plain CloudFormation, or with an Ansible role which wraps the CloudFormation stack.

I tried out Kubernetes (managed with kops) for a personal project. I had a single websocket endpoint. I wanted to terminate SSL at the load balancer. Found this to be extremely complicated and the troubleshooting resulted in a ~$100 AWS bill.

The k8s system containers are quite resource hungry. The author is clearly very good at k8s (I enjoyed the article), cos running it cheap ain't easy

I'm surprised in all of these posts about k8s on HN the Nomad project (by Hashicorp) rarely comes up. I've found it an absolute joy (yeah, I said it) to use and a breeze to setup. Unless you're setting up a container orchestration system where you want to customise every single aspect while you're setting it up, my guess is you're better off with Nomad at first. At the very least you'll learn what you need from a container scheduler and can make a more informed decision if you re-evaluate your options down the line. Nomad also integrates perfectly with Consul and Vault, so you're never left wondering "is this is how I'm supposed to do it?" which is something that happens if you're just starting out.

URL: https://www.nomadproject.io/

Check out that hashicorp blog over the last three weeks. Consul integrates tightly to kubernetes now. Vault already did.

Consul only integrated "tightly" with kubernetes in the past few weeks. The vault integration is pretty barebones, and anything done there has been spearheaded by the Kubernetes community.

I had a project about this size on GKE, in an attempt to get a feel for Google's hosted Kubernetes product. I went on vacation for two weeks while a container was printing an error (unbeknownst to me). Google wanted to charge me ~700 dollars for StackDriver logging volume (a feature I was not aware was connected to my Kube pod stdout).

Google Cloud refused to refund or provide any guidance on how to entirely disable / purge StackDriver, so I'm back on AWS and won't be recommending anyone move to GKE any time soon (it's extremely easy to suddenly multiply your bill by more than 100x...)

With my hand-built Kube on AWS I have 15gb assigned for my ELK logging cluster, and it costs me nothing because it falls under the free tier. Looking forward to trying out Digital Ocean's product next!

to be fair, if you have CloudWatch and send logs there it can get pretty expensive too

From my experience it's quite expensive, at least in GKE, as Kubernetes takes a lot of CPU & memory by itself. I run a pool of one n1-highcpu-2 node (2 vCPUs, 1.8 GB memory) and it costs me about 72 euros per month.

kube-system is taking about 730 mCPU of the node's 2000 mCPU.

The issue is that kube-system is deployed on each node of the pool (tell me if I've misunderstood/misconfigured something). If you have a pool of 3 nodes with 1 vCPU each, and kube-system takes approximately 700 mCPU on each, you have only 300 * 3 = 900 mCPU allocatable.

If you have any tips on how I could reduce the costs of my personal projects I'm listening !

I just saw a change the other day which makes system pods fit better on nodes with one vCPU. It reduced their numbers by the same factor (i.e. keeping the same proportions). It was meant for GCE and I am not sure if/when it will make it to GKE, but it looks like Google is looking at the tiny node scenario. Also, Tim Hockin led a discussion about something similar at last year's Kubecon.

Here it is. It's in 1.12, actually, not on HEAD as I recalled:


Still not sure if these are the configuration files that GKE uses behind the scenes, but the author is a Google employee.

When creating the cluster, have you tried disabling the Kubernetes dashboard and the StackDriver monitoring and logging services?

For logging, you could use FluentBit which is more lightweight: https://docs.fluentbit.io/manual/output/stackdriver

As a data point, on a busy GKE production cluster with 12 highcpu-16 nodes we have kube-system pods taking a total of ~700mCPU on all servers combined. I'm not sure how the CPU usage scales, but probably not linearly.

Most of the CPU usage is kube-dns, which I think requires much less CPU if you make few internal network connections.

I've seen a few comments here about k8s using a lot of CPU by itself. Do you known exactly what it does that uses a lot of CPU?

Or you can push your webapp to App Engine standard and it's $0 per month =) App Engine requires you to learn some new stuff, but so does k8s.

Not the right choice for all apps, obviously, but something to compare this setup to.

I need to host a web API and Postgres database in a high availability configuration. I only need a single instance each of the web API and database, but uptime is crucial.

I'd like to use containers, partially because it gives an easily reproducible environment for running locally.

For such a small setup, what's the recommended way of handling this, such that I get automatic failover and can easily do zero-downtime updates?

I did look a bit into k8s, but quickly came to the conclusion that it's far too complex for this.

Oh, I'm also constrained to using the Azure cloud, if it matters.

No-one has mentioned Convox yet: https://convox.com

Kubernetes is great, but Convox is way easier to set up. It's much closer to having Heroku in your own AWS account. It also leverages a lot of AWS services (ECS, RDS, etc.) instead of re-inventing the wheel. The only downside is that it only supports AWS.

I'd recommend Convox for personal projects, but it can be a bit expensive because you have to run a minimum of 3 EC2 instances.

In the past I've had similar success with Cloud 66.

I set up a single-node cluster for a personal project on DO and it took me some time, but it's been up and running fine for over a year. I used a decently beefy instance ($20/month on DO) because the Kubernetes processes are too resource-intensive for the $5 instance. It runs several processes (the main Python app and a helper Go-based API), automatic SSL renewal, a daily cleanup CronJob, etc.

I really like being able to push a completely new container whenever I push to master - this is done with CircleCI, which builds the image, pushes to Docker Hub, and just does the kubectl apply. I couldn't think of any other way to do it that wouldn't result in some downtime if I had to change or upgrade packages.

I wrote about it here a bit: https://hackernoon.com/lessons-learned-from-moving-my-side-p...

> I couldn't think of any other way to do it that wouldn't result in some downtime if I had to change or upgrade packages.

https://github.com/helm/helm ?

You push the new version of your app as a helm chart to your helm repo and then run:

    helm upgrade my-app my-repo/my-app

No - I meant outside of Kubernetes, in the past (sorry, the wording in my post was a bit confusing). My app used to be something I would ssh into and then do a git pull on, but whenever I had to upgrade packages it would be tricky. Then I tried Docker with docker-compose and had similar issues with environment variable changes. Kubernetes is great; all I do is kubectl apply -f new_config.yml and everything just works without downtime. I only very recently learned about Helm when setting up automatic SSL renewal, but it looks cool too.
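The zero-downtime part comes from the Deployment's rolling-update strategy. A minimal sketch of what a new_config.yml along those lines might contain (names, image tag, port and probe path are made up for illustration):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 2
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 0   # keep old pods serving until new ones are ready
          maxSurge: 1
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
          - name: myapp
            image: myuser/myapp:1.0.1
            readinessProbe:   # traffic only shifts once the new pod passes this
              httpGet:
                path: /healthz
                port: 8080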

In general I really don't think k8s is the monster people make it out to be. I did all this on the side over the course of a few weeks, a few hours here and there. I know I'm only scratching the surface of what is possible, but everything just works fine, and my .yaml files are pretty self-explanatory.

I'm trying to set up a deployment pipeline, pretty much from scratch. That also includes picking source repository software. Self-hosted is a must for every component, so I chose GitLab, and after that I dived into it. Oh boy, was I in for a surprise. For CI/CD one apparently needs Docker, Kubernetes and whatnot: https://docs.gitlab.com/ee/topics/autodevops/ But the alternatives don't look any better either.

So, fellow HN'ers, my requirements ATM are quite simple: submit code via git, run automated tests, and if all is good, put the files onto the live system. Pull requests are a must, and everything must be self-hosted, because the cloud is just another person's computer. Code is written in Python. What would you recommend? Edit: grammar.

I don't mean this to be snarky at all, but I'm stunned that you think a distributed container orchestration platform (which is what Kubernetes is) is needed to do CI/CD, even when self-hosted. It shows, I guess, how hazy these topics still are. Have a look at Jenkins or GitLab or Bamboo or any of the dozens of other well-known CI servers out there. They will pull, test and build your code. Then look at something like Dokku or Capistrano or even Ansible to take care of putting builds on servers. With a little bash scripting you have a full pipeline. No Kubernetes needed at all.

Believe me, I don't want to dive into Kubernetes and learn a beast like that just for CI/CD, but it says so in the GitLab documentation referenced in my original comment.

The documentation link you shared was for Auto DevOps, a different (but related) feature from CI/CD. It's a build and deployment solution specifically for Kubernetes/containers, which is why it has those requirements.

GitLab's CI/CD documentation can be found at the link I shared; it can build and deploy all kinds of software, and does not require k8s.

Hey there, I'm the product manager for CI/CD at GitLab. Auto DevOps is one of our more advanced features for getting up and running in a k8s environment quickly, but it is not required if you just want basic CI.

Our documentation on getting started with CI/CD is here: https://docs.gitlab.com/ee/ci/
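For the Python requirements described upthread (run tests on push, deploy to a live server if they pass), a minimal .gitlab-ci.yml might look something like this - a sketch only, with the server, user, paths and service name all placeholders:

    stages:
      - test
      - deploy

    test:
      stage: test
      image: python:3
      script:
        - pip install -r requirements.txt
        - pytest

    deploy:
      stage: deploy
      only:
        - master
      script:
        - rsync -az --delete ./ deploy@myserver:/srv/myapp/
        - ssh deploy@myserver 'systemctl restart myapp'

A shared or self-hosted GitLab Runner picks up these jobs; no Kubernetes involved.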

I might be wrong about this, so feel free to correct me. Since you're here...

Auto DevOps has a good example of the GitLab licensing that frustrates me: the "Support for multiple Kubernetes clusters" feature. Running development and production in separate environments seems like a best practice. A mistake that brings down a cluster is exactly the type of thing you'd want to catch in development or testing, right? However, IMO, if you're a small developer using Core or Starter, GitLab is forcing you to do something subpar by using the same cluster for development and production.

I think "Environment-specific variables" is the same situation where Core and Starter users are lead into bad practices. For example, ASP.NET Core uses ASPNETCORE_ENVIRONMENT which (unfortunately) defaults to "production". The lack of environment specific varaibles means people will put that in the job config of .gitlab-ci.yml.

It also means protected pipelines will be given all the variables needed for deploying to all environments (e.g. DEVELOPMENT_DB, PRODUCTION_DB). I'm sure you can see where this is going, but if you're using something like EF Core migrations, all it takes is for someone to mess up the (e.g.) "deploy_dev" job's ASPNETCORE_ENVIRONMENT variable on a protected branch, and the CI build ends up in "Production" mode with access to all the variables needed to break a production DB.

GitLab Flow has a diagram showing master, which is a protected branch, deploying to staging, with another branch used for production. I like that workflow, but it means all deployable branches end up being protected branches (I think that's the intent?), which in turn means production credentials end up exposed everywhere but feature branches.

I don't know how protected variables interact with environment specific variables, but, ideally, I'd want to have ASPNETCORE_ENVIRONMENT be an environment specific variable and PRODUCTION_DB be both environment specific AND protected. The more safeguards the better and navigating into the "production" environment variables page is a good way for me to get into "be super careful" mode.

I think small developers will tend to be the ones where people merge their own changes, so they're especially vulnerable to the type of mistake I just described.

Thanks for the feedback Don.

For your particular use case, you could use the same cluster for multiple environments if you have separate namespaces and use a separate service account for each. Using RBAC will further ensure you don't have any collisions; this way, if there's an error in development, the production namespace will remain untouched.
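A sketch of that separation with kubectl (the namespace and account names are placeholders):

    # One cluster, two isolated environments
    kubectl create namespace development
    kubectl create namespace production

    # One deployer service account per environment
    kubectl create serviceaccount deployer -n development
    kubectl create serviceaccount deployer -n production

    # RBAC: each account can only edit its own namespace, so a bad
    # deploy to development cannot touch production objects
    kubectl create rolebinding deployer-edit --clusterrole=edit \
      --serviceaccount=development:deployer -n development
    kubectl create rolebinding deployer-edit --clusterrole=edit \
      --serviceaccount=production:deployer -n production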

When deciding which features are paid vs free we ask ourselves if the particular feature may be more relevant to larger organizations; that doesn't necessarily mean there won't be a use case at smaller companies, but just wanted to let you know our reasoning. You can read more about it here: https://about.gitlab.com/stewardship#what-features-are-paid-....

I understand it's tough to tier the features, and I almost wouldn't have a problem with all the K8s stuff being in a higher tier. I'm not really the target market for it. I just want to learn about it because I think it could trickle down to smaller developers in a few years.

The environment specific variables are more frustrating for me because I would consider good environment isolation a necessity, rather than something that's subjectively better / worse depending on the size of an organization.

I think everyone driving deployment from CI should, at a minimum, start with two environments - 'dev' and 'prod'. The number of environments would have been a much better feature to tier, IMO. I don't really need 'test' or 'staging'. To me, those extra environments seem more practical for a company that has a strategy for promoting deployments as part of its QC policies.

CI in particular is an area where I think the product tiers might be a mistake. The goal should be getting everyone into the mindset of "use GitLab, follow their opinion for CI, and you can one-click deploy to and scale on X, Y, or Z". The promise of cheap, consistent deployment options with the ability to "throw money at it" to scale up is what makes sense to me. It's unlikely I'll ever have a product that gets popular enough to need the scaling features, but that's what everyone dreams about, so it's easy to capture the low end with the promise of low/no upfront costs and the ability to scale if you win the lotto.

Thank you! I will look into that. OTOH, this is how I came to the conclusion that I had to go the Auto DevOps way: this is my first installation of GitLab ever, so naturally, after somehow successfully installing it on the second try, everything seems to work. So - let's explore the configuration options. For example, we should disable registration, since we don't want to be a public GitLab host. After a while, past the Pages section, we come to a section named "Continuous Integration and Deployment", and there's a checkbox named "Default to Auto DevOps pipeline for all projects". There does not seem to be any other kind of "Continuous Integration and Deployment" available, so one naturally comes to the conclusion that Auto DevOps is the only way. My bad.

Hi ratiolat, another GitLab PM here. Thanks for the feedback; as jl-gitlab mentioned, we'll work to make that clearer. You can think of Auto DevOps as a CI template that has a job for every DevOps stage built in. If you want to experiment with Kubernetes first, you can easily add a Kubernetes cluster from GitLab and experiment with CI with or without Auto DevOps. More info here: https://docs.gitlab.com/ee/user/project/clusters/

No worries, I understand how you could come to that conclusion. I've opened up an issue to see if we can make that more clear: https://gitlab.com/gitlab-org/gitlab-ce/issues/52144

There are _many_ self-hosted alternatives to GitLab. Nothing against it, just saying :) https://gogs.io/ https://gitea.io/en-us/ <= The Golang ones are deployed with: copy the binary, run the binary.

Sounds like you could get away with a bash script that calls rsync and then restarts the code on the server over ssh. (I deploy one of my websites this way; simple & reliable.)
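Something like this is usually enough (the host, path and service name are placeholders):

    #!/usr/bin/env bash
    set -euo pipefail

    # Copy the working tree to the server, then bounce the app
    rsync -az --delete --exclude '.git' ./ deploy@example.com:/srv/myapp/
    ssh deploy@example.com 'sudo systemctl restart myapp'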

If you're looking for something more legit: Ansible

Heroku. It is everything you want.

I'm quite happy with gitea + drone currently.

app engine + cloudbuild.yaml?
