You might not need Kubernetes (jessfraz.com)
307 points by tannhaeuser 3 months ago | 308 comments

Some day I would like a powwow with all you hackers about whether 99% of apps need more than a $5 droplet from Digital Ocean, set up the old-fashioned way, LAMP --- though feel free to switch out the letters: BSD instead of Linux, Nginx instead of Apache, PostgreSQL instead of MySQL, Ruby or Python instead of PHP.

I manage dozens of apps for thousands of users. The apps are all on one server, its load average around 0.1. I know, it isn't web-scale. Okay, how about Hacker News? It runs on one server. Moore's Law reduced most of our impressive workloads to a golf ball in a football field, years ago.

I understand these companies needing many, many servers: Google, Facebook, Uber, and medium companies like Basecamp. But to the rest I want to ask, what's the load average on the Kubernetes cluster for your Web 2.0 app? If it's high, is it because you are getting 100,000 requests per second, or is it the frameworks you cargo-culted in? What would the load average be if you just wrote a LAMP app?

EDIT: Okay, a floating IP and two servers.

As somebody who has his own colocated server (and has since Bubble 1.0), I definitely agree that the old-fashioned way still works just fine.

On the other hand, I've been building a home Kubernetes cluster to check out the new hotness. And although I don't think Kubernetes provides huge benefits to small-scale operators, I would still probably recommend that newbs look at some container orchestration approach instead of investing in learning old-school techniques.

The problem for me with the old big-server-many-apps approach is the way it becomes hard to manage. 5 years on, I know that I did a bunch of things for a bunch of reasons, but I don't really remember what or why. It mixes intention with execution in a way that gets muddled over time. Moving to a new server or OS is more archaeology than engineering.

The rise of virtual servers and tools like Chef and Puppet provided some ways to manage that complexity. But "virtual server" is like "horseless carriage". The term itself indicates that some transition is happening, but that we don't really understand it yet.

I believe containers are at least the next step in that direction. Done well, I think containers are a much cleaner way of separating intent from implementation than older approaches. Something like Kubernetes strongly encourages patterns that make scaling easier, sure. But even if the scaling never happens, it makes people better prepared for operational issues that certainly will happen. Migrations, upgrades, hardware failures, transfers of control.

"5 years on, I know that I did a bunch of things for a bunch of reasons, but I don't really remember what or why."

For my home servers, I've settled on "a default install of distro $X and an idempotent shell script that sets everything up for me". You have to use discipline to do everything in the shell script rather than simply fix the problem, but if you can do that, you end up with documentation as to how your server differs from a default install, and the ability to recover it again reasonably well if you store it in git somewhere or something.
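The trick that makes this work is that every step checks state before changing it, so re-running the script is always safe. A minimal sketch of the pattern (the config file, option, and package names here are illustrative, not from any real setup):

```shell
#!/bin/sh
# Idempotent setup sketch: each step is a no-op if its work is already done.
set -eu

CONF=/tmp/demo_sshd_config   # stand-in for a real config file

# Append a config line only if it is not already present.
ensure_line() {
    line=$1 file=$2
    grep -qxF "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Install a package only if it is missing (Debian-flavoured example).
ensure_pkg() {
    dpkg -s "$1" >/dev/null 2>&1 || apt-get install -y "$1"
}

rm -f "$CONF"                                # clean slate for the demo
ensure_line "PermitRootLogin no" "$CONF"
ensure_line "PermitRootLogin no" "$CONF"     # second run changes nothing
```

Run it as many times as you like; the file ends up with exactly one copy of the line either way.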

It's only "reasonably" well because when you have one server running for years at a time, your script decays more quickly than you are going to fix it. If your server goes down three years later, and you decide to go with the latest $X instead of whatever you used last time, then your script will be out of date and need to be updated. It isn't nirvana. But it's the best bang for the buck when you're in a situation where chef/ansible/puppet/etc. is massive, massive overkill.

If you're already an expert with Docker, go nuts, but IMHO it's a bit silly to run a server just to run two Docker containers, just so you can say you're running Docker or something. Plus, no matter how slick Docker has gotten, it's still more of a pain than just setting a few things up.

From my perspective, a Dockerfile already is an idempotent shell script that sets everything up for me. With the advantage that I can easily write and run tests for it that verify that the app comes up just fine.

The main struggle for me there is existing apps that weren't made with Docker in mind. There, using the OS install tools can be easier. But I think that's changing. The Docker Postgres images, for example, let you configure key things via simple environment variables: https://hub.docker.com/_/postgres/

So I expect that we'll continue to see more and more apps provide Dockerized versions, gradually chipping away at the advantage built up over the years by OS packaging.

Huh, I haven't found Docker to be a pain at all now that I sort of vaguely have an idea of what I do.

A Dockerfile takes maybe ten minutes, and is really documentation more than anything.

That with a tmuxp yml file to set up a tmux session for developing can pretty much outline both how the product is released and how it's developed for anyone coming into the project.

Pretty neat, super easy, very cool.

I'm not really doing Docker to say I'm doing Docker, but because once I realized how easy it is to containerize things, it's not much more than a few steps to have a development environment as well as a production environment, even for my crappy little website.

> That with a tmuxp yml file to set up a tmux session for developing can pretty much outline both how the product is released and how it's developed for anyone coming into the project.

Would you mind sharing more about how your team uses tmuxp? Sounds like an interesting alternative to a README for shared configuration etc.

Hey pcl! I discovered tmuxp relatively recently, and I'm between jobs at the moment, but I'll tell you that for my personal projects I can look at my yaml file and immediately see that there's a gulp dev command which is run in the front-end directory, a sync bash script which is run, and a gmake run which runs the server.

It's nothing groundbreaking, but it's nice to have it all laid out and it's possible if I got to the point where someone else was working on the same project they'd find it useful to know these three commands without having to wonder why their static assets weren't updating on change, or why make didn't work.
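For reference, a tmuxp session file along those lines might look like this (the directory names and commands are reconstructed from the description above, so treat them as placeholders):

```yaml
session_name: myproject
windows:
  - window_name: dev
    layout: tiled
    panes:
      - shell_command:
          - cd frontend
          - gulp dev
      - shell_command:
          - ./sync.sh
      - shell_command:
          - gmake run
```

One `tmuxp load myproject.yml` and all three processes are running in visible panes.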

I think wherever I end up I'll likely start creating tmuxp files and possibly docker files for any repos I work in, mainly so it's super easy for me to hop on a terminal, type one command, and have a whole environment to work in. It is pretty neat to have a server start, a watch, a sync, and two windows for vim for front and back end.

How can Ansible be "massive overkill"? It's literally an interpreter of scripts, just like sh or bash. It doesn't require daemons or other infrastructure, just connects over SSH and runs the script.
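For what it's worth, a complete Ansible "script" is just a YAML playbook run with `ansible-playbook`; the host group and package names below are placeholders:

```yaml
# playbook.yml -- run with: ansible-playbook -i inventory playbook.yml
- hosts: webservers
  become: yes
  tasks:
    - name: Ensure nginx is installed
      apt:
        name: nginx
        state: present
    - name: Ensure nginx is running
      service:
        name: nginx
        state: started
        enabled: yes
```

No agents on the target machines; the control machine SSHes in and applies the tasks.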

Docker image definitions are idempotent as a matter of principle. Creating an idempotent shell script is non-trivial IMO - e.g. what return code does package manager XY return when something is already installed, etc.

Really? What do your dockerfiles look like? Most of the ones I've seen in the wild look something like:

    FROM debian:jessie
    RUN apt-get install -y somepkg
What happens when the "debian:jessie" image changes (as it does weekly on dockerhub)? What about "somepkg" in debian's repositories?

The answer is that `docker build` will produce a different image. In fact, very few docker image builds I've seen are idempotent. They're not declarative, they're not reproducible; merely the output (the Docker image itself) can be run reproducibly. The actual image definition, not so much.

Creating idempotent shell scripts is no harder than creating an idempotent dockerfile. Both are the same problem. A dockerfile is almost entirely the same as a shell script; it copies files around, it runs commands in an environment, and that's all.

That's why you version pin and vet new updates before you let them in.

Okay, so walk me through how to do this in a dockerfile?

My first line becomes:

    FROM debian@sha256:14e15b63bf3c26dac4f6e782dbb4c9877fb88d7d5978d202cb64065b1e01a88b
Okay, that's easy.

Now, what about older versions of packages in debian's apt repos that have been deleted? How do I get those?

I run my own apt mirror, I guess, which I update in lock-step with my Dockerfile, and thus don't let the Dockerfile reach out to the network.

Is this any different from what you do in a shell script on a server? You use btrfs/zfs/whatever to snapshot the initial version and back it up, you run an apt repository so you can pin package versions, you snapshot before and after updating...

I don't see how a docker image definition makes any of this easy. There's not even a flag to disallow network access during "docker build".

The claim I'm responding to is "docker image definitions are idempotent as a matter of principle".

The large majority of dockerfiles I've seen are not idempotent. Yes, it's possible to make them idempotent, but they do not make it easy.

> Now, what about older versions of packages in debian's apt repos that have been deleted? How do I get those?

You can version pin your apt packages if you need to; I personally prefer the minor patches so I get my security updates. My build tool will catch it if there's a bug affecting my software.
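Concretely, apt pinning in a Dockerfile looks something like this (the base-image digest is the one quoted earlier in the thread; the package name and version are placeholders):

```dockerfile
FROM debian@sha256:14e15b63bf3c26dac4f6e782dbb4c9877fb88d7d5978d202cb64065b1e01a88b
# Pin the exact package version: the build fails loudly if that version
# disappears from the mirror, instead of silently picking up a newer one.
RUN apt-get update && apt-get install -y somepkg=1.2.3-1
```

This only stays reproducible for as long as the pinned version remains in the repo, which is the sibling commenter's point.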

> Is this any different from what you do in a shell script on a server?

Yes, because I can take that built image and deploy it to any host, and all my developers get to use the same one in their development. But hey, if you like building with shell, you could try out Packer and run that shell script to create an image, which can safely be used on any host that supports Docker or Kubernetes.

> I don't see how a docker image definition makes any of this easy. There's not even a flag to disallow network access during "docker build".

Easier depends on your goals and perspectives. For me, it's easier to write a Dockerfile that installs what I need to run a service. Bash doesn't have that; it's just a script that needs an environment to run. Where do you run bash? Is it locally on your OS, with your packages, your settings, and your needs? What happens when I run that bash script on a different OS? Who's going to debug that? Are you going to track your changes in version control? How do you update the other servers/users who use your script? I've got other fun things to do than worry about that.

What you describe is still not an idempotent build process, which is all I'm arguing against.

I'm happy to admit Docker images are more portable than a declarative shell script's output.

You're arguing against something I'm not saying. I'm talking about how easy it is to make script/docker-image-definitions idempotent, not about their usability, not about their distribution.

When I wrote "is this any different from what you do", I meant "what you do to make it idempotent", not is the resulting artifact and usability any different.

Same with "any of this easy", "any of this" was "idempotency", not anything else.

Everything you are arguing against is a strawman based on misreading the intent of my comment I think.

A Dockerfile is an input which produces an image as an output. That image should not suffer from the bit rot examples you gave (e.g. "what about older versions of packages in debian's apt repos that have been deleted?")

However, when security patches are released, your image obviously will not contain them.

I am not arguing that the docker image output is mutably changing. It is a good artifact that can be reproducibly run.

The comment I am originally replying to is 'docker image definitions are idempotent'. Note, 'image definitions', not 'images'.

My point has nothing to do with the image, but with the image definition itself.

Understood, just trying to point out there is still a flaw with the image (in that updates are actually important!)

FWIW at my work, we don't use apt for installing packages. We compile the packages as a part of the Docker build. This generates mostly idempotent builds.

This. Kubernetes (or whatever other container scheduler) might feel like overkill, but if all they do is force you to adopt a container-centric / 12-factor way of building your applications it was worth trying them. And once you've adopted that workflow it's a no-brainer to go from a single node to a cluster which will dynamically allocate the workloads it runs.

Running a small container cluster at work has even changed how I setup single-host projects in my spare time: I will build everything into a container, bind-mount whatever it might need, create a simple systemd unit that just runs / rms the docker container on start and stop. Bliss.
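For anyone curious, such a unit can be as short as this (the service name, image, and paths are invented for illustration):

```ini
# /etc/systemd/system/myapp.service
[Unit]
Description=myapp container
After=docker.service
Requires=docker.service

[Service]
# Remove any stale container, run a fresh one, stop it cleanly on shutdown.
ExecStartPre=-/usr/bin/docker rm -f myapp
ExecStart=/usr/bin/docker run --rm --name myapp \
    -v /srv/myapp/data:/data myapp:1.0
ExecStop=/usr/bin/docker stop myapp
Restart=always

[Install]
WantedBy=multi-user.target
```

`systemctl enable --now myapp` and systemd handles restarts and boot-time startup; the container image carries everything else.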

I've found it just pushes the complexity elsewhere or opens up (or silences) performance or security problems you wouldn't have had if you'd stuck to the old fashion way of doing things.

Keep checklists and script what you can. I find it helpful to follow edge whenever I can so if I hit a snag the developers that made the change still have that change fresh in their mind. It really doesn't take that much time to keep things updated if you don't go hog-wild on different libraries and if your stuff is reasonably well tested.

As an aside, I've been thinking that there should be a stack that is designed and built for the sole purpose of staying stable over decades. Something with a bunch of stripped down technology. As robust and stable as possible. Only allow security updates. Only allow certain character sets. Something built on a language that is just stupid simple and secure. A Swift or Rust subset maybe? Lua? Lisp without the macro insanity?

If the future needs some sort of tech that we didn't anticipate (say, something to handle quantum computers breaking cryptography) then the stack should be setup in such a way to decouple the varying layers with minimal work.

I liken it to building codes. We should have pre-setup combinations of technologies that are stable, simple, and combinable. Sure, go outside them for skyscrapers, but for the day-to-day buildings things are getting too complicated.

How do you use checklists in your workflows - are they part of your repository alongside the code, in some documentation system, printed out?

I'm most of the way through the checklist manifesto and I'd love some insight on how software engineers incorporate them into their work.

Funny. After keeping them in files and emails and docs for years I finally decided to systematize it by writing a CLI that I plan to open source one day. If you want, send me an email and we can talk about it further.

I'm curious how docker will help with the "5 years on" problem. I'd be willing to place money saying your docker setup for this week will have trouble running "as is" next month. Especially true for the vast majority of one-offs out there.

For me it separates application environment issues from the machine issues. As an example, I have a daemon that runs my ambient home lighting: https://github.com/wpietri/sunrise

It has been running in a Docker container for a little over 4 years. Moving that from one machine to another was trivial. I didn't have to worry about language runtime or libraries or config files tucked away somewhere in /etc. I just told Kubernetes where to pull the image from and away it went.

That still leaves me with various problems building the app, as I needed to do when I made some configuration changes. But even there Docker was some help. The addition of multi-stage builds [1] means I can describe the build environment and the run environment in one file, giving me an easy way to have a repeatable build process.
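A minimal multi-stage Dockerfile looks roughly like this (a Go example for concreteness; the comment doesn't say what language the daemon is actually written in):

```dockerfile
# Stage 1: build environment with the full toolchain.
FROM golang:1.12 AS build
WORKDIR /src
COPY . .
RUN go build -o /app .

# Stage 2: run environment. Only the compiled artifact is copied in,
# so the final image carries no compilers or build dependencies.
FROM debian:stretch-slim
COPY --from=build /app /usr/local/bin/app
CMD ["app"]
```

One `docker build` produces the final image; the build stage is thrown away.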

Over time, my goal is set up all my home services similarly, so that when I decide to replace my current home server, there won't be a multi-day festival of "what the hell did I do in 2004 to get Plex working?" I'll bring up the new one, add it as a Kubernetes worker, and then kill the old server. I'm hoping it will also make me braver about upgrading server OS versions, as right now I'm pretty slack about that at home.

[1] https://docs.docker.com/v17.09/engine/userguide/eng-image/mu...

Docker-compose is a pretty straightforward infrastructure-as-code tool for casual servers and local dev. Basically you have a YAML description of your server that you can commit and comment on and pin Docker image versions to, and so on. This includes persistence (volumes), networking, dependency management (bringing up the services in the right order), health checking, configuration, environment management, as well as managing the actual services. The only critical thing it's missing is a secrets management solution.
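As a sketch, a compose file for a typical two-service setup covers most of that list in a dozen lines (the image tags, ports, and names here are invented):

```yaml
version: "3"
services:
  web:
    image: registry.example.com/myapp:1.4.2   # pinned image version
    ports:
      - "80:8000"
    environment:
      DATABASE_URL: postgres://db/myapp       # configuration
    depends_on:
      - db                                    # brought up first
  db:
    image: postgres:9.6
    volumes:
      - pgdata:/var/lib/postgresql/data       # persistence
volumes:
  pgdata:
```

`docker-compose up -d` brings the whole thing up; the file itself lives in version control next to the code.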

Even on a single-node, I think Docker swarm mode is a better choice than docker-compose. Docker swarm mode is integrated in Docker. You just run `docker swarm init` to enable it. It gives you everything docker-compose provides, plus configuration and secret management, and zero-downtime deployments (docker stack deploy).

This is already happening. People are using "containers" the way that they used AMIs, which is the way that they used VMs: as a black-box execution environment that magically abstracts all problems. Until something breaks. As soon as you have to upgrade or fix anything inside the container, you're back to the same old set of challenges, which remain unsolved.

But this isn't really why people are using containers. Probably the biggest reason that folks are stuck with this complexity (outside of big server farms) is because containers are being foisted upon them by companies who want to sell software, but don't want to solve deployment issues. Why plan for execution in an uncertain environment when you can just require Docker?

I use Docker for my single server because it (and its ecosystem) offers a straightforward path for:

1. Deployment

2. Distribution (I don't have to build a package for every platform)

3. Supervision

4. Standard logging

5. Configuration management

6. Infrastructure as code

7. Process isolation (not perfect, but I can get some reasonable protection without managing VMs or figuring out how to roll my own isolation and permissioning)

8. Networking

Basically I don't have to be a professional sysadmin but a "mere" engineer (yes, yes, in a perfect world I would have time to learn everything "properly", but for all its faults, Docker lets me build something useful Right Now).

EDIT: For downvoters, I'd really appreciate more elaborate feedback.

Sorry, no idea. Seems like a decent summary of why docker would be interesting to a dev instead of a sysadmin.

I believe those chef and puppet scripts won't help you resurrect a project from 5 years ago. You'll practically have to rewrite all the scripts to get it up and running in whatever new hotness is around in 2023. Package names will have changed, new config systems will be invented, and previous workarounds for bugs will now cause bugs.

You can put everything in containers and still not need much orchestration, though. My personal projects run in dozens of containers, and the "orchestration" consists of a Makefile include pulled into each project that creates a systemd service file based on some variables, and pushes it into the right place. The service files will pull down and set up a suitable Docker container. The full setup for a couple of dozen containers is 40-50 lines of makefile and a ~20 lines or so of a template service file.
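The shape of that setup is roughly the following (the variable and file names are guesses at the structure described, not the poster's actual include):

```make
# docker-service.mk -- included by each project's Makefile
SERVICE ?= $(notdir $(CURDIR))
IMAGE   ?= registry.example.com/$(SERVICE):latest

.PHONY: install-service
install-service:
	sed -e 's|@IMAGE@|$(IMAGE)|g' -e 's|@NAME@|$(SERVICE)|g' \
	    service.template > /etc/systemd/system/$(SERVICE).service
	systemctl daemon-reload
	systemctl enable --now $(SERVICE)
```

Each project sets a couple of variables and gets a systemd-managed container out of `make install-service`; systemd, not a scheduler, does the "orchestration".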

Of course it won't scale to massive projects, and for work, I occasionally use Kubernetes and other more "serious" orchestration alternatives, but frankly it takes fairly big projects before they start paying for themselves in complexity.

Meanwhile, my docker containers have kept chugging along without needing any maintenance aside from auto-updating security updates for several years.

I do agree with you that Kubernetes may encourage patterns that are useful, though. But really the most essential part is that you can find relatively inexperienced devops people who have picked up some Kubernetes skills. That availability makes up for a lot of pain vs. finding someone experienced enough to wire up a much leaner setup.

When you deploy a new version of a container, how do you avoid downtime? Do you start a new container running the new version, wait for the new container to be ready, switch traffic to the new container, stop traffic to the old container and drain connections, and then stop the old container?

For my home projects it doesn't matter. For work projects, yes. It's an easy thing to automate. Incidentally most of the pain in this is that most load balancers are reverse of what makes most sense: the app servers ought to connect to the load balancer and tell it when it can service more requests, not get things pushed at it.

> most load balancers are reverse of what makes most sense: the app servers ought to connect to the load balancer and tell it when it can service more requests, not get things pushed at it

Reminds me of Mongrel2

Yes, Mongrel2 is an interesting design. So many things get simpler when you invert that relationship.

> Of course it won't scale to massive projects

Most projects aren't massive -- at work, 2 years on, we're still using a single instance of a single node, with the only component that needs to be reliable stored as static files in S3.

Absolutely. Which is one of the reasons I find things like Kubernetes overkill for most setups.

> I've been building a home Kubernetes cluster to check out the new hotness

I tried to do this for the same reason, but all of the writeups seem to stop at "getting a cluster running", but that's not enough to actually run apps since you need a load balancer / ingress, dns, and probably a number of other things (ultimately I was overwhelmed by the number of things I needed but didn't completely understand). I haven't had any luck finding a soup-to-nuts writeup, so if you have any recommendations, I'd love to hear them.

I've heard good things from Kelsey Hightower's https://github.com/kelseyhightower/kubernetes-the-hard-way

Will read! Thanks for the recommendation!

> The problem for me with the old big-server-many-apps approach is the way it becomes hard to manage. 5 years on, I know that I did a bunch of things for a bunch of reasons, but I don't really remember what or why.

I thought this was a solved problem.

I use SaltStack for config management & orchestration on my own machines. (I suggest any config management tool becomes 'worth the effort' once you're managing more than a handful of machines, and/or want to rebuild or spin up new machines with minimal effort and surprises more than once a year.)

Why I do something is described in a comment in the yaml that does the something.

For more nuanced situations, I'll document it in my wiki. (With a yaml comment pointing to same -- I am extremely pessimistic about future-Jedd's ability to remember things.)

If you're running a big-server-many-apps or many-servers-with-their-own-apps, I'd expect the same approach to work equally well.

Though the whole idea of virtual servers & config management (but not necessarily docker or k8s) is that you don't have a bunch of disparate and potentially conflicting apps with potentially conflicting dependencies on a single server.

> But "virtual server" is like "horseless carriage". The term itself indicates that some transition is happening, but that we don't really understand it yet.

That's a challenging assertion to fully unpack. IT's undeniably in a constant state of transition -- and not always thoughtfully directed -- but the problem isn't _'virtual server'_.

The general trend is obviously towards isolation -- but the tooling, performance, scaling, design, and security disparities make the arguments around what level you try to implement your isolation so interesting.

I agree that "virtual server" isn't a problem. Neither was "horseless carriage" or "radio with pictures". All of them were steps forward. But they're transitional states on the way to new paradigms.

When servers were expensive things that had full-time staff, the old ways of installing software made a lot of sense. But as server power got cheaper, they became impractical. A virtual server was at least familiar; slicing big machines up let us turn the clock back to when servers were less powerful. But that didn't really solve the problem, as we now had to do something to manage the explosion in the number of servers, real and virtual. Things like chef and puppet jumped in to solve this, but they are IMHO clumsy; it's all the work of managing a lot of servers the old way, even though it may be a small number of physical boxes.

Containerization says: Forget about installing apps on servers; just wrap the app up with what it needs. Things like Kubernetes take that further, saying: Don't worry about which apps are running on which servers; it'll just work. The impedance mismatch between modern hardware and the 1970s-university-department paradigm that underlies Unix gets solved automatically.

I'm not sure if that's the end state in the paradigm shift. But I'm convinced that the approach to sysadminning I learned in the 1980s is on its way out.

As someone who runs a very successful data business on a simple stack (php, cron, redis, mariadb), I definitely agree. We've avoided the latest trends/tools and just keep humming along while outperforming and outdelivering our competitors.

We're also revenue-funded, so there's no outside VC pushing us to be flashy, but I will definitely admit it makes hiring difficult. Candidates see our stack as boring and bland, which we make up for in comp, but for a lot of people that's not enough.

If you want to run a reliable, simple, and profitable business, keep your tech stack equally simple. If you want to appeal to VCs and have an easy time recruiting, you need to be cargo cult, even if it's not technically necessary.

> Candidates see our stack as boring and bland

I would say that there are probably a lot of developers who would be very happy to work on a non-buzzword stack, but the problem is that as a developer, it's extremely hard to know if your tech stack is the result of directed efforts to keep it simple, or if it's the haphazard result of some guy setting it up ten years ago when these technologies were hot.

I would be happy to work on a stack like that, but I can't deny that it seems somewhat career-limiting long term, especially as I am over 40 now. I will be seen as not keeping up to date.

(I certainly think I design and build better software than most people, having done a few years of maintenance programming recently; people just create overcomplex monstrosities for what should be simple apps.)

There is nothing "cargo cult" about realizing that PHP is just way more difficult to work with and maintain in any large or long-term project than basically any of the more modern "culty" languages, especially the functional ones, which focus on determinism/reliability/transparency. PHP, last I heard, has flaky tests in its very own test suite, and does the same "complexity hiding" (read: brushing tech debt under the rug) that every OOP language with an ORM does.

That said, good on you for running a successful business well using tried-and-true tech, can't knock that!

Backwards compatibility.

It makes the language a bit of a mishmash between things that were popular 10 years ago, and whatever the new hotness is. And errors once made, will never really leave the language.

But the app that I built in php5 10 years ago is still running on 7.2 with a one character change.

Modern PHP's quite a bit better than the utter mess that was 4.0 or even the half-ugly 5.0. They've deprecated the worst of the misfeatures, especially by default. Now if only they'd adopted the HHVM/Hack Collections instead of the terrible arrays...

Hear hear. I lead a team that builds and manages critical emergency services infrastructure.

Our stack is pretty boring, but then it has to be running 99.999% of the time. Rather than wasting time chasing the latest flavour-of-the-month tool or framework we invest our time in plugging any kind of gap that could ever bring our service down.

We don't need people who are only looking to pad their resume with the hottest buzzwords, we look for people who want to make critical services run all that time, that rarely fail and when they do, they handle failure gracefully.

The number of devops/agile/SaaS style shops I have seen where the product falls over for several hours at a time is astounding, and it can often be attributed to rushed/inexperienced engineering or unnecessary complexity.

Lucky for them it's usually just the shareholders bottom line that is affected. If the services my team provides don't work, ambulances and fire engines are either not arriving fast enough or at all.

Good on you mate. I love this approach.

I'll be doing the same thing my self with a few products I'm developing with my brother.

"Keep it simple, keep it stable" is what I like to say.

I work in the same environment, and hiring is the only downside I see. Resume Driven Development and "Dev Sexy" has made it difficult to find developers who are willing to come on board, despite the sanity provided by simplicity & comp.

It will now, but how many people are going to be interested in my LAMP experience 5 or 10 years down the line? While everyone else has been working with the cloud/kubernetes/aws/gcp/serverless technologies.

I have never seen a VC care about what software stack you use. I did have one particularly geeky one asking me for advice on whether he should invest in MariaDB.

Your point about recruiting is spot-on, however. It's not that all candidates necessarily believe in the cargo cult, but they have their own career and employability to consider.

To me the biggest red flag there is the php. After developing with typed languages, a dynamic language is honestly a pain. Cron is easy to replace if needed.

Like, I would feel much better if it was Python, Golang, Java, or C#. JavaScript, I feel, is the new PHP. Another issue is what I call the COBOL syndrome, where your career future isn't as great. You can still be a shop with a relatively good career-future tech set that is old, but it has to be the 'right' old things, unfortunately.

Do you at least use something like Hack to add types?

PHP has had optionally typed function parameters and return values since the late 5.x releases I think (current release is 7.3). Types are checked at runtime and throw TypeErrors if the declared types are violated. They can also be checked ahead of time by IDEs with code inspection such as PhpStorm.
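A quick sketch of what that looks like in practice (PHP 7 syntax; the function itself is invented for illustration):

```php
<?php
declare(strict_types=1);

function total(int $qty, float $price): float {
    return $qty * $price;
}

echo total(3, 9.99);       // 29.97
// total("three", 9.99);   // would throw a TypeError at runtime
```

With `strict_types=1`, PHP refuses the mistyped call outright rather than coercing the argument.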

The 7.4 release is adding typed class properties as well[1].

I maintain a ~75k LOC PHP codebase (using the Laravel framework), and we have almost never encountered type issues. The new style of PHP (and Javascript is heading in a similar direction tbh) is to write it almost like it's Java, but with the option to fall back to dynamic "magic" as needed. If you utilize the dynamic elements sparingly, and follow widely understood conventions, the productivity-vs.-reliability tradeoff is highly favorable for many applications compared to languages like Java and C#.

P.S. I like Python, but I would argue PHP actually has a better story to tell about types these days. "While Python 3.6 gives you this syntax for declaring types, there’s absolutely nothing in Python itself yet that does anything with these type declarations..." [2]. Declaring types in a dynamic language only to have them ignored at runtime does not inspire much confidence.

1. https://laravel-news.com/php7-typed-properties

2. https://medium.com/@ageitgey/learn-how-to-use-static-type-ch...

> Declaring types in a dynamic language only to have them ignored at runtime does not inspire much confidence.

Funny, I see it the other way; declaring types in a dynamic language yet only having them checked at runtime does not inspire much confidence. With mypy, you actually get static checking, so you're not dependent on your tests hitting the bug.
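To make the contrast concrete, here's a minimal sketch (the function and values are invented): CPython executes the bad call without complaint, while a static checker like mypy would flag it before the code ever runs.

```python
def add_tax(price: float, rate: float) -> float:
    # The annotations promise floats, but CPython never enforces them.
    return price * (1 + rate)

print(add_tax(100.0, 0.25))  # 125.0, as intended
print(add_tax("10", 2))      # mypy flags this call; at runtime it "works":
                             # str * int repeats the string, printing 101010
```

Running `mypy` over this file reports the `str`/`float` mismatch statically, so you're not dependent on a test happening to exercise the bad path.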

As an experienced developer I’m finding it harder to find companies who use “boring”, simple and stable solutions. Any chance you’re hiring remotely?

I really think that running a LAMP server for the average beginning developer these days would be just as complicated, maybe more complicated, than running a single deployment on Google Kubernetes Engine. You have to know about package managers and init systems and apache/nginx config files and keep track of security updates for your stack and rotate the logs so the hard drive doesn't fill up. If you already know how to do this stuff in your sleep because you've done it for years, then yeah, don't fix what isn't broken. But if you're starting with no background, there's nothing inherently wrong with using a more advanced tool if that tool has good resources to get you started easily.

Just because there's more complexity in the entirety of the stack when running an orchestration system doesn't necessarily mean more complexity for the end user.

Side note - couldn't you make a similar argument about any kind of further abstraction? "Question for all you hackers out there - do you really need HTTP requests with their complicated headers and status codes and keepalive timeouts? I run several apps just sending plain text over TCP sockets and it works fine."

>Side note - couldn't you make a similar argument about any kind of further abstraction? "Question for all you hackers out there - do you really need HTTP requests with their complicated headers and status codes and keepalive timeouts? I run several apps just sending plain text over TCP sockets and it works fine."

No, because the end users already have an HTTP browser. Your analogy doesn't work because switching between k8s and LAMP stacks is invisible to your users, whereas dumping HTTP means you need dedicated clients.

maybe the end user is someone sending curl requests?

I think from the perspective of getting things deployed, you're probably right. Kubernetes really shines there.

I still think that a LAMP server is better for an average beginning developer because troubleshooting is significantly easier on a stock install. Stock installs of Kubernetes give you very few troubleshooting tools. I've had issues where CoreDNS stops responding, so some pods basically don't have DNS, and even figuring out which pod that traffic was going to is a nightmare.

My devs frequently struggle with things that I would consider basic in Kubernetes, despite having worked with it for a year or so. Things like creating an ingress, a service and a deployment that all work together are still a struggle, and Kubernetes isn't very helpful when those things don't play nicely together. Just today I had to work through an issue with someone who had created an ingress and service correctly but forgot to declare a containerPort, which caused the service to decide there were no valid backends and route all the traffic to the default backend.
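For reference, a minimal sketch of how two of those pieces wire together (all names invented). The failure mode described above comes from the Service's targetPort: when it's a named port and no matching containerPort exists on the pods, the endpoints list comes up empty and the ingress silently falls back to the default backend.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata: {name: demo}
spec:
  selector: {matchLabels: {app: demo}}
  template:
    metadata: {labels: {app: demo}}   # must match the Service selector below
    spec:
      containers:
      - name: web
        image: nginx:1.17
        ports:
        - name: http
          containerPort: 80           # omit this and "targetPort: http" resolves to nothing
---
apiVersion: v1
kind: Service
metadata: {name: demo}
spec:
  selector: {app: demo}
  ports:
  - port: 80
    targetPort: http                  # looked up by name against the containerPort
```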

It's probably mostly the network, but the Kubernetes overlay network can make simple troubleshooting very difficult.

I would say that it's still not a fully mature technology. As gaps become known new things emerge to plug those gaps. Istio springs to mind here in terms of making the networking and monitoring side of things easier.

The trend I see is that smaller and smaller teams are becoming capable of bigger and bigger things, and when it comes to smaller apps that don't need it, we're trading complexity known to only a handful of people or a single individual (Bob's idempotent script for distro $X that only he knows the intricacies of) for complexity familiar to a large group of people (Kubernetes).

I consider it fairly remarkable that a single developer today can accomplish, in terms of building and operating a system, what it would have taken a large team of specialists to do even 5-6 years ago, let alone 10 years ago.

Now, the OP's point is "you probably don't need it," and sure, maybe you don't. But I would say watch how it shifts the economics of software development in the broader sense, especially over the next few years as the technologies mature.

It would be nice if kube told me why my containers crash in the event log. Right now it just shows ‘crashed’.

>I really think that running a LAMP server for the average beginning developer these days would be just as complicated, maybe more complicated, than running a single deployment on Google Kubernetes Engine.

Back when I first started doing web dev I went from knowing nothing about server setups or Unix (i.e. running off managed hosting) to a reasonably secure FreeBSD server with a working content management system in 3 days. This included installing the OS. The same FAMP setup (with modifications and updates, obviously) continued to work perfectly fine for the next decade.

Re-edit: In the original version of the post I drew a parallel with several teams at my previous job "figuring out" AWS Lambda for several months, stumbling over gotchas, multiple ways of doing everything, and a myriad of conflicting tools. Since there is a reply to that statement, I guess I will add this note.

I believe you believed that :-) But realistically, after over a decade of doing effectively DevOps, I'm still learning about mistakes I made before. In 3 days you may get something running and learn the basics, but likely it's a false sense of security...

Containers are a mechanism to run your old machine[s], but with a reproducible setup script. A machine packed in a container happens to also run on your dev/CI environments. There isn't much logical difference between a physical machine, a VM and a container [0].

Serverless offers a large surface of APIs, some of them proprietary, tangled in an ever evolving dependency hell.

Historical note: Google Cloud started serverless with AppEngine, then focused on GCE [and later GKE] _because_ serverless was hard and AWS was eating their lunch with VMs.

[0] For example, we can argue about security isolation issues in containers vs VMs. Eventually this will become a moot point as technology advances far enough that we can run each container inside a hardware-backed VM.

>Containers are a mechanism to run your old machine[s], but with a reproducible setup script.

Well, exactly. There is not much to them, conceptually. So why does orchestration have to be so complicated? https://www.influxdata.com/blog/will-kubernetes-collapse-und...

Also, serverless should be simpler. But it's not, like you said. That's my point. There is way too much accidental complexity bundled with these technologies.

Containers are conceptually simple. Products like Kubernetes, Docker, etc which need to be sold or have a for-profit motive, on the other hand, need to be complex so the supporting company can sell software and support contracts.

The two purposes are directly opposed to each other.

> need to be complex so the supporting company can sell software and support contracts.

I think it's more likely due to the need to solve everybody in the world's use case. Yours is just a convenient (for some) side-effect.

It's not accidental complexity, it's economic complexity: the complexity required for the service to have had the extra performance and features relative to its predecessor that led to its mass adoption.

Let me give just one example, of the replacement of VMs with containers:

VMs have static allocations of CPU/memory/disk/etc. Therefore, you don't need to ask "where" you're running a VM: the VM is some size, so it finds a free slot of that size on a hypervisor cluster and stays there. Simple!

Containers are like VMs if VMs were only the size (in CPU, memory, disk-space, etc.) that they were actively using. Which means you can potentially pack lots of containers—i.e. heterogeneous workloads—onto one container-hypervisor. So you "need" to introduce rules about how to do so, if you want to take advantage of that.

And, because a container sees a filesystem (where a VM just sees a block device), you "need" to give containers rules about how to share a hypervisor filesystem. So you "need" a concept of volume mounts, rather than just a concept of disk targets on a SAN, if you want to take advantage of that.
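As a sketch, those rules surface directly in the pod spec (everything here is illustrative): resource requests drive the bin-packing decision, and volume mounts are how a container shares the node's filesystem.

```yaml
spec:
  containers:
  - name: app
    image: example/app:1.0
    resources:
      requests: {cpu: 100m, memory: 128Mi}  # what the scheduler packs on
      limits:   {cpu: 500m, memory: 256Mi}  # hard ceiling for the workload
    volumeMounts:
    - name: data
      mountPath: /var/lib/app               # a filesystem path, not a block device
  volumes:
  - name: data
    persistentVolumeClaim: {claimName: app-data}
```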

These "needs" aren't really needs, if you're okay with your container having the exact same CPU/memory/disk overhead that a VM would. That's what services like Elastic Beanstalk get you: the ability to forget about those details.

But in return, with such services, you only get one container running per VM. So you don't gain any enterprise-y advantages on axes like "marginal per-workload cost-savings" or "time to complete rolling upgrade" versus just using VMs. And so—if everyone did things this way—nobody would have ever switched from VMs to containers.

The fact that containers do have widespread adoption implies that container advocates managed to convince ops people to do something in a new and more complex way, and that this new complexity led to advantages for the people who adopted it.

I call this complexity "economic", because it's the result of a https://en.wikipedia.org/wiki/Race_to_the_bottom (of adding complexity to squeeze out higher efficiency at scale) which costs more and more in dev-time per marginal gain in performance, but where we can't opt out, because then we're outcompeted (in terms of having lower costs) by platforms that are willing to go all-in on the increased complexity.

Yeah! If you just encode your plain text as JSON, and then add something like the resource you are operating on, and the action you want to take, that’s much easier than the complicated HTTP.

I tried learning web dev with a LAMP stack and it was frustrating. I hate learning systems that try to do everything for you because you get way too many boxes with question marks over them in your understanding.

It's different after you know how things work, but to start with I really appreciated nc and Node HTTP servers.

I realize there is a need for multi-server applications with automated deployment and scaling. However, the accidental complexity of serverless setups and container orchestration tools is just off the charts. When reading these articles I get roughly the same feeling I got when reading J2EE articles back when J2EE was "the future" and "the only way to build scalable infrastructure".

People say that serverless keeps things simpler. In reality it's just moving the complexity to different areas (DevOps as opposed to application complexity). We have been writing applications for longer, so it should be easier to keep the complexity lower if you keep the logic there. Are there well-known patterns / best practices / ways of structuring DevOps, similar to those that have been developed for applications over the years?

It's not just about scaling. That seems to be the only thing people talk about because it sounds sexy but the reality is about operations.

Kubernetes makes deployments, rolling upgrades, monitoring, load balancing, logging, restarts, and other ops very easy. It can be as simple as a 1-line command to run a container or several YAML files to run complex applications and even databases. Once you become familiar with the options, you tend to think of running all software that way and it becomes really easy to deploy and test anything.

So yes, for personal projects a single server with SSH/Docker is fine, but any business can save time and IT overhead with Kubernetes. Considering how easy the clouds have made it to spin up a cluster, it's a great trade-off for most companies.

Exactly. It solves some of the most important problems that come up when working with microservice-based architectures, and establishes mature patterns around the ability for multiple developer teams to update and scale each piece of a distributed application.

Also, what do you do when your Digital Ocean droplets need to host hundreds or thousands of customers? Maybe each customer needs its own database (in my case they do), configuration, storage requirements, and multi-node requirements. How do you keep track of all that, how do you automate it and QUICKLY recover in failure scenarios? You need to be able to deal with failure on a node, or if there's a bad actor you need to be able to move them off easily without downtime or affecting other customers, automatically, seamlessly. You need to be able to see an overview of your resources across all nodes and where apps are placed, and have something decide whether the hardware your new container is being added to can handle another JVM or whatever. For cost effectiveness, you want to be able to overcommit resources, so you want containers. You want those to translate to other platforms: AWS, Google, Azure, on-prem. You have a single declarative language that works anywhere you can deploy a k8s cluster. You need to deal with growth and good patterns for rolling back and updating versions of parts of the stack. You want all of your deployments to be declarative, to be able to tightly control the options for each one, and to get back to where you were.

I agree that it doesn't make sense for everything, and it requires a fundamental understanding of Linux and software before it even makes sense to try shoehorning it on, but it solves real-world problems for many people; it's not just a hype thing. I would say Docker itself was more of a hype thing than k8s; the maturity and features of k8s and the other orchestration systems that came out of the Docker model are there for a reason, because they solve all of the real-world problems people couldn't solve with vanilla Docker without tons of custom scripting and hacky workarounds. Docker solved the big problem by providing isolated environments for each app and splitting things out into microservices that way, without having to commit a full statically resourced VM or bare metal per service. K8s solves all of the other problems that came out of that (pods, stateful sets, init containers, jobs, cronjobs, service definitions, deployments, volume claims).


FreeBSD, Apache, Python - the FAP stack

Ok. I'll see myself out.

I laughed ;-)

Me too, surprised this hasn't been down-voted to hell on here.

Stop thinking of Kubernetes as an easy way to scale ops for a single app, and start thinking of Kubernetes as an easy way to scale ops for non-trivial amount n apps.

If you're a startup with a monolith then sure, you probably don't need Kubernetes. If you're not using Heroku/GAE/etc. then you generate a machine image from your app, deploy it behind a load balancer (start with two servers), and use some managed database for the backend. That's pretty simple. You can scale development without scaling the size of your ops team (1-2 people, only need two if you're trying to avoid bus factor 1), at least until you need to outscale a monolith.

If you need to run a bunch of applications, made by a bunch of different teams (let alone when they don't work for you - i.e. an off-the-shelf product from a vendor), then using a managed Kubernetes provider makes this relatively simple without needing more people. If you try to do that without containers and orchestration, and want to keep a rapid pace of deployment, and not hire tons more people, you will go crazy.

The reliability and performance story for Hacker News is not great, and that's despite the fact that its design has lots of simplifying assumptions. I wouldn't call HN a success story for the "just drop it on a server" approach.

Of course, HN is a kind of art project, and its scaling and performance goals are not typical of most applications.

I think you're right - at least 90% of servers on the web would be fine with a couple of instances at most backed by a decent db. It can get more complex depending on your resilience requirements but it really doesn't have to be.

I guess I run a CPG stack - CoreOS, PostgreSQL, Go. Don't bother with containers as Go produces one binary which can be run under systemd. It is far simpler than Kubernetes, and the only real reason for other servers is redundancy. The only bit of complexity is that I usually run the db servers as a separate instance or use a managed service. You can go a long way with very boring tech. I've run a little HN clone written in Go on one $5 Digital Ocean droplet for years - it handles moderate traffic spikes with little effort.
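A sketch of that setup (unit name and paths are invented): a single static Go binary under systemd needs only a few lines to get supervised restarts and journald logging.

```ini
# /etc/systemd/system/myapp.service
[Unit]
Description=My Go app
After=network.target

[Service]
# Restart=always gives crude but effective self-healing;
# stdout/stderr go to journald automatically.
ExecStart=/usr/local/bin/myapp
Restart=always
User=myapp
Environment=PORT=8080

[Install]
WantedBy=multi-user.target
```

Then `systemctl enable --now myapp` and `journalctl -u myapp` cover deployment and logs.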

I think of it this way: 99% of apps are developed by developers who are not in the top 1%. Cheap access to computing power has led to a growth of developers beyond the highly skilled ones who can squeeze everything out of a less powerful computer. I'd like to believe we are in an Electron phase of development, where we just want to ship as much as possible, as easily as possible, without worrying about hiring great talent (and yeah, I hate that it's inefficient in terms of memory usage). This has led to the explosion of frameworks that do a lot of things easily but require such complex DevOps pipelines.

I personally use Docker combined with a $5 droplet on Digital Ocean. This makes it easy to spin up multiple applications and sites without worrying about conflicting dependencies, and docker-compose gives me most of the benefits of orchestration tools (e.g. Kubernetes) that actually matter for my small scale usage.

Also, Traefik makes a nice load balancer for this usage.

> docker-compose gives me most of the benefits of orchestration

I feel this is a very unappreciated feature of docker-compose. I've gotten pretty far with setting restart: always, baking a machine image, using cloud scaling and load balancers.
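A hypothetical docker-compose.yml along those lines (image names and domain are placeholders): restart policies plus Traefik's Docker provider cover a surprising share of what an orchestrator gives you at this scale.

```yaml
version: "3.7"
services:
  proxy:
    image: traefik:v2.0
    command: ["--providers.docker=true", "--entrypoints.web.address=:80"]
    restart: always
    ports: ["80:80"]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro  # Traefik watches containers
  web:
    image: example/app:latest
    restart: always   # supervisor-style restarts, no orchestrator needed
    labels:
      - "traefik.http.routers.web.rule=Host(`example.com`)"
```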

Interesting! When you rollout a new version, how do you coordinate docker-compose and Traefik to avoid any user-facing downtime (no 502)?

The simplest answer is I am the only user currently so it doesn't matter. Even with users, a short down time to bring down the old images and bring up the new images should be manageable. If I ever got to a point where that wasn't sufficient then I would consider that project successful enough to merit investment into zero downtime strategies and probably its own VPS as well.

From a more technical side, I haven't seen an issue with Traefik when I take a container down and back up again. The only delay I would anticipate is when you bring a new service online for the first time and Traefik detects it and configures + fetches LetsEncrypt certs for it.


For reference, this runs on a $5 AWS instance:


The database is $600 per month, but that data runs five different websites (and it's a few hundred GB of data).

EDIT: for those mentioning the 502 gateway error, it does auto-scale - now it's costing more per month, at least temporarily.

>502 Bad Gateway

maybe a $5 instance isn't enough

Not a good example as it is currently 502'ing: https://i.imgur.com/sU8Zn5v.png

$5 AWS instance? Aren't they all substantially more than that?

Does hacker news really run on one server? What if server goes down?

I've always thought high availability was the more important reason for multiple servers, rather than performance.

Even if you have only two paying customers, they are probably paying for the right to hit your website / service 24/7.

What if hacker news goes down, more work gets done for the day?

It’s really common to overestimate the cost of being down, while underestimating the costs of resiliency. And in the end, if you can’t fully afford the resiliency, you end up with a system that’s more complicated and thus less stable than it could have been had you just accepted a tiny bit of risk.

extremely valid point. The initial setup isn't the challenge, it's the endless tweaks to make it fault tolerant. For anyone looking into k8s, take time to research best practices for monitoring and readiness probes.

> What if server goes down?

Productivity of Y Combinator startups goes up. So YC benefits either way.

An interesting case study of an app at scale is Stack Overflow. Rather more than one server, but rather less than you might expect.

Edit: fixed url


Key point:

"The primary reason the utilization is so low is efficient code."

One application server, yes. HN is fronted by Cloudflare for CDN + DDoS protection, which of course is a lot more than one server.

That's why if you get a particularly long thread (1k+ comments), admins will beg people to log out so that the responses can be served from the CDN cache.

Example: American 2016 presidential election https://news.ycombinator.com/item?id=12909752 (1,700 comments)

Not Cloudflare anymore. Traffic goes to somewhere in San Diego from Europe, which it wouldn't if it was Cloudflare (different IP range too)

HN hasn't been fronted by Cloudflare since July.

Hm, interesting. I noticed HN started responding to HEAD requests with 405s recently; perhaps that is the cause.

Why did they stop using Cloudflare?

It was part of their networking rework. Presumably to improve stability of the website. I've also been told that it was a big step towards enabling IPv6 for HN.

There are other parts of Y Combinator that still use Cloudflare though.

Well, there's another important question: What percentage of all services need high availability? Stetson-Harrison method shows that it's less than 5%.

Need is determined by the customer. You might be able to explain to them that they don't need it on their way to a competitor. It may not be fair or rational, but redundancy and its peace of mind have real benefits in most systems relative to cost.

Has Hacker News ever gone down? I can't remember it ever going down, but I assume it has to have at some point.

I've seen it happen at least once in the few years that I've been on Hacker News, see https://news.ycombinator.com/item?id=17228704

I got a "server can't respond to your request" type of message this morning. OK, the server wasn't down, but it was loaded enough that it couldn't serve me. A page refresh and it was working.

Here's the official status twitter: https://twitter.com/hnstatus

You probably don't need kubernetes.

Let's be fair, it offers:

> Orchestration of block storage

> Resource Management

> Rolling Deploys

> Cloud provider agnostic APIs*

If you don't need any of these things, and your stack fits on a single server or two, and you aren't already familiar with it, I'm not sure why you'd bother, other than out of interest.

That said, there's a world of companies that aren't FAANG, ub3r and Basecamp, and many of those paying reasonable sums of money have more complicated and resource-intensive requirements that don't fit on a single server.

Government departments, retail companies and banks all likely have a number of different software development projects where giving a number of developers API access to a platform that offers the above advantages is, in my opinion, a good thing. Once you get to FAANG level, who knows whether kube itself will actually help or hinder at that level.

* Personally I'd rather use the kube APIs than talking to any of the cloud providers directly. I imagine that's somewhat personal preference and somewhat because I've been able to easily run it in my basement.

*2 Namespaces also make creating more environments for CI/CD easier, so as soon as you have a team of developers and you want to do that sort of thing, it also makes sense. Not so much for a lone developer and his server.

There is also overhead in the form of instances to run and maintain the backend datastore and control plane components. You should already be at a certain scale before considering kubernetes.

This is free on GKE and I believe AKS, but of course if you're doing this yourself for some reason, you need to compare it with the alternatives.

Spot on, friend.

So recently I started writing a simple web application for my family. They send emails to each other with gift wish lists in them and we all have to juggle those emails around. I figured some products would exist already to solve this problem, but I wanted to make my own.

When it came time to make it I thought: "This has to be a REST API with a JS front end" and then further down the line, "Man I should use Flutter and only make it a mobile app!" I had other thoughts about making it Serverless and doing $thisCoolThing and using $thatNewTech. In the end nothing got done at all.

Fast forward to today and it's a monolith Go application that renders Bootstrap 4 templates server-side, serves some static CSS directly, sits on a single server (DigitalOcean) and uses a single PostgreSQL instance (on the same server). The Bootstrap 4 CSS and JS come from their CDN.

I made the technology simpler and the job got done. It's an MVP with basic database backups in place, using Docker to deploy the app. It just works.

Lessons for me from this:

* Server-side template rendering is perfectly fine and actually easier, frankly
* JS can still be used client-side to improve the experience without replacing the above or making the entire rendering process client-side
* Although Go compiles to a single static binary, I still need other assets, so it went into a Docker container for the added security benefits, not to mention portability
* Serverless is nice, but unless it's replaced the above day-to-day, there's always a steep learning curve around something you haven't done with it yet, but need
* Picking the latest and greatest tech tends to stagnate progress or halt it entirely, in most cases
* A software MVP needs an MVP infrastructure to go with it

Just my thoughts.

"I figured some products would exist already to solve this problem, but I wanted to make my own."

But why? You could've used so many different products. You could've even used Google Docs.

There's a bit more to the story than simply solving a problem. I have other plans for the software and we have ideas about how we want to move it forward.

One of the key features is being able to "tag" an item on someone's wish list as "bought" or "buying". This allows others viewing the list to know that item has been taken. But there's also a requirement that the original author of the list/item cannot see that it has been bought otherwise there's no element of surprise for them come time to open the gift(s) :-)

Spreadsheets don't enable that privacy/secrecy.

Still, I highly doubt that there are not any apps that do this already. I mean, my family uses Amazon wish lists and set it so that you can't see when stuff is purchased.

Like I said, it goes beyond just solving the problem. It's a learning exercise as well as a solution.

Have you never wondered what projects you can write to help you learn X or Y?

I'm interested in something like this - is the code publicly available?

> so it went into a Docker container for the added security benefits

Which are? Last I heard containers (with baked-in dependencies) were generally considered very bad for security because the dependencies never get updated.

Is there a short downtime when you deploy new versions of the Go application? Is the Go application directly exposed to the Internet?

I did a programming project for a job interview recently at a company called Willowtree that makes iOS and Android apps for other companies.

It was a pretty simple project, basically wrap a rest API around some JSON data provided to you.

I ended up deploying mine to Google Cloud Platform onto a VM running Ubuntu and Apache, and they seemed rather concerned that I took that approach instead of leveraging some kind of containerization or PaaS approach.

My API definitely had problems, as I don’t have much back end experience, but I found it strange that they would look down upon deploying to a cloud VM. It doesn’t seem like it was that long ago that a VM hosted on AWS or Digital Ocean was the latest and greatest and it seemed like a logical choice for something that would only ever be used by about five people.

You prob dodged a bullet then. We give out take-home exercises (not my idea but whatever), and we tell the candidate we don't care which config mgmt you use, just pick something you're comfortable with. We use TF+Ansible but we would never frown upon work done using Salt, Chef or CFEngine.

They do not, I run tens of low-traffic projects very successfully on a $10/mo Hetzner server on Dokku. Dokku is amazing and so is Hetzner, I don't know why people always go for the high-scalability, expensive options just to end up with 0 utilization.

Because my company is risk-averse, and perfectly happy to drop $1000 a month for a ridiculously overprovisioned database instance just to ensure an issue with the database will never cause their contracts to be lost.

If you have zero utilization, then you should scale down or lower your instance type until you are optimized for performance and cost, which is easier to do in an environment such as GCE or AWS.

There's a minimum when you have deployed each one of your hundred microservices to a server, though.

On a service that has been up for almost 20 years, same code base, thousands of daily users: the first server was constantly at 100% CPU. The second server averaged around 10% CPU with lots of spikes. The third server now averages below 1% CPU usage. Next time I need to upgrade I will probably get a "NUC", or a smartphone, or something even smaller. But it's not only CPUs that have gotten better. The first server also maxed out the bandwidth! Now, although with fewer users, the bandwidth usage is less than 1%. It started out on 0.5 Mbit DSL, and it's now on Gbit fiber.

> If it's high, is it because you are getting 100,000 requests per second, or is it the frameworks you cargo-culted in?

Mine's high because our business model involves blockchain stuff, and

1. blockchain nodes are CPU+memory+disk hogs;

2. ETL pipelines that feed historical data in from blockchains produce billions of events in their catch-up phase. (And we're constantly re-running the catch-up phase as we change parameters.)

Sadly, we need several fast servers even without any traffic :/

Initially I was skeptical as well. One server in a colocation will handle enough traffic until you can afford to hire all the people to make you web scale. But then I started playing with the various tools and seeing how people used them, and it totally changed my view.

The key point is that many of the new technologies in operations are about simplicity rather than speed. Standing up a stack in AWS can be flipped on and off like a light switch, and all the configuration steps can be much more easily automated/shared/updated/documented etc...

It's not about any of these technologies being more efficient; it is about spending more in order to abstract away many of the headaches that slow down development.

Certainly there are some people who are prematurely planning for a deluge of traffic and spending waayyy too many engineering resources on a 'web scale' stack, but that's not the majority.

This is a really interesting comment, thanks

I think for smaller use cases it's more about high availability than load balancing.

The load average on my kubernetes cluster is actually around 3-4 without it even doing anything.

There’s a bunch of apps running in there, but nothing that would justify the load.

It’s also generating roughly 20 log lines per second.

I’m really not sure what it’s doing...

> I manage dozens of apps for thousands of users. The apps are all on one server, its load average around 0.1.

If you're at this scale you can do whatever you want. Most of the stuff I've made has been with simple building blocks like you've described, maybe thrown in with some caches and a load balancer.

Although I've worked with other teams who really did have the high scale request flows that require you think about using a different architecture. Even so, K8s is not the end game and you can make something work even just extending the LAMP stack.

I think that kub and, generally, cloud providers have allowed for more ambitious projects to be generally accessible.

My side project is intended to handle > 1 billion events per day, with fairly low latency. That's well over 10k events per second.

I doubt I could do this easily on a single box, and I wouldn't really want to try. Why constrain myself that way? Is it worth just doing this the standard LAMP way?

More and more problems are available to be solved using commodity systems, so we have more and more people solving those problems with these new systems.

Depends on the box. Per core performance is probably your metric there if the app can utilize multiple cores.

Hacker News running on a single server sets a very bad precedent. I wish the people running the show would address it quickly, since it's being held up as an example.

When building a business you should take care to have an environment that is resilient. I agree it's not for everyone. But it's quite essential when you have a huge customer base and care about avoiding an unpleasant experience. If someone is running an important business and leaving it to chance, that's just pure arrogance or gross incompetence.

Not everyone uses K8s for webapps. You would be surprised at the level of enterprise penetration of K8s. Those enterprises do boring stuff like data warehousing etc.

One of the more interesting use cases I have read in recent memory: Chick-fil-A used it to set up a bare metal edge compute network between all its restaurants

The how: https://medium.com/@cfatechblog/bare-metal-k8s-clustering-at...

The why: https://medium.com/@cfatechblog/edge-computing-at-chick-fil-...

> Some day I would like a powwow with all you hackers about whether 99% of apps need more than [...]

Close. But I also need it HA w/ automatic failover, auto-SSL certs (meaning I might need to give my DNS provider creds depending upon LE approach), notifications on outages, easy viewing of logs past and present, easy metric viewing, automatic backups, and updates that are easy for me to sign off on to then run. I'll do the plugging on the app side (that is exporting metrics, logs, etc). And no vendor-specific solutions (even if they are repackaged common components like RDS is for Postgres), I should be able to run on a couple VMs on my laptop if I want and it be the exact same. And I may want to add an MQ/stream (e.g. Kafka), in-mem DB (e.g. Redis), etc later and still want log aggregation, metrics, backups, etc.

Really, that's not asking too much but it's definitely more than LAMP. We need a pithy name for this startup-in-a-box (again, that's NOT a PaaS, but a self-hosted management on an existing set of servers). Nobody wants to fumble w/ Ansible/Puppet/Salt/Chef/whatever all over the place or hire an ops guy, and people don't want to use vendor-specific solutions.

I agree with the "you only need this"...but we need just a bit more to handle outages and auditing.

It's convenient to be able to trivially create new production-like environments. Great for reproducing bugs or simulating deploys or running demos. My company's setup and scale doesn't necessitate kubernetes, but I still find it useful. It was fairly straightforward to set up.

>I manage dozens of apps for thousands of users. The apps are all on one server,

How are these backed up?

And how do they fail over when the server dies (at least the non-user-app part like their DBs)?

Yeah, you probably don't. And not only that, but it probably makes your life harder. I've interviewed for a tech lead position at a company working with freelancers, and I'm pretty sure the reason they ended up rejecting me was that I mentioned to the technical interviewer that I think containers, container infrastructures (like Kubernetes), and even cloud infrastructure are being overused, or used without much thought, as if they came free (in the sense of setup and operating complexity). Too bad the interviewer started rambling about how he was into Kubernetes those days :). (Actually, this was the most technical part of the interview.)

I'm mostly working with startups and small companies creating MVPs, and that was their client base too. Most of the time these are just CRUD apps, and most of the time the apps don't see heavy usage for years (maybe never). Developers love technology, love to play with new(ish) things, so quite a few of us will prefer whatever is new and hip for the next project. Right now it's containers and microservices. And it feels safe, because done right, these will give you scalability. And once you convince the client/boss that you need it, it's unlikely that anyone will come back in a year and say: hey, it seems we'll never need this thing that made development $X more expensive. (Partly because they won't know the amount.) So politically it is actually the safe choice. But professionally and cost-wise it's usually worse. It's a lot better to transition after seeing the need (preferably from the projected growth numbers). At least you minimize the expected value of the costs (because YAGNI).

I once got an interview from a company in the container space because one of their exec read an article I published talking about the trouble with container systems[1]. (Really good talk/interview, but I ended up not moving forward because I didn't want to move back to the west coast).

I've been in smaller shops that wasted a lot of time on K8s stuff and fell behind on their timeline. If you want to run k8s, DC/OS, etc. you need a lot of ramp-up time and at least 4 to 8 dedicated staff members. I've talked to other startups that preferred running Nomad instead due to setup complexity.

I doubt k8s will go the way of Open Stack since it does actually work, but I do think we'll see it limited to big-end enterprise systems and smaller startups will push forward with other, easier to build up clustering technologies.

[1]: https://penguindreams.org/blog/my-love-hate-relationship-wit...

I haven't been following OpenStack closely enough to understand the way you feel that it's gone or what I assume is an implication that it failed to work (in which sense?).

4-8 dedicated staff members to run k8s? Seriously, how did you come up with that number?

1) You can run k8s hosted on Google or DigitalOcean with zero effort.

2) I built a k8s cluster in 3 days with zero experience, after spending a week playing with minikube and reading the docs at kubernetes.io

Firing up minikube to toy with Kubernetes and switching your entire software stack to run on top of a production-grade Kubernetes cluster are two very different things.

> I built a k8s cluster in 3 days with zero experience after spending a week playing with minikube, reading the docs at kubernetes.io

I'm pretty sure I can do it in an afternoon from scripts on github. But if something goes wrong all bets are off. Just getting something setup is not building a competence around it.

How well would you, with zero experience managing and deploying a stack, do the same in an orthodox LAMP setting?

Honestly, the answer is only relevant if you actually have zero experience deploying any stack, including an orthodox LAMP stack.

If you have no experience with (stack Z), then you will have to go out and get some experience before opting to use (stack Z). The problem is, many people hear this and stop there.

While there are some barriers to experience and production-readiness, they are not insurmountable, and there may be a pot of gold at the end of the rainbow. There is a cost for everything. Sometimes it's an opportunity cost. (Sometimes the cost can also come from not acting.)

built != run.

not sure if I agree with OP on precise figures, but, if you're dealing with:

- actually federated k8s infrastructure (distributed etcd, etc.)

- enough system load that k8s is actually worthwhile and not just 'cool'

- user / dev requests

- whatever background IT projects are going on (updates / new nodes, existing upgrades, testing new configs, etc)

and you want:

- 24x7 operations (or even stringent 8x5)

- people to be able to use the restroom and take vacations and still have 1-2 sets of hands available

1 person doesn't quite cut it, because (s)he will quit eventually, and you will have no one.

Dev-centric mentalities of "it compiled, ship it, I made a fancy new futuristic feature, aren't I amazeballs" often don't take real operational concerns into account, which I think is precisely the point of critiquing k8s here.

And you feel like you understand the warts and pitfalls, suitable to keeping a mission-critical service running once the circuit breakers start to trip, based on 3-8 days of experimentation?

well, not the op, but i googled 'kindly pls to fix a kubernetes software on the RHEL servers' and I got this in a forum post:

    $ sudo apt-get install fix-muh-k8s && sudo fix-muh-k8s || reboot
so I think i'm good to go, thankyouverymuch.

My friend is trying his luck with his own startup; they have yet to launch a 1.0 of the product, but the CTO implemented K8s citing scalability. I legit laughed at that statement.

My friend has a startup and a single person got a k8s cluster running on aws with kops+GPUs in a couple weeks. He loves it. The people who are running it successfully don't come on here to complain.

The point is they have been coding away for more than a year without getting out of beta. The CTO spent 3 weeks learning k8s and setting it up, 3 weeks that could have been spent on features, or on ironing out bugs and releasing an actual 1.0 of the product. You can always move to k8s once you have paying customers and actual demand. Are you telling me you need k8s for not even 10 customers in beta?

For some use cases (GPU) it's considerably less expensive to spend the time learning kubernetes.

Yes you are right, but this is a glorified CRUD app.

> a single person got a k8s cluster running on aws with kops+GPUs in a couple weeks

In the context of your GP's comment:

1. How long would it have taken that one person (or the startup) to get started with AWS+GPUs without k8s?

2. How much effort would it take that person to debug an issue with if/when one springs up?

3. What happens when the single person goes on leave?

Probably a lot longer. Do you know of any other orchestrator that supported GPU scheduling natively 6 months ago?

Having installed OpenStack many times before, often with issues keeping it running:

> I doubt k8s will go the way of Open Stack since it does actually work

_ohh shit!_

Same with me; I've never had an interviewer actually give me a real rebuttal, other than "but it scales!!".

I love software, but I really get tired of the blind cargo-cult culture of most of the industry.

I'm not an interviewer, but:

* Version-controlled deployments of your application via K8s state files

* Once an application is live, you can easily deploy a new version and have traffic drained over for you

* Recovery of applications (Pods) that fall over

* Automatic replacement of your application when a piece of hardware fails (K8s will see the Pod has died and bring it back up elsewhere)

* You can use labels and other mechanisms to "slice up" your infrastructure for specific workloads

* The underlying hardware can be virtually anything: a high-end physical server, an EC2 instance or a Raspberry Pi -- you just define the resources available and what your application needs, and it's placed on the right hardware for you

* With a few Pods set up, you can have all the hosts and Pods on the cluster report their CPU, memory, disk, network, and other metrics straight into a Prometheus instance, and have Grafana dashboards up nearly instantly, for free (in virtually all senses of the word)

* New Pods are automatically added to the above setup, too

* Integration with cloud providers (so, for example, creating a service endpoint and saying you want it to be public results in K8s provisioning a public-facing load balancer and IP for you and managing that...)

* Much more...

This is all instantly available straight out of the box within 15 minutes with Kops on AWS.
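For reference, the kops workflow being described is roughly the following. (A sketch: the cluster name, zone, and S3 bucket are placeholders, and exact flags can vary by kops version.)

```shell
# Hypothetical S3 bucket holding the cluster's state/config
export KOPS_STATE_STORE=s3://my-kops-state-bucket

# Generate the cluster spec
kops create cluster \
  --name=k8s.example.com \
  --zones=us-east-1a \
  --node-count=3

# Apply it: this provisions the EC2 instances, autoscaling
# groups, load balancer, DNS records, etc.
kops update cluster --name=k8s.example.com --yes

# Later: check the cluster actually came up healthy
kops validate cluster
```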

I'm not arguing that K8s is automatically the correct answer, but to assume it's not an answer at all based on the feedback from interviews is a poor reason to dismiss it.

It does have a place and it doesn't have to be hard.

Hopefully I've been constructive here.

You've summarized it really well, been using k8s for the past year on a large microservice infra and it's been a pleasant experience. Most problems were flaws in the way our cluster was setup or our misunderstanding of the config options k8s provides.

> I'm pretty sure the reason they ended up rejecting me was that I mentioned to the technical interviewer that I think containers, container infrastructures (like Kubernetes) and even cloud infrastructure is being overused/used without giving too much thought about it as if it came free (in the sense of setup and operating complexity).

I'd like to understand your thoughts more on why you believe cloud infrastructure is being "overused/used without giving too much thought ..." and more specifically, what the other options are.

I've come from a background of racking physical servers, plugging them into a network and having a PXE process install the OS for the client. It took a day to provision a single server in an enterprise hosting environment. It was mostly automated.

Speaking strictly "in the sense of setup and operating complexity", I'd love to know your thoughts on how dedicated physical servers in a local DC can outperform cloud based infrastructure in terms of (vast) availability and per-second billing. I don't think you could and I'd even be willing to pay for us to do an experiment: you call your local DC and have 30 high-end servers provisioned faster than me using the command-line.

You also put "... containers, container infrastructures (like Kubernetes) ..." under the same banner. I'd like to address this also, but to be fair and honest, I agree that Kubernetes is heavily overused and so I won't address it here. I'm mainly interested in how you consider containers to be overused, given they're simple as a concept and equally easy to get in place.

Put another way: in what way have you seen containers being abused? I want to avoid doing that my self and would love your thoughts on the matter.

To continue, if we take a rack full of high-end physical, dedicated servers and we want to deploy a Ruby on Rails application (a very powerful, common software stack), how would you sell me a bare metal, direct-to-OS deployment of a Rails application versus using containers to deploy the same application?

Two of the biggest benefits to containers that make me put the effort into deploying them is portability and security. It's one "box" you have to logistically ship to a system and one command to open it and have its services supplied to the network. If the box is hacked due to an exploit, the hacker is trapped in the box and isn't roaming around the host server's file system looking for credit card details.

There's a good reason Discourse, for example, only supports Docker as a means of deploying their application: it makes it easier.

> Now it's containers and microservices.

Yeah, microservices have been blown way out of proportion in our industry. They're amazing and great when you're Netflix, Facebook, Google, or Amazon, but there are only four companies that are that big and I just named them.

I may have been sloppy to use the cloud infrastructure expression. It can mean different things to different people. Sure thing, you want to use at least VPS-s, from the very beginning. Maybe even blob storage as/if needed. But probably that's it at the beginning.

> Put another way: in what way have you seen containers being abused?

Well, I wouldn't say abused, but they aren't really necessary a lot of the time. E.g. if we talk about a small company with a single product (which, the developers being sane, is a 'monolith' as opposed to being broken into microservices), then containerizing the thing really adds nothing. The server will just have this one component anyway. And even if not, you can install multiple apps on the same server easily. You're unlikely to have dependency conflicts even with multiple apps, since we're talking about a small org and small software, and of course whatever platform you use (JVM, Node, Ruby, Python) comes with its own solution for this (jars, npm, gems, virtualenv). The only problem you may have is if apps (probably through their dependencies) depend on conflicting OS-level packages. Then you need (or are better off with) containers.

> To continue, if we take a rack full of high-end physical, dedicated servers and we want to deploy a Ruby Rails application (a very powerful, common software stack), how would you sell me a bare mental, direct-to-OS deployment of a Rails application versus using containers to deploy the same application?

So we use VPSs. Which kind of solves the issue in itself. Also, the context I'm talking about is the small company/startup hosting their own SaaS solution. Yes, if you want to install 3rd party apps, containers are pretty convenient. I remember wrestling quite a lot with installing GitLab, RedMine and Sentry locally 4-5 years ago. (Not sure about the problems with the last one.) Wasn't fun.

On security: I'm far from being an expert but I do know that LXC (the containerization solution docker uses) is not really meant for security purposes. A kernel exploit is enough to break out of them. See e.g.: https://security.stackexchange.com/questions/169642/what-mak... . So if your container is compromised, then your VPS/server is compromised. Which means that you probably don't want to deploy these 3rd party apps on the same VPSs as your SaaS. But that kind of goes without saying.

> ... if we talk about a small company with a single product (which, the developers being sane, is a 'monolith' as opposed to being broken into microservices) then containerizing the thing really adds nothing.

I totally agree with the monolith thing. Did that myself recently too. I don't agree about it not being helpful to containerise them, though.

I've containerised a recent Go application I wrote. All it contains is the binary, an assets/ directory for CSS, and a templates/ directory for the HTML server-side templates. Deploying a new version consists of 'make docker-push' followed by 'make docker-run' on the remote server(s). All the Makefile is wrapping is Docker build, push, pull, rm, and run commands. Hardly taxing work.
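As a sketch of what such a Makefile wraps (the registry, image name, tag, and port here are all made up), the whole deploy really is just a handful of Docker commands:

```shell
# Build and tag the image locally (hypothetical registry/name/tag)
docker build -t registry.example.com/myapp:1.4.2 .

# Push it to the registry
docker push registry.example.com/myapp:1.4.2

# On the remote server: pull the new image, replace the old
# container, and run the new one
docker pull registry.example.com/myapp:1.4.2
docker rm -f myapp 2>/dev/null || true
docker run -d --name myapp -p 8080:8080 registry.example.com/myapp:1.4.2
```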

It's also much easier to version control components when they're all placed into a neat and tidy "box" (image.) And because changes are just diffs of the filesystem, it's absolutely fine having 5,000 versions of your application because like branching in Git, it's cheap.

An example of this is wanting to change a library your application is using and trying it out. Super easy: change the lib', package it into an image with a specific version for that "product", deploy it, it fails (let's say), and you redeploy the previous image and you're back up.

Not to sound too arrogant or obnoxious, but anything other than the above is creating more work for oneself for no reason at all.

> So we use VPSs. Which kind of solves the issue in itself.

Do you pay per VPS? Sounds expensive. Would it not be cheaper and easier to have a "cluster" of VPS and run Docker on them, containerising your application? No need for K8s or any kind of orchestration, but using VMs as a "container" (like using AMIs for example) seems like a heavy, expensive option? I could be wrong.

I think you're solving the problems containers do using a heavier, weightier solution that's also slower and more expensive. Again could be wrong (and happy to be proven so)

> A kernel exploit is enough to break out of them

It would have to be an exploit that gets you a shell or the ability to execute code on the host. They're quite rare, I believe.

> I've containerised a recent Go application I wrote. All it contains is the binary, an assets/ directory for CSS ...

So basically you're using Docker in place of tar here. Not saying it's not convenient, but it doesn't give you anything beyond the tools we already had. You create an installation package locally (it could be a deb, a tar file, or even just a local directory) and then transfer it to the deployment host. Yes, that's how you deploy :). You still need the same number of commands to copy the files into the target file system (in this case a Docker image, but again it could be a tar or a local dir), and then a command to copy it to the server. Instead of docker push you could use scp (for the tar) or rsync (for the local dir).

You can actually deploy just the diff with rsync: it can work with two source directories. One would be a remote dir (the current version of the deployment) and the other the local one (the new version), and it creates a copy of the new version based on both, transferring just the diff. This works whether you're trying a new version of a lib or a new version of your app.

It's similar to how I deploy the projects. I use fabric (for copying files and executing remote commands, e.g. for db migration) in place of a makefile, but the idea is the same. Sure, I had to create it once but it's mostly reusable between the projects.

> Do you pay per VPS? Sounds expensive. Would it not be cheaper and easier to have a "cluster" of VPS and run Docker on them

Again, let's remember the context/use case I'm talking about: a company running their (monolithic) SaaS product. You don't have dozens of low traffic services that you want to host isolated. You have the same app all over your infrastructure with maybe a few services for ops purposes (monitoring, logging, error logging, etc). But you could just buy hosted plans for these. If not, you can stash them onto a few VPSs. Docker or not doesn't matter, whatever the most convenient way to install them. These aren't public facing.

So no, it doesn't result in unused/wasted capacity. What you suggest is workable of course, but you may end up with something that is more complex/expensive to operate than going with an existing orchestration solution. (Unless you suggest deploying something simpler than K8s, but still a third party/open source solution.) But even then, it's more complicated than not dealing with containers.

> So basically you're using docker in place of tar here really.

Not even in the slightest. They don't even compare... at all.

By using a TAR archive of your deployment files, you're completely sidestepping all the benefits of isolating an application into a container, whether it's a monolith or 56 microservices.

First up: portability and allowing anyone, on any hardware, and any OS to run the application with a single command. All they need to do is install Docker and "docker run" will do the rest. With a TAR file based deployment, they need TAR, they need to know how-to deploy the application and manage the files; they may need some runtime on their system like Ruby, Python or NodeJS; and they cannot restrict the resources the application is using...

On the back of the above: for your developers to fire up a new version of the application that a colleague may have written requires muddling around in Git and using branches -- this interrupts workflows and can lead to loss of work if "git stash" isn't used right. With a container: they just run it.

Secondly, you can restrict the resources an application uses much more easily inside a container than outside one.

I'll make one final point here: Docker Compose. You're missing out on a lot of free stuff here. It's so easy to write a Docker Compose file that brings up everything needed by the application - databases, caches, mocked APIs - allowing anyone in your organisation to bring up the application on their laptop and play with it, including the CEO, DBAs, QA, the receptionist.
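For example, a minimal Compose file along these lines (the service names, app image, and password are illustrative only) lets anyone in the organisation bring up the app plus its database with one command:

```shell
# Write an illustrative docker-compose.yml
cat > docker-compose.yml <<'EOF'
version: "3"
services:
  web:
    image: registry.example.com/myapp:latest   # hypothetical image
    ports:
      - "8080:8080"
    depends_on:
      - db
  db:
    image: postgres:10
    environment:
      POSTGRES_PASSWORD: devonly   # dev-only credential
EOF

# One command for anyone: developers, QA, the receptionist
docker-compose up -d
```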

You're ignoring all of this because... well, I can't put my finger on it, to be honest. Maybe you enjoy hurting yourself? lol

Good luck. I don't feel it's worth responding to future comments on this thread. You're clearly unwilling to listen or try new things.

> It's similar to how I deploy the projects. I use fabric (for copying files and executing remote commands, e.g. for db migration) in place of a makefile, but the idea is the same.

Haha! No, they're not the same idea. One is idempotent and the other is not. That's a big difference. And why Fabric? Why not Ansible or Salt Stack?

Man what a way to make your life more difficult for no reason.

When you deploy new versions of your app, is there a short downtime between the moment you stop the old version and the moment you start the new one?

At present, yes. It could be easily overcome with a simple A/B deployment mechanism.

I used to do that before switching to Docker swarm mode (docker stack deploy automates rolling deploys with zero downtime). In my experience, it's not that simple to do it correctly. You have to start the new containers, wait for them to be ready, then switch traffic to the new containers, drain connections to the old containers, and eventually stop the old containers. You need some kind of reverse proxy to do this, or at least iptable rules.

I'm curious: how would you do it? Is your app directly exposed on the Internet, or is it already behind a reverse proxy?

You're not wrong. It's not simple, but it's also not that hard, really.

Yeah I use a DigitalOcean load balancer in front of the servers. At this point in time the application serves my family and friends. It's not critical. And even when or if it does become public, even then I would just use maintenance windows.

Not sure what the benefit of zero-downtime deployments is, to be honest. They make things complicated to avoid a few seconds of downtime while switching traffic over. You can even make a fancy maintenance page that checks for the backend coming back up and then redirects traffic... not that hard at all.

I might consider Swarm. Probably not though. I do agree those technologies can complicate matters but they can also make life easier. I guess it's a balancing act, isn't it?

Thanks for following up!

> Yeah I use a DigitalOcean load balancer in front of the servers.

Why the load balancer if the app only serves family and friends, and maintenance windows are okay? It should be able to run on a single server.

> They make things complicated for a few seconds of downtime

In some cases, it's more than "a few seconds of downtime". In one of the apps I maintain, some requests accept file uploads, and some stream long .csv files that are generated on demand. These requests can take several tens of seconds. If the system is unable to start a new version of the app, and switch traffic to it, while keeping existing connections to the old version, then we're talking about at least 1 minute of downtime.

We've thought about simplifying things using a "maintenance page that automatically checks for the backend coming back", but 1 minute is just too long for our paying users.

Some days, we deploy new versions multiple times. Our paying users would be unhappy about the multiple interruptions :-)

That said, with the advent of single-page applications, maybe it should be okay to transparently retry requests on the client side while the app is restarting on the server side (returning 503). No maintenance page. No complex zero downtime logic.

> I might consider Swarm.

We were already using Docker and docker-compose. We manage zero downtime deployments using docker-compose and nginx as a reverse proxy, scripted with Ansible but it was a bit too hacky for my taste. It's when I started to consider Docker swarm mode. And it's really great, even on a single-node, especially when you're already using docker-compose. I'm oversimplifying a bit, but it's basically running `docker swarm init` and replacing docker-compose with `docker stack deploy`. I recommend it.
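That workflow really is about that small. A sketch, assuming an existing compose-format file (the `update_config` snippet shown in the comments is what makes the rollout start-first, i.e. zero downtime; requires compose file format 3.4+):

```shell
# Turn the single node into a one-node swarm
docker swarm init

# In docker-compose.yml, per service (illustrative):
#   deploy:
#     replicas: 2
#     update_config:
#       order: start-first   # start new task before stopping the old

# Deploy -- and later re-deploy -- the stack; swarm handles
# the rolling update and connection draining for you
docker stack deploy -c docker-compose.yml myapp
```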

> Why the load balancer if the app only serves family and friends, and maintenance windows are okay? It should be able to run on a single server.

I use DO to manage the DNS for the domain. I use their LB to manage the TLS cert for the domain. By doing this I don't have to manage TLS connections in my application nor deal with certification creation and rotations. DO are doing it for me for $10/month. That's a bargain given my time is worth $100/hour.

I can also easily swap out the backend server, or add more, as I see fit. This falls in line with your zero-downtime argument. The LB is there and ready to go should I want to attempt such things :)

> In some cases, it's more than "a few seconds of downtime". In one of the apps I maintain, some requests accept file uploads, and some stream long .csv files that are generated on demand. These requests can take several tens of seconds. If the system is unable to start a new version of the app, and switch traffic to it, while keeping existing connections to the old version, then we're talking about at least 1 minute of downtime.

But this is the key point of it all: your application and its users demand zero-downtime. My mother (in-law) does not :)

> I'm oversimplifying a bit, but it's basically running `docker swarm init` and replacing docker-compose with `docker stack deploy`. I recommend it.

Looking into it right now. Thanks a lot for the recommendation.

Right, DigitalOcean load balancer is really handy!

I'm curious to know what you think of swarm after you tried it :-)


It's very good. I will likely deploy my application to it and eventually, move an open source hosting platform I'm building into it too. It's simple and gets to the point.

Thanks for the recommendation :)

Hi. Glad you liked it!

I like Docker swarm lightness and simplicity, but I regret there is no official and built-in solution to automatically attach/detach volumes associated to a service (for example backed by Amazon Elastic Block Storage, Google Persistent Disks, or DigitalOcean Block Storage volumes).

I also miss the concept of pod (the ability to colocate multiple containers on the same machine and let them share the same networking stack), and the ability to launch one-off jobs (for example to run database migrations).

On a single-node cluster, these problems don't exist, which paradoxically makes Docker swarm a great solution for single-node clusters :-)

"Anyways, the point I am trying to make is you should use whatever is the easiest thing for your use case and not just what is popular on the internet. "

This is good advice in theory but in the real employment world you are killing your own career that way. At some point you get marked as "dinosaur" that hasn't "kept up". Much better to jump on the latest tech trend.

I get the sentiment here, but I don't think it's strictly true. The way I look at new technology is that I need to know enough about it to either discount it, or choose to use it. So long as I know what I'm talking about when I tell a prospective employer that I advise not using technology X, then they typically understand that I have the knowledge to make that decision.

So your advice should be: learn about the latest tech trend, try it out, and then have an informed opinion.

" try it out,"

How much time do most of us have to really "try out" something deeply enough to form an informed opinion?

Informed enough to talk (read: bullshit) your way through an interview. Honestly though - think about all the things on your resume that you said you had experience with (when it was more a resume of hope and less a CV of experience). Was it actual, working knowledge that was applicable to your professional career, or was it passing knowledge from that time you followed a few tutorials?

I don't ask that to denigrate you. I did it. Lots of my peers did it. It's part of this silly game we play for employment. We complain about needing to pad resumes to get our foot in the door, but when we get to make hiring decisions, we automatically bin resumes of students who only put knowledge of one language and a handful of basic tools.

I don't know.

I've always found having an actual, functional product to be more impressive than a list of buzzwords on your resume. And most of the buzzword-driven development doesn't usually lead to a functioning system.

If you're buzzword compliant, you can fail forward. Your last product may not have worked out, but you've become an expert in Docker, Kubernetes, AWS, OpenShift, and Terraform which means that companies who are committed to the cloud (i.e., everybody) won't pass you over.

Built an app that works before the cloud hype hit? Congratulations. You're a specialist in legacy technologies. We'll call you if we have a COBOL or Perl app that needs maintenance. snicker

> That so many are ready to live by luck, and so get the means of commanding the labor of others less lucky, without contributing any value to society! And that is called enterprise! I know of no more startling development of the immorality of trade, and all the common modes of getting a living. The philosophy and poetry and religion of such a mankind are not worth the dust of a puffball. The hog that gets his living by rooting, stirring up the soil so, would be ashamed of such company. If I could command the wealth of all the worlds by lifting my finger, I would not pay such a price for it.

-- Henry David Thoreau, "Life Without Principle"

I believe that is called Resume Driven Development. I see it in my org quite a bit. There are pros and cons. One of the cons is the academic exercises that have no bearing on customer requirements and last far too long. Mgmt won't kill these projects off for fear of losing good talent, so we end up with loads of shiny things that nobody wants to support and nobody will admit that we don't need.

"If you're buzzword compliant, you can fail forward. "

Exactly. You don't even need to have delivered something useful, but you are still a hot commodity on the job market. Same with a lot of "data scientists": having used the tools, with or without success, is sufficient to get hired.

No companies worth working for hire that way. What you are describing is a way to get jobs at startups or non-tech companies that don't have a good grasp on fundamentals.

When your only alternative is starving, you tend to adjust your expectations of a company "worth working for" downward. I've been faced with that alternative enough times in my life to think it wise not to rule out any options.

From my experience, a lot of companies (not all) work that way. I am a little stuck in some legacy tech, and during interviews it looks like they can't even conceive that someone may learn their stack quickly.

I'm so annoyed that this is true. I hate this, and I'm in the JavaScript world, which is worse than anyone else in this regard as far as I know.

Incidentally, this "don't follow trends" advice ironically seems to carry the most weight when it comes from tech celebrities, frequently those who built a name by being experts in trendy technologies. When someone who's not trendy says "don't follow trends," either no one listens or it's taken as proof that they're a dinosaur. (No shade on Jess intended here, she didn't create this phenomenon.)

Most organizations don't need to manage servers or Ansible playbooks either.

The reason Kubernetes became so popular is because the API was largely application-centric, as opposed to server-centric. Instead of conflating the patching and configuration of ssh and kernels with the configuration of an application, you had clearly separate objects meant to solve different application needs.

The problem with Kubernetes is that to gain that API you need to deploy and manage etcd. To bring your API objects to life you need the rest of the control plane, and to let your objects grow into your application you need worker nodes and a good grasp of networking.

This is a huge burden in order to gain access to K8's simple semantics.

GKE helps greatly, but the cluster semantics still come to the forefront whenever there's a problem, or upgrade, or deprecation, or credential rotation.

Of course there's always a time for worrying about those semantics. Specialized workloads might have some crazy requirements that nothing off the shelf will run. However I think the mass market is ready for a K8s implementation that just takes Deployments and Services, and hides the rest from you.

In lieu of that, people will just continue adoption of App Engine and other highly-managed platforms, because while you might not need Kubernetes, you almost certainly don't need to go back to Ansible.

Ansible isn’t just ssh though. In principle you could have a k8s_deployment role for example.

Most playbooks are host-oriented, but one can write k8s playbooks that are cluster-oriented.

I honestly don't understand the amount of negativity towards Docker and Kubernetes sometimes.

All major cloud providers have a managed k8s service, so you don't have/need to learn much about the underlying system. You can spend a few days, at most, to learn about Docker, k8s configuration files, and Helm, and you're pretty much set for simple workloads (and even Helm might be overkill).

Afterwards, deploying, testing, reproducing things is, in my opinion, much better than managing your applications on random servers.

Might I be wasting some money on a k8s cluster? Maybe. Do I believe the benefits outweigh the money? Absolutely.

I honestly think this website's negativity towards stuff stems from not understanding use cases and being a general curmudgeon.

"All major cloud providers have a managed k8s service, so you don't have/need to learn much about the underlying system. You can spend a few days, at most, to learn about Docker, k8s configuration files, and Helm, and you're pretty much set for simple workloads (and even Helm might be overkill)."

This is the reason why I use k8s. It is ridiculously easy to deploy applications and I don't have to worry about hardening the VM.

I am interested in people's opinion on the "break even point" between using Kubernetes and not using Kubernetes. Let's pretend that the only options are Kubernetes and something substantially less powerful.

What is the simplest/easiest personal project where using Kubernetes might be justified?

I am a junior software engineer trying to figure out how to contextualize all of these container/container management systems.

This is a little bit negotiable, but it's where I'd start considering Kubernetes:

1. at least six independent twelve-factor-app services with their own datastores and a need for high availability across all of them and a near-complete understanding of the high-availability interactions between instances of your services

2. an inability to predict ahead of time where your system's hot spots are, necessitating rapid scaling of different parts of the application

3. a willingness to overspend on capacity to be able to respond to scaling events or deploys in seconds rather than minutes

4. a code-focused ops team (as opposed to a mouse-driven ops team) with extremely strong diagnostic skills and the bandwidth to babysit a service with a potential pain-in-the-ass ceiling around that of a Cassandra cluster

Without #1, you don't have enough variation in systems to benefit; just stick a monolithic application in an autoscaling group. (Most people should do this.) Without #2, you can lean into the hot spots of your application by scaling them horizontally--bear in mind that you'll be paying for capacity you don't use with k8s in order to get that environmental reactivity, so you could just spend that on making your hot spots faster. Without #3... well, that one's pretty obvious when you look at things like EC2 instances, which are more easily partitioned and can be spun up in smaller/cheaper groupings; their primary downside is that spinning one up takes longer than deploying a container. And without #4, you're gonna go off the cliff.

Reasonable people can nibble at the edges. But to answer the thrust of your question: it's probably never reasonable to design a personal project around k8s unless the point of the project is to be done on k8s.

> 4. a code-focused ops team (as opposed to a mouse-driven ops team) with extremely strong diagnostic skills and the bandwidth to babysit a service with a potential pain-in-the-ass ceiling around that of a Cassandra cluster

Here it is running fine... running fine.. running fine... aaaaand there's a compaction-and-gc cycle of death and fire and lost data and tears. Thank you for this terrible memory.

I was going to say "we've all been there," but we haven't, and that's the deceptive thing about the five-minute-demo culture that a lot of "devops" has gotten into.

Everything is easy when it has nothing riding on it. When it isn't is where the value of a tool comes into focus.

Yeah I'm dealing with crap out of that area atm.

I've been recently asked why I'm extremely restrictive and careful with our primary production cluster. Well, we got 20k+ full time employees of our customers depending on this system for their everyday work. An hour of downtime of this thing will cost our end customers 20k man-hours of work done in a worse way.

We're not touching the tooling this system sits on without good reason and a lot of testing. And even then I'll be bloody scared. Sorry modern world, but in this case, I'll be wearing my hard ops hat.

IMO, Kubernetes is one of those things where if you have to ask, you don't need it. It's only really "justified" if you're actually using features like:

- High availability services (more than one copy running at once).

- Service discovery (services talking to each other in a resilient way).

- Ability to automate operational tasks.

- Rolling deployments of services.

Very few personal projects will tick those boxes -- by the time they do, they've usually evolved into a full-on "real project".

Doesn't mean you can't mess around with Kubernetes for fun and learning, of course. But from a purely practical perspective, it's overkill unless you have all of those requirements above. (If you just have one or two of those needs, there is usually a simpler tool to fulfill it with less overhead).
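For concreteness, the first two boxes above (plus rolling deploys) map roughly onto a Kubernetes Deployment and Service. This is a minimal sketch only; the names and image are placeholders:

```yaml
# Three replicas behind a stable Service name: high availability
# plus service discovery. The rolling-update strategy replaces
# pods one at a time on deploy.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example/web:1.0   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web   # other pods in the cluster reach this as http://web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```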

If you need high availability, you get into the second-order effects: consider the risk from the complexity of the HA setup, your lack of experience with its failure modes, and your lack of low-level access to the managed Kubernetes service. If you are not a seasoned SRE, there are a lot of "unknown unknowns" waiting for you around the corner.

Yeah, great point. Kubernetes done right can help with these things, but done wrong, it can cause more problems than it solves. Of course, for a personal project, hitting those "unknown unknowns" is all part of the growing process, but in a business context I would be even more hesitant to adopt K8s unless you already have an ops team that's ready to support it.

Like with anything, it should be evaluated based on what your needs are.

For a simple deploy you probably don't need Kubernetes or even containers.

If you are running containers, you'll need a mechanism for running them. And maybe at some point you want something to recreate them when they die or become unhealthy. Maybe you want to run multiple containers, and you want to do rolling deploys of them. Maybe you want to run them on multiple hosts and network them together. Maybe you want to be able to attach a persistent disk to some of them, or interface with some secrets management software. And maybe you want a single, well-supported API for doing all of the above.

There's a lot more that can be said about Kubernetes; it offers a lot out of the box as well as an API for extending it when you need behavior it doesn't provide.
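As a sketch of two of those wants -- attaching a persistent disk and interfacing with secrets management -- here is roughly what that looks like in Kubernetes, assuming a cluster with a default StorageClass and a pre-existing Secret (all names and the image tag are placeholders):

```yaml
# Claim a 10Gi disk from whatever storage backend the cluster
# provides, then mount it into a database container whose
# password comes from a Secret rather than the image or config.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: db
      image: postgres:11              # placeholder version
      env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials    # assumed to already exist
              key: password
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data
```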

Maybe someone here can help me figure out what I need, since the world of containers is growing faster than I can understand.

I have one code base that I run on multiple servers/containers independently of each other. Think Wordpress style. I used to run it on Heroku but I switched to Dokku because it's substantially cheaper and I don't mind taking care of the infrastructure. I like Dokku but I do worry about being tied to just one server and not being able to horizontally scale or easily load balance between multiple servers/regions. Ideally what I'd like is Dokku with horizontal scaling built in. I've seen Deis and Flynn but they seem less active/mature even than Dokku, which is saying something.

Is Kubernetes the right answer here or should I stick with Dokku and forget about horizontal scaling?

Kubernetes isn't the only thing around. Kubernetes and Mesos are kinda the heavyweight solutions, but there are smaller things around like HashiCorp's Nomad and Docker Swarm, and probably a lot more I don't know about.

We're currently evaluating Nomad, and it's surprisingly pleasant. Nomad doesn't solve every problem every application in every situation might have; it schedules containers, VMs, or whatever else on hosts. This reduces complexity a lot.

It took us like 1-2 man-weeks to have an almost arbitrarily scalable Nomad setup which allows you to submit a bunch of jobs and mark some public ports for a load balancer, be it MySQL, HTTP, whatever. And it's easy to understand and operate. There are 3 nodes of Consul, 3 nomad-server nodes, and 2 nomad-client hosts, some certs in the middle, and consul-template + HAProxy with a config almost straight from a blog post. That's it. It has very few moving parts and it's easy to understand and troubleshoot with the 2-3 main guys on our ops team. (EDIT: that wasn't clear: we have 2-3 guys on our ops team. They are not working on Nomad alone. Nomad atm is a low-maintenance system and we're mostly dealing with other crap. /EDIT)

And now we're just going with it for now. Our CI needs resources for on-demand test-systems, so let's figure out how to make that happen. Our self-service test system for demos / manual acceptance testing needs resources for systems so let's figure that out. We might need to use gluster or something for persistent storage if we want to migrate internal tooling to this. A sister company might want to tinker around with windows VMs scheduled by nomad, or windows containers, so why not?

But the good thing: It took us 2 weeks to start delivering business value. That's a relatively small up-front payment for an established company, even a small one. Now we can leave it alone for some time, or we can invest some more well defined packages of time to make it better in concrete, requested ways. That's easy to schedule and prioritize.
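For anyone curious, the consul-template + HAProxy glue described above might look roughly like this; the "web" service name is purely illustrative:

```
# haproxy.cfg.ctmpl -- consul-template renders this template and
# reloads HAProxy whenever the "web" service's membership in
# Consul changes, so the backend list tracks the live instances.
backend web_backend
    balance roundrobin{{ range service "web" }}
    server {{ .Node }} {{ .Address }}:{{ .Port }} check{{ end }}
```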

Just came here to second this. We evaluated both Kubernetes as well as Nomad for a relatively small cluster of some worker nodes and web services. In the end, the ease of standing up a Nomad cluster and the whole feel of the thing won us over.

Nomad is a single Golang binary that you can run on your laptop to have a fully working Nomad client and server, along with a built-in UI and command-line tools (same binary). The story for production is the same: throw the binary on your server, set up a systemd unit to run it, and you have another Nomad node.

If you're evaluating container schedulers and are not sure what you need, take an afternoon or so and just run it locally and play with it. If there aren't any specific features of Kubernetes you could point to that Nomad does not meet, my suggestion would be to get started with Nomad first.
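To give a flavor of the job definitions you'd be playing with: a minimal job in recent Nomad versions looks roughly like this (a sketch; the job name and image are placeholders):

```hcl
# A minimal Nomad job: two instances of an nginx container, each
# with a dynamically assigned host port mapped to container port 80.
job "web" {
  datacenters = ["dc1"]

  group "app" {
    count = 2

    network {
      port "http" {
        to = 80
      }
    }

    task "server" {
      driver = "docker"

      config {
        image = "nginx:alpine"
        ports = ["http"]
      }

      resources {
        cpu    = 100  # MHz
        memory = 128  # MB
      }
    }
  }
}
```

With a dev agent running (`nomad agent -dev`), this would be submitted with `nomad job run web.nomad`.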

Kubernetes is a single binary, too (hyperkube).

Can you give an idea how big your cluster is? 10, 10s, 100s?

I'm curious how well Nomad works in reality, but it's hard to find people running it.

Right now we're 4 weeks in and IT and 2 guys want it. We're 3 server nodes + 2 client nodes big, 128G RAM, 16 cores, 32 threads. Please go ahead and call this tiny and irrelevant. I'm aware and under the same assumption. I've dealt with individual servers around 4x that size.

But go ahead and ask me again in 3 or 6 months. We'll migrate around 10 HW servers as nomad clients, we'll probably get HW capacity from our sister company, and then we'll start interacting with the 2 other sister companies. If we get all of their resources into nomad, we'd be up to 60-80 beefy metal servers. And those 2 other sister companies are in pain for easy Windows resources. If Nomad can do that properly, I'll be sold on it quite hard. It'll be fun.

I'm not the person you're directly replying to but I'm the guy who originally asked the question at the top level, and I can say your needs right now are far bigger than mine. Right now my 5-container Dokku host has 4GB of RAM with 2 CPUs, but we're expecting to ramp up quickly.

It's good to hear your input as I expect to be a similar size to you in a year or two. Far from irrelevant. I'll have to look into it.

The problem is, if you come on here and talk about how great Nomad is and only later say you are running five nodes, that gives people false impressions. Kubernetes is tested regularly with 5,000 nodes. Getting to that level takes an entirely different class of software, able to run a cluster that large reliably without issues.

Isn't the entire point of this thread that Kubernetes is a big solution made for big problems, and smaller problems might not need that big of a solution?

The comment was in response to a question I asked about "do I really need something as big as Kubernetes". Not sure why people are picking apart this answer when it seems to answer my question quite well.

The problem with this thread, and threads like it, is that they turn into "nobody needs Kubernetes when xyz exists," and they completely ignore the fact that Kubernetes is not meant for small shops.

I didn't see that yours was in the context of a small cluster, since I was reading the whole thread, so I apologize. It seems that every single comment is roughly the same as yours, though, and they act as if anyone who chose Kubernetes/OpenShift is clearly an idiot. There's a reason big clusters use it, but the people running the large clusters don't come on HN to comment. Look at the OpenAI blog, for example.

There's a reason the community is so large. There's a reason people are developing tons of tools around it.

It's not because they're stupid.

The article mentions Stellar (https://github.com/ehazlett/stellar/) which may be what you need.

I am in the exact same boat as you are, and would love some feedback as well! I have been considering Kubernetes so that I can allow horizontal scaling, but haven't done it yet.

For my use case I am VERY well aware that Kubernetes is overkill, but I don't see a great middle ground between Dokku where I am now, and Kubernetes.

I have found Docker Swarm Mode to be a good middle ground. Essentially, it's docker-compose which you can run on multiple servers.
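To sketch what that looks like: a compose file gains a `deploy` section for Swarm mode, and the same file can be pushed to a cluster with `docker swarm init` followed by `docker stack deploy -c docker-compose.yml mystack`. The image here is a placeholder:

```yaml
# docker-compose.yml -- usable with plain docker-compose on one
# machine, or with `docker stack deploy` on a swarm, where the
# deploy section below takes effect.
version: "3.8"
services:
  web:
    image: example/web:1.0     # placeholder image
    ports:
      - "80:8080"
    deploy:
      replicas: 3              # spread across swarm nodes
      update_config:
        parallelism: 1         # rolling update, one task at a time
```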

I've started experimenting with Swarm myself but I know one guy who has a histamine reaction every time I mention it. I haven't gotten (or really tried to get) a straight answer out of him about why it's a bad thing.

If you have a countable number of instances of a couple dozen services and one or two "lots of instances" services it seems like it should suffice just fine. And really, my brief sojourn into Kubernetes left me feeling like I bumped into road blocks for anything more complicated than that. Especially with sidecars, or if your tool chain uses multiprocessing instead of threading (Rails, Node, Python?, etc) to deal with concurrency.

If there's nothing wrong with what you have, keep it.

There's also theoretically nothing stopping you from setting up a similar server, standing up a load balancer for incoming traffic, and using that if you need some redundancy. That's roughly what I do, but with docker-compose and a cron job.

Tooling like Kubernetes might make sense for larger operations but, for me at least, it doesn't seem to make sense and won't make a significant difference in my day to day administration

Given the limits of applications like Flynn, Deis, and Nomad, and having tried all of them, I would be inclined to point you towards just going full Kubernetes (specifically, Rancher 2.0).

It’s a fair amount of effort getting your head wrapped around kubernetes, but the same can be said for the other packages.

At least kubernetes seems more likely to be useful in your further career as well.

Never really tried it, but when I was looking for a Dokku and Flynn alternative, I found this: https://captainduckduck.com/

IMHO, if you are lucky enough to have the problem of scaling beyond Dokku on a beefy, dedicated server, it's time to hire someone for devops. YMMV depending on your stack, of course.

I haven't used it myself, but https://flynn.io/ seems to be targeted at that use case.

Last release was in 2017. I don't think Flynn is still alive, which is a shame.


not at all in the loop here..

but release != alive; alive != new code, etc etc etc

I always check the Contributors graph of an OSS project I'm thinking about using: https://github.com/flynn/flynn/graphs/contributors

I found Flynn.io to be a good step up from Dokku - clustered, no containerization knowledge needed.

You probably don't need microservices either - it's insane how much money and time is being thrown away on these industrial-strength hammers by companies that simply don't need them.

So true! I think the Rick & Morty reference alone speaks volumes for everything. haha

Depends on the scale. If I only had 10 containers to manage, I'd throw them on an m4 and let it be. The benefit of using k8s kicks in when your use case gets complicated.

This. At some point you get tired of trying to find an underutilized box to launch a new service. This is when you should start looking for something more complicated.

> This. At some point you get tired of trying to find an underutilized box to launch a new service. This is when you should start looking for something more complicated.

But that is the point of containers: to abstract away the need to worry about "finding underutilized boxes." The second you said "find an underutilized box," my mind went immediately to a pile of what I call "special snowflake" boxes. Each one is its own unique thing. Symptoms usually include having cute names for each server instead of random, machine-generated names. That shit is hard and expensive to maintain--especially in a fault-tolerant way.

The best way to treat your infrastructure is to make it as ephemeral as possible. Absolutely any part of it should be able to go out of commission at any time and a new instance should go online to replace it.

If you are loading up multiple things onto a single m4 instance, I'm gonna say right now you are using AWS wrong....

That should have been:

At some point you get tired of trying to find an underutilized box to launch a new Docker container.

At this point any service should just be a Docker container, but you still have to find a home with enough memory, CPU, and temp space to run the container.

Perhaps three parallel m4's distributed across availability zones with identical sets of containers deployed to them, but yes.

Even m4/m5s are already on the expensive side compared to something like Hetzner, especially if you don't need crazy peak scaling.

What's m4?

Probably the AWS m4 instance type, meant for general purpose workloads (and replaced with the m5 instance type about a year ago).


AWS instance type.

"M4 instances provide a balance of compute, memory, and network resources, and it is a good choice for many applications."


GP probably references an AWS EC2 machine type: https://aws.amazon.com/fr/ec2/instance-types/

I guess they're talking about the AWS EC2 instance type.


It's the bigger brother of the m3.

AWS ec2 instance type

It's a bit funny to ask this question in this thread, but here we go:

What are the important topics & technologies to learn about with these types of topics? My uni experience didn't really include things like distributed systems or containerization.

Ideally fundamentals that won't be invalidated in 5 years when 'the new thing' becomes something else.

(Love good book recommendations on any subject a new grad should learn, not just this topic)

Kubernetes may be overkill for small projects, and it's actually hard to set up for a single-machine cluster, but the idea of container orchestrators (k8s, Docker Swarm, Nomad, etc.) is extremely useful. I understand that some abuse the word "scale," but for me container orchestration is far bigger than just scaling; its features include:

1. rolling updates

2. decoupling configs and secrets from code and mounting/changing config files easily

3. robust and predictable testing/production environments

4. centralized logging
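Item 2 of that list -- decoupling config from code -- might look like this sketch in Kubernetes (all names and the image are placeholders):

```yaml
# Config lives in a ConfigMap, not in the image; changing it
# doesn't require rebuilding or redeploying the code itself.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: example/app:1.0    # placeholder image
          envFrom:
            - configMapRef:
                name: app-config    # injected as environment variables
```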

Also, microservices' goal isn't really just "scaling," in my opinion; there are other important advantages even if you have no intention to scale. Aspects like modularity, separation of concerns, robustness, and lowering technical debt are just as important whether your app serves 1 or 10,000 users at the same time. Of course you can pull your Python app from your repo or even rsync it (just like you can develop any software without using git or any revision control), and just executing it might work very well, but sooner or later you are going to regret it if you're a business.

It was interesting to read about Workers and using WebAssembly within V8, as this scenario could bypass the need for that complexity and memory overhead while still combining different programming languages on the server side. Not that it could replace Kubernetes, as that is an amazing technology, but if you are in a scenario where your tech could fit within Workers, it could be interesting. https://blog.cloudflare.com/introducing-cloudflare-workers/ I was amazed to think WebAssembly would be used for that purpose, but I guess it does make sense after reading about how it is put together.

What bothers me about k8s is that it promises a lot ("15 years of experience of running production workloads at Google" at your fingertips! yay!) but it's in fact still a young, ever-changing solution.

Even developing an app locally with minikube is a PITA for a lot of reasons. From Helm to Telepresence to Skaffold, every tool out there is just unpolished and overambitious.

Don't want to imagine how those problems might amplify in production.

Skaffold is only 5 months old. It's a little unfair to call it unpolished and overambitious.

It's made by Google, which boasts about its 15 years of experience with containers?

I've been using Skaffold for about 3 months. It works well, and they're continually improving it. I'm not sure what else is expected from a new project.

