The full-time job of keeping up with Kubernetes (gravitational.com)
484 points by twakefield on Feb 1, 2018 | 223 comments

There ought to be a name for the tendency that, as tools get better and better, more and more of your time goes from having your mind in technical-space to social and news-space. It's like the authority to create goes from the individual first-principles (by necessity) maker, to the control over development being in the hands of an external group, and then all your time is spent keeping up with what they're doing. A similar thing happened with a lot of JavaScript frameworks. It also happened with the transition from building servers from the ground up, to it all being managed by AWS.

I wish I could give you more upvotes. You are describing a somewhat hidden psychology, which I think provides a rational basis for much of "Not Invented Here" psychology. We tend to think that "Not Invented Here" psychology is irrational, but in fact, the loss of control over possibly crucial technology is an important cost, which makes all of us stop and re-consider whether we really want to use some software developed by an external team.

And it's not only that - Time spent not doing our own designs (and instead spent memorizing how to use magical frameworks) is time not spent advancing our technical understanding.

It's a sad state when otherwise very intelligent people think it's bad practice to use plain C and a clean OS API that has been stable for decades (because that was somehow "magic" and impossible to understand and error-prone), and advise you to use the Boost filesystem module just to concatenate two paths.

> Time spent not doing our own designs (and instead spent memorizing how to use magical frameworks) is time not spent advancing our technical understanding.

I can't upvote this enough. I used to do mathematics, and there was the story of a professor who would hold up a book and say, "You should know everything in this book. But don't read it!" Which is to say, you have to go through the process of discovering mathematics to really understand it (maybe with a bit of guidance when you get really stuck). The skill of building complex software systems is no different.

Sure and those of us experienced in writing software tend to avoid reinventing the wheel (in a buggy, untested way).

I see the value in learning by building yourself, but from a software engineering point of view, using a tried and tested framework is likely to give you higher quality product in less time.

Well, the framework I was to use last (Qt) had bugs that simply can't be fixed by users (memory leaks, a double free leading to a segfault when exiting after reloading the QML engine) and immature modules (for example translation); it also forces complicated types and bad architectural decisions on the user and significantly increases compile times...

> using a tried and tested framework is likely to give you higher quality product in less time.

This is a common sentiment, but note that a framework has huge handicaps

- not knowing your business requirements and concepts

- must be suitable for many software projects that need features you'll never need.

- there is a clear maintenance boundary (framework vs your own code), which requires a complex interface with maintenance overhead and typically forces you to use concepts that don't really match your requirements

If you're experienced in the relevant domain it's almost always simpler to do it yourself / reuse your own work.

Writing your own Qt sounds like a long road, however, and would likely not be as fast as making workarounds or providing bug reports or patches.

I would never write my own Qt. Why would I? In this case I was creating a simple dashboard type application, and that can be easily done using fewer dependencies.

I don't think reinventing the wheel is the right approach. I think careful analysis of bugs and design deficiencies the likes of which you experienced before jumping in and using the framework is the way to do it.

This is why I absolutely love seeing hate-filled developer rants about technology with deep dive analyses and links to bug trackers. That's how I was saved from learning Ruby on Rails back in 2008 when most developers gushed about how awesome it was after reacting to its slick marketing and building a tiny website in 5 minutes.

What did you prefer instead?

I learned Django instead. I'm cognizant of its flaws but I'm still relatively happy with it.

I think the designers originally intended it to just be "rails for python" but they recognized the pain caused by rails magic and subsequently worked on making django less magical.

Concatenating two paths is difficult if you care about any of the following: security, multiple OS's, Unicode (multiple code units, validity, combining characters), file system restrictions, etc.

Your "two decades" maybe holds for Linux, but what about Windows or MacOS???!

I have seen too many people use string concatenation.

I think an intelligent person would recommend to use the normal library (appropriate for your language, assuming it is well written) since usually your program will be doing many other filename/path manipulations too.
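To make the contrast with string concatenation concrete, here is a minimal sketch using Python's standard `pathlib`. The helper names `naive_join` and `safe_join` and the example paths are made up for illustration, and the pure POSIX/Windows flavours are used so the result does not depend on the host OS. It only covers separators and escaping components, not the Unicode or file system restrictions mentioned above.

```python
from pathlib import PurePosixPath, PureWindowsPath


def naive_join(base: str, name: str) -> str:
    """Plain string concatenation: silently depends on a trailing separator."""
    return base + name


def safe_join(base: str, relative: str) -> str:
    """Join with pathlib and refuse components that would escape `base`."""
    rel = PurePosixPath(relative)
    # An absolute second component would replace `base` entirely,
    # and ".." could climb out of it, so reject both up front.
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"unsafe relative path: {relative!r}")
    return str(PurePosixPath(base) / rel)


if __name__ == "__main__":
    print(naive_join("/srv/data", "logs"))          # /srv/datalogs
    print(safe_join("/srv/data/", "logs/app.log"))  # /srv/data/logs/app.log
    # The Windows flavour picks the backslash separator on its own.
    print(PureWindowsPath("C:/data") / "logs")      # C:\data\logs
```

Whether rejecting `..` outright is the right policy depends on the application; it is simply the most conservative choice for a sketch.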

Exactly my point - that function doesn't deal with modern Windows. MAX_PATH is 260 characters. Yet Windows now supports longer paths (\\?\ prefix). So I presume there is another Windows function to combine long path names (and maybe canonicalise Unicode better).
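For readers who have not met the `\\?\` form: a rough sketch of how a program might rewrite an over-long absolute path into it. The function name `to_extended_length` is hypothetical, and the sketch only manipulates strings; it assumes the usual rules that the prefix applies to absolute paths, that UNC paths use the `\\?\UNC\` variant, and that forward slashes must be converted because prefixed paths are not normalized. Whether a given API accepts the result is not checked here.

```python
from pathlib import PureWindowsPath

MAX_PATH = 260  # classic Win32 limit, counted with the terminating NUL
EXTENDED_PREFIX = "\\\\?\\"           # the characters \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"  # the characters \\?\UNC\


def to_extended_length(path: str) -> str:
    """Return `path` in extended-length form if it would not fit in MAX_PATH."""
    # Prefixed paths skip normalization, so use backslashes throughout.
    path = path.replace("/", "\\")
    if path.startswith(EXTENDED_PREFIX) or len(path) < MAX_PATH:
        return path
    if not PureWindowsPath(path).is_absolute():
        raise ValueError("extended-length paths must be absolute")
    if path.startswith("\\\\"):
        # UNC form: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_UNC_PREFIX + path[2:]
    return EXTENDED_PREFIX + path
```

Short paths pass through untouched, so a helper like this can sit in front of every file operation without changing behaviour for the common case.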

> Yet Windows now supports longer paths (\\?\ prefix).

I know. And this is the attitude that leads to complex software. It's a self-fulfilling prophecy: "I can't do it on my own since the problem is so complex". In which case the problem does get complex.

I recommend not wasting time supporting this crazy feature (unless you are writing infrastructure code for tools and you're required to - in which case I'm sorry). 260 character paths are more than enough for any project. And I certainly recommend against using the crazy abstractions from Boost::filesystem.

(Personally I think it's an unfortunate example. I'd prefer to avoid paths from the start, since nobody understands the semantics of hierarchical filesystems. Alas, typically you need to deal with them to some degree).

It is not a crazy feature.

It is a necessary feature for many development environments originally written for Unix (e.g. nodejs), because Unix file systems usually don't have such a low character limit.

Or maybe Windows has mounted or network sharing, a Unix filesystem... In which case your program better deal with long paths (or just fail?)

Really... You are showing exactly why one uses a library so one doesn't need to care about the "complex" details because one hopes the library does a good job of managing it as well as possible for you (although you still need to know the details to use a library function safely).

> Or maybe Windows has mounted or network sharing, a Unix filesystem... In which case your program better deal with long paths (or just fail?)

260 bytes is long. I've never seen a \\?\ path in the wild, and I don't want to. It's a misdesign (well, I'm sure the designer didn't want to design it...). But I'm repeating myself... I can't practically deal with paths longer than, say, 64 characters (can't read them, let alone type them). Fix your paths.

And again, this is off-topic. This thread was about not using a crazy third party library, when the authoritative semantics (however insane) are contained in the OS interface.

That is such an awful toxic viewpoint to take.

"Fix my paths"!? Fix your software! I work from a network share that is about 70 chars long to get to my one project's folder. The beauty of hierarchical file systems is that I can cognitively ignore all the previous paths and just work from there. However when your software doesn't work because "the solution is a misdesign", it's not my path that is wrong, it's your software that is broken and you that are stubbornly refusing to use solutions to well known problems.


>Still supported by 260-byte filepaths.

Because programs that thought "260 is enough" now only have 190 left, and I've seen (and had software not work with) several pieces of software that have more than that. Because the assumption is that this is a solved problem on all OSs, and that 260 is no longer a hard limit.

>Asking nevertheless: Why? The path to the network share should be easy to abbreviate.

Because that doesn't actually solve the problem. I can abbreviate it on my connection, but not on the server, and if the server is running some poorly written software, it will choke when from my point of view I'm only using 190 characters. As for why, it's segmented by country/company/department/team/name/employeeid/

Then it is my folder, where I have things like clientInformation/[softwarePackage]/[client]/[testFiles|documentation|uniqueArchitecture]

Just the example without any real names is over 100 already. I just went and looked, I was being conservative with my 70-character estimate, a few of my folders are over 180 characters long just to the [client] part of my path.

Also why abbreviate it when we can write software that can handle that for us? Why should I need to shorten my paths to potentially confusing and misleading names just so some people can ignore solved problems in the name of "complexity". Not to mention that abbreviation isn't really a "solution" (what happens when there are collisions? Now you need a complicated "maintained and interpreted by humans" system to manage it, that's not simpler, that's more complex).

I'm honestly not going to respond to the rest, because we both know that those are strawman arguments.


But this is solvable! You are the one claiming it's too complex then refusing to use the well tested solutions...

>You have insanely long paths, and that's still miles away from a reasonable limit (256 bytes) but still complaining.

Because that's just the path to a folder, that still doesn't contain any data! Unpack an archive from a server in one, and suddenly poorly written software is blowing up because someone used a GUID as a folder name, or there is software running there that assumes "why would paths need to be less than 60 characters!?" and nests files into a bunch of folders using the filesystem as a tree of hashes.

I think the problem here is that these aren't imagined problems, they are problems I've run into in the past year or so. The difference between this and your strawman arguments is that there are solutions to all of those problems as well, but for some reason you don't see them as superfluous.

I don't know why I let myself get roped into a pointless internet argument. "You clearly lack experience with low-level issues"? Man I'd love to see how you determined that from a few comments about file paths...

> Your "two decades" maybe holds for Linux, but what about Windows or MacOS???!

Is my target application going to run on Linux? If so, then I do not care about Windows or MacOS, just like I do not care about CBM64 or Amiga.

My goal is not to write an awesome system that wins design awards in handling of obscure edge cases thrown at it in a coffee shop by hipster developers. My goal is to build a system that solves my business problem.

...and since you wrote straightforward code not relying on obscure platform-specific technicalities, even if you wanted to port it later, the port of the filesystem code was just a few #ifdef's away.

> Time spent not doing our own designs (and instead spent memorizing how to use magical frameworks) is time not spent advancing our technical understanding.

That's a false dichotomy. The time it takes to "memorize a magical framework" is far less than the time it'd take to learn how to write code to do what the framework does. Consequently you can learn a framework and some other technical understanding in the same time it'd take you to only learn enough to implement your own version of the framework. In most circumstances that's actually more beneficial to do that. You'll be further forward in your understanding of the technical stuff.

You're also assuming that frameworks are written by individuals. They're not. In the case of some large frameworks it'd be practically impossible to implement what they cover on your own. You simply can't learn the underlying principles and then implement them all in code yourself.

It's definitely worthwhile learning the basics of the languages you use, and you should be working on things that improve your code and understanding as much as possible, but it's very likely in most cases that will mean building on top of someone else's existing code rather than implementing everything yourself.

These are still claims without any context. It mostly depends on how experienced the programmer is. And most importantly, note that one never needs all the functionality from a framework. Typically it's only a very small part, and often the existing functionality in the framework does not match the requirements 100%.

Very often the match is good enough that it pays off to slightly align the requirements instead of patching the framework. The amount of man-hours poured into the very core parts of Rails, for example, just processing and dispatching requests, safely decoding the input from a webserver to a useful set of parameters, routing the request to the proper handler and rendering and returning the response, is huge. Certainly, you could take something slightly more modular, such as Padrino, but that’s still a mind-melting amount of code if you look at all the libraries and dependencies.

You could reimplement most of the basics, but that would be months or years of work and probably still buggy as hell. I’ve seen my share of “oh, we’ll just build our own framework” and they all turned out to be much more complex than the initiator expected.

I would think this claim is self-evident

> In the case of some large frameworks it'd be practically impossible to implement what they cover on your own. You simply can't learn the underlying principles and then implement them all in code yourself.

Not even DHH would claim to have been able to build the whole of Rails by himself.

I would disagree about that.

Learning Django taught me a lot about the proper way to do things. When I first started using it I implemented a lot of my own stuff myself (as I wasn't aware that the framework had certain features). I basically wrote my own equivalent of class-based views before I understood Django's own. (http://ccbv.co.uk/ helps understand them a lot)
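For anyone unfamiliar with the idea, a home-grown "equivalent of class-based views" tends to look something like the sketch below: a base class that dispatches to a method named after the HTTP verb. This is plain Python written for illustration, not Django's code; the `Request`, `View` and `BugListView` classes and the `(status, body)` return shape are all assumptions.

```python
from typing import Tuple


class Request:
    """Tiny stand-in for a framework request object (hypothetical)."""

    def __init__(self, method: str, path: str):
        self.method = method
        self.path = path


class View:
    """Dispatch a request to the method named after its HTTP verb."""

    http_method_names = ("get", "post", "put", "delete")

    def dispatch(self, request: Request) -> Tuple[int, str]:
        name = request.method.lower()
        # Only look up handlers for verbs we know about.
        handler = getattr(self, name, None) if name in self.http_method_names else None
        if handler is None:
            return 405, "Method Not Allowed"
        return handler(request)


class BugListView(View):
    """A concrete view only implements the verbs it supports."""

    def get(self, request: Request) -> Tuple[int, str]:
        return 200, f"listing bugs for {request.path}"


if __name__ == "__main__":
    print(BugListView().dispatch(Request("GET", "/bugs")))
```

The point of the comment stands either way: once several apps each carry their own variant of this, new developers have to learn each one instead of the framework's documented version.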

Reinventing Django's class based views is something that I have seen in a few inherited apps (like where I currently work).

If your application is going to be something long-lived and gets a number of developers while in maintenance mode, then memorizing a standard framework is a good thing - new developers should be familiar with how a framework works - as opposed to needing to spend time going through someone else's home-grown code.

Then there is the issue that writing your own code will be untested relative to a framework that has some level of popularity.

Documentation is likely to be better with a framework as well.

(This is my perspective coming from a Python / Django background - I have noticed that JavaScript frameworks and libraries often have a lot more problems).

The most valuable skill is knowing when to externalize your tools. You don't always want to reinvent the wheel every time you need something, when you have deadlines to consider.

Exactly this. There's a ton of gray here. Have to pick your battles as best as you can.

Another example is game engines. It's hard to deny the value that engines like Unreal and Unity provide. They are hard to ignore and have thousands of expert hours put into them.

A good way to pick battles is not to fight most of them. There are so many solutions that don't have any real problems...

Could you provide an example of solutions without real problems, game engines or otherwise?

Basically anything overengineered? Concrete example: C++ "manager" objects. Why not finally learn how to structure applications and keep allocations and resource use in check? The program will be so much simpler, compile quickly, be easy to understand (execution threads stop jumping around like crazy), and typically have fewer memory leaks / use-after-free etc.

RAII, garbage collectors and other fancy inventions for freeing resources from the call stack automatically? Not needed. Use global resource managers. Globals are needed and semantically the right thing. Not a problem.

Any crazy programming language with all the features they could conceive? The one you need is guaranteed to be missing. Better, express your problem yourself and write a simple generator script that translates your concepts (mostly as plain old data) to the minimal language.

Another: XML (and even JSON), possibly for anything except true document markup. It's slow and doesn't buy us anything. Yes, there are ready-to-use parsers, but after parsing you still hold a mess in your hands (and only strings / floats). Finally learn database basics and manage data in tuples. Make a text file parser that takes simple table (= list of column types) descriptions and then parses tuples from single lines and puts them as typed data in arrays of structs, done!
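The table-driven text parser described in the last paragraph can be sketched in a few lines. This is one possible reading of the idea, in Python; the tab-separated layout, the function names `make_parser` and `parse_table`, and the example `Bug` table are assumptions made for the sketch.

```python
from collections import namedtuple


def make_parser(table_name, columns):
    """Build a line parser from a table description.

    `columns` is a list of (column_name, type) pairs; each type is a
    callable such as int, float or str that converts one text field.
    """
    Row = namedtuple(table_name, [name for name, _ in columns])
    converters = [conv for _, conv in columns]

    def parse_line(line):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(converters):
            raise ValueError(f"expected {len(converters)} fields, got {len(fields)}")
        # Convert each text field with the type declared for its column.
        return Row(*(conv(text) for conv, text in zip(converters, fields)))

    return parse_line


def parse_table(parse_line, lines):
    """Parse every non-empty line into a typed row."""
    return [parse_line(line) for line in lines if line.strip()]


if __name__ == "__main__":
    parse_bug = make_parser("Bug", [("id", int), ("title", str), ("hours", float)])
    rows = parse_table(parse_bug, ["1\tcrash on exit\t2.5\n", "2\tslow startup\t0.5\n"])
    print(rows[0].title, rows[1].hours)
```

The rows come out as typed tuples, which is the "arrays of structs" shape the comment asks for; escaping tabs inside fields is deliberately left out.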

How does a "global resource manager" work and in what way is it better than RAII and garbage collectors? Googling wasn't helpful.

Just global state. Simple example, a global variable that holds an allocated memory buffer. You initialize it at program startup and tear it down at the end (you can be sloppy and leave out the latter). While the program is running, you re-allocate as needed.

This is better in that it really is a very simple thing to do. There are no possible memory leaks. But what happens is much more explicit - you have precise control, and no bumpy control flow (much nicer for debugging).

But it doesn't have to be memory. If you want to look at my project https://github.com/jstimpfle/learn-opengl (careful - it's not tidied up. But I think it demonstrates my point), most resource state there is global module-wide state. I simply have init_module() and exit_module() pairs that I call from the main function. Problem solved. Not a headache at all.
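A rough illustration of the `init_module()` / `exit_module()` pairing described above, written in Python only for brevity. It is not taken from the linked project, the buffer and the `ensure_capacity` helper are invented for the sketch, and in Python the memory is still reclaimed by the runtime once the reference is dropped, so this shows the structure rather than manual allocation.

```python
# Module-wide state: one buffer shared by every function in this module.
_buffer = None


def init_module(initial_size: int = 1024) -> None:
    """Allocate the global buffer once, at program startup."""
    global _buffer
    _buffer = bytearray(initial_size)


def ensure_capacity(size: int) -> bytearray:
    """Grow the global buffer in place when a caller needs more room."""
    if _buffer is None:
        raise RuntimeError("init_module() has not been called")
    if len(_buffer) < size:
        _buffer.extend(bytes(size - len(_buffer)))
    return _buffer


def exit_module() -> None:
    """Release the global buffer at program shutdown."""
    global _buffer
    _buffer = None


def main() -> None:
    # The main function owns the lifetime: init first, exit last.
    init_module()
    try:
        buf = ensure_capacity(4096)
        buf[:5] = b"hello"
    finally:
        exit_module()


if __name__ == "__main__":
    main()
```

All setup and teardown happens at two visible points in `main`, which is the explicit control flow the comment is arguing for.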

Is that in support of using the Boost filesystem API?

IMO, only if you're using g++ or older versions of the standard. MSVC, clang and ICC have all supported the experimental::filesystem module for years at this point.

I don't use C++ outside of games programming, I'm not familiar with the Boost filesystem API.

It depends on what you're doing with it, I'd guess.

The number of security exploits due to memory corruption in software written in C over the last 50 years shows that it isn't that well understood.

Recently, I spent some time trying to run several machine learning jobs simultaneously across AWS machines. This was a fairly simple use case: all the jobs were totally independent of each other, you could run them by calling a Python function with different parameters.

I'm mostly a stats guy and not much of a programmer. I got a hacky, do-it-myself version using Python scripts up and running with about two hours of work, and learned about Python threads for the first time in the process, a tool that I can reuse in many different areas.

I then tried to do this the "right" way, which according to Amazon is to use Docker and the AWS tools ECR, Batch, and SQS. It took me about 10 times as long to get that working. Yes, this offers much, much more functionality - but most of it is stuff I didn't need. The only real gain I got was my models running about 20% faster, and the knowledge I learned is ephemeral.
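The do-it-yourself approach described above, calling one Python function with different parameters from several threads, might look roughly like this. `run_job` is a hypothetical placeholder for the real training call and the parameter names are invented; the sketch assumes each job mostly waits on something outside the interpreter (a remote machine, or native code), since threads do not speed up CPU-bound pure-Python work.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(params: dict) -> dict:
    """Hypothetical stand-in for training one model with one parameter set."""
    # A real job would start training (for example on a remote machine)
    # and block until it finishes; here we only compute a dummy score.
    score = params["learning_rate"] * params["depth"]
    return {"params": params, "score": score}


def run_all(param_sets, max_workers: int = 4):
    """Run every independent job on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, param_sets))


if __name__ == "__main__":
    grid = [{"learning_rate": lr, "depth": d} for lr in (0.1, 0.01) for d in (2, 4)]
    for result in run_all(grid):
        print(result)
```

If the jobs were CPU-bound in Python itself, swapping in `ProcessPoolExecutor` from the same module would be the usual adjustment.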

I also like to ask myself, "Do I want to become an expert at working with external software X, or do I want to become the kind of person who can build software like X?"

That's great. However, how does the time to solve the business problems fit in?

That's a great heuristic to keep in mind, thanks for that.

Worth dropping a link to this Joel Spolsky article, where he discusses this concept and talks about the fact that the Excel team (in the 1990s) had their own C compiler:


I am glad I read this just now; I was jumping from hoop to hoop, and not only hoops but craft paradigms. It reminds me of the Fred Brooks idea "There is no silver bullet".

Brooks argues that "there is no single development, in either technology or management technique, which by itself promises even one order of magnitude [tenfold] improvement within a decade in productivity, in reliability, in simplicity." He also states that "we cannot expect ever to see two-fold gains every two years" in software development, as there is in hardware development (Moore's law)

(from Wikipedia)

Glad it helped. Brooks should be required reading for every developer: software is hard.

Sure, NIH syndrome can be rational for technology that is crucial/central to your system.

The problem is it is often used to justify re-inventing even mundane stuff. I once worked with a client who wanted to implement their own bug tracking system. The client’s main product was something totally unrelated.

For 99% of companies, including the ones whose employees post here, containers and container orchestration are resume-driven development by devops/engops/SRE/senior devs, caused by improper hiring, improper vetting of ideas and personnel, and fixation on cargo cults.

It is no different from cargo culting in sales, where a perfectly functioning company that is slowly but surely building up brings in a CRO (can you imagine this showed up as a title?!) who says "We will sell differently! Give me account managers! Give me sales development representatives! Give me customer happiness coordinators and we will sell to enterprise accounts at 50x contract value!" So the company hires a hundred people in those roles and it does look like the contracts are creeping up. So the CRO says "I know! The issue is that we are spending too much time on paperwork. Hire me salesops! And give me all these Salesforce integrations! And special IT people reporting to me operating these new tools!" So the company hires more people for those roles, burning through millions of dollars in salaries, but at the end... the new customers are still just a trickle.

Eventually the CRO gets fired and most of the people who got hired on that push are gone as well, after millions of dollars have been spent. If they had spent those millions on Google ads or Facebook ads they would definitely have gotten more revenue, but plain Google ads and plain Facebook ads are not sexy.

If even mundane stuff keeps breaking your workflow and demands all of your time to keep up with, then all the more power to you for reinventing it.

All the better if it is simple.

Bug trackers are one of those areas where a lot of companies should build their own. Everyone has their own workflows and information to capture, and ends up either conforming their process to the bug tracker or spending more time configuring the bug tracker than they would to build a new system from scratch. Those uber-configurable systems always suck to use.

Ones like JIRA can take weeks to set up for your org and include their own query language. All of this complexity just to do something so simple is not rational.

As long as you don't over-engineer it, it's only a day's work to get something up and running, and it's a great project for interns.

> Bug trackers are one of those areas where a lot of companies should build their own.

This sounds insane. I have not worked at a place where the workflow was so holy and important that it couldn't be captured in a near default JIRA install.

Most of the customization asks I've seen with JIRA come from dysfunctional organizations that demand new swimlanes like "QA" and "product approval" and "spec design".

> I have not worked at a place where the workflow was so holy and important that it couldn't be captured in a near default JIRA install

JIRA out-of-the-box is surprisingly sane - most people who hate JIRA really hate the custom workflows their own organisation has inflicted on them.

Making a serious bug tracker is hundreds, more like thousands of man hours. You’re saying that instead of investing 50 hours into configuring Jira properly, it’s better to do that? Even with intern work, it doesn’t make sense, especially since interns will be gone tomorrow ;)

I think you're either overcomplicating the requirements or overestimating how long it takes to throw together a system with half a dozen tables and a dozen views/forms. Or underestimating just how far you can get with such a simple system. Then you've got something tailored to your workflow.
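To give a feel for the scale being argued about, here is a minimal sketch of what the core of such a home-grown tracker could look like, using Python's built-in `sqlite3`. The schema, table names, status values and helper functions are all invented for illustration, it covers only four of the "half a dozen" tables, and it says nothing about the forms, permissions or reporting that the replies point to as the real cost.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE bugs (
    id          INTEGER PRIMARY KEY,
    project_id  INTEGER NOT NULL REFERENCES projects(id),
    reporter_id INTEGER NOT NULL REFERENCES users(id),
    assignee_id INTEGER REFERENCES users(id),
    title       TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'open'
                CHECK (status IN ('open', 'in_progress', 'closed'))
);
CREATE TABLE comments (
    id        INTEGER PRIMARY KEY,
    bug_id    INTEGER NOT NULL REFERENCES bugs(id),
    author_id INTEGER NOT NULL REFERENCES users(id),
    body      TEXT NOT NULL
);
"""


def open_tracker(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the tracker database and install the schema."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def file_bug(conn, project_id: int, reporter_id: int, title: str) -> int:
    """Insert a new bug in the default 'open' status and return its id."""
    cur = conn.execute(
        "INSERT INTO bugs (project_id, reporter_id, title) VALUES (?, ?, ?)",
        (project_id, reporter_id, title),
    )
    return cur.lastrowid


def open_bugs(conn, project_id: int):
    """List (id, title) for every open bug in a project."""
    rows = conn.execute(
        "SELECT id, title FROM bugs WHERE project_id = ? AND status = 'open' ORDER BY id",
        (project_id,),
    )
    return rows.fetchall()
```

As written, `open_tracker` expects a fresh database, since the schema uses plain `CREATE TABLE`.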

> Even with intern work, it doesn’t make sense, especially since interns will be gone tomorrow ;)

But there will be a new batch sooner or later that can extend and update it.

Most places only need a fraction of the functionality of Jira.

Most developers underestimate the complexity of any project :)

Especially one that becomes a crucial component of a company's workflow (as is the case for bug trackers in any properly run software company).

I would say it's more the case that developers unnecessarily make projects more complex than needed. That has certainly been true of the last couple of projects I have inherited.

This is why BPM systems exist. To take all of those disparate “Good enough” tools that don’t fit your workflow and make them fit your workflow.

Really a tech stack that doesn’t get nearly enough attention IMO.

I don't think that's the whole picture. You could copy an interface (and even implementation) and fork it. But often with NIH we see a reinvention of a technology without even looking at alternatives; often this ends badly (in Linux land much more often than not it seems).

The problem is that reading (and learning from) code is hard. The other problem is that there is so much bad code out there (most of my own is certainly not an exception) that it gets even harder.

And it's not about the code. Writing code is easy. It's about finding the right problems first. And then it's about finding the right abstractions. The easiest way to build a clean conceptual world is to start with a clean slate and ask yourself before introducing new code, "does this code solve a concrete problem? Do I really need it?"

Unix had simple and clear ideas, and it has many reimplementations (NIH!), and most are not that bad - are they?

I gave him mine on your behalf :)

A description of it -- not a name for it -- is corporate control of an open source project. Trying to keep up with Angular or Kubernetes or Go is not a problem at the Google from whence they come, because the technologies are primarily used by, and hence primarily designed for use by, teams, not individual developers.

A team's brain can schedule a time where part of it is training or studying while the rest of it is making progress on the task at hand. A team's brain can resolve ambiguity and complexity more easily because it has multiple human brains and those brains have different strengths and experiences.

A single developer can't duplicate a brain. A small team of developers can't replicate the larger multi-team brain of a Google org-chart. Google org-charts are the context for which its tools are designed. The same is true of other corporate open source technology bundles like React and AWS.

People kick themselves for not grokking technologies like Kubernetes without realizing that it wasn't really designed for their use case, isn't documented to be easy to pick up, and isn't managed with the cognitive load of individuals in mind. The time scales around which these technologies are designed are FTE-months or years.

If a two pizza team can get up to speed in a month, it means an individual will probably take about a year. Given equal cognitive efficiency.

I disagree on Go.

Go is well designed and takes upgrading into consideration. Upgrades very rarely break anything. You should just be able to recompile that 1.3 code on 1.8, etc.

On the other hand, dependency management in Go in any sane way is such a byzantine task that it takes ages to get working, and even then it still doesn't.

Given an empty set of environment variables, setting up a new Go project with dependencies and all in a way that people cloning it later on can also use the dependencies in a single command, without vendoring dependencies, without building custom scripts, is impossible.

It's impossible with first party tools, but I've had a great experience with Glide.

That said, Go drives me up a wall. Using interface{} to muck with JSON and typecast every step of the way is just my nightmare. It's inelegant in every way.

Type cast every step of the way? Methinks you are either doing something wrong or dealing with unstructured json.

The pain of dealing with unstructured json/mapping types is exactly what I'm referring to. :)

That said, I don't think the language as a whole lends itself towards elegance - that's pretty much the opposite of their goals. I don't agree that their model for simplicity is the only (or best) model for simplicity is all.

I concur with the sentiment.

The risk is still there for dependencies, but it helps that the community for the most part follows "a little copying is better than a little dependency" as an adage.

These are excellent points and very interesting food-for-thought, the 'organizational thinking' that leads to the tools being terribly complex and steep to learn for individuals.

But don't these tools have product managers, and don't product managers create tools to serve customers?

The only cynical viewpoint that I can think of for artificial (or lazy) complexity is consulting dollars.

Google does consulting now?

They do, but I think most of their consulting is driven by a desire to enable other organizations on their tech (hence buying more).

So definitely more helpful than the "obfuscate + delay = profit" model of most consulting agencies.


If you spend enough they always have.

This has been my growing worry as I have watched the ongoing changes in the Linux ecosystem in recent years.

It feels like there is an echo chamber happening, where a small group of people working for maybe 2-3 companies have decided that their vision is the correct vision.

And everyone that says otherwise are haters and fossils.

Well, hundreds of medium and large companies now rely critically on Linux so it was expected that slowly they will cooperate to steer its further evolution.

People do respond that the whole codebase is open source and hence if some really bad change happens the community can always fork from the last known good commit. But this ignores a subtler issue: soon enough so many of the critical subsystems and so many of their contributions will come from individuals directly or indirectly working for corps that if/when those subsystems start adversely impacting the original Linux philosophy, stripping them all out and starting over again would be not much better than abandoning the whole platform.

This is exactly why alternative efforts that seek to maintain working and up-to date alternative init systems, display managers and desktops are critical to the long-term health of Linux.

Linux thrives on multiplicity of choices, which is anathema for corps who always want to consolidate. But that road eventually will weaken Linux for the general community and varied use cases.

Joel Spolsky talks about this (as a somewhat adversarial tactic, even if it's not purposely so) in "Fire and Motion"[1]

"Watch out when your competition fires at you. Do they just want to force you to keep busy reacting to their volleys, so you can’t move forward?

Think of the history of data access strategies to come out of Microsoft. ODBC, RDO, DAO, ADO, OLEDB, now ADO.NET – All New! Are these technological imperatives? The result of an incompetent design group that needs to reinvent data access every goddamn year? (That’s probably it, actually.) But the end result is just cover fire. The competition has no choice but to spend all their time porting and keeping up, time that they can’t spend writing new features. Look closely at the software landscape. The companies that do well are the ones who rely least on big companies and don’t have to spend all their cycles catching up and reimplementing and fixing bugs that crop up only on Windows XP. The companies who stumble are the ones who spend too much time reading tea leaves to figure out the future direction of Microsoft. People get worried about .NET and decide to rewrite their whole architecture for .NET because they think they have to. Microsoft is shooting at you, and it’s just cover fire so that they can move forward and you can’t, because this is how the game is played, Bubby. Are you going to support Hailstorm? SOAP? RDF? Are you supporting it because your customers need it, or because someone is firing at you and you feel like you have to respond? The sales teams of the big companies understand cover fire. They go into their customers and say, “OK, you don’t have to buy from us. Buy from the best vendor. But make sure that you get a product that supports (XML / SOAP / CDE / J2EE) because otherwise you’ll be Locked In The Trunk.” Then when the little companies try to sell into that account, all they hear is obedient CTOs parrotting “Do you have J2EE?” And they have to waste all their time building in J2EE even if it doesn’t really make any sales, and gives them no opportunity to distinguish themselves. It’s a checkbox feature — you do it because you need the checkbox saying you have it, but nobody will use it or needs it. And it’s cover fire.

[1] https://www.joelonsoftware.com/2002/01/06/fire-and-motion/

This post made me nostalgic for that period which started in 2002-2003 (just after the dot-com crash) up to 2008-2009 (when FB and Twitter emerged and when Google "changed" its skin to a full ad company), when everything worth reading was being published on blogs, when people (myself included) still believed in the open web (we were all very busy bashing SOAP) and when it wasn't all about IPOs and earning obscene amounts of money (it's also the period when this website was put online and when its founder was just a respected blogger, LISP-er and former Yahoo employee). Good times.

Absolutely agree. It's been ages since I used my RSS aggregator to read blogs the way I used to.

Hell is an undocumented SOAP interface.

I still have nightmares about trying to grok it.

Some of this is caused by lack of backwards compatibility too. In the past, once you picked a library you could be pretty sure that the APIs you depend on wouldn't change much, and then only with a major release which would come with adequate documentation detailing the important changes.

These days the throw-away and rewrite/refactor crowd is in such power that if you don't spend 1/2 your time tracking the commits, you're likely to find that your dependencies are entirely incompatible with whatever you have built on them, and you have no idea how to fix that short of reading the last 6 months of mailing-list (whatever) postings just to change the "runlevel" or get your service to start...

The dependency trees in many projects are out of control too.

It was part of the reason I wanted to move on from my last position. We were spending so much time chasing infrastructure dependency changes that it began to feel like a treadmill that was slowly increasing speed. At some point I got tired and couldn't find the heart to debug yet another vagrant up failure, knowing that fixing it will probably break something else, and the actual project work was just as likely to suffer from the same issue.

At home I keep it simple, and my advice to a sole developer or small team looking at some of these infrastructure platforms and tools thinking they need them: these tools are not for your use case.

You don't need vagrant, docker, kubernetes to manage a 4 person dev team and you are just burning hours and building the chain of brittle tooling that will be your biggest pain point until you eventually hire a sadist/devops guy.

> until you eventually hire a sadist/devops guy.

I try to drop in and follow the devops scene from time to time; after all, a lot of my work ends up getting handled by this stuff, and sometimes you need to fix things yourself. From my point of view, listening to a devops talk is listening to a long monologue consisting of a chain of mostly food-related, seemingly unconnected English words ("Chef Cucumber Puppet Jenkins Salt") that somehow "run" on top of one another. None of them are modular, none of them are interchangeable, there are no standards, whatever you write for one of the food-related items will not work with another food-related item, or the same food-related item of a different vintage. The shelf-life of the food-related items seems to be about 3 years.

From an outsider's perspective it feels that, aside from Nix, there has not been any theoretic or standards progress in this field since Mark Burgess' time.

You've given an insightful assessment of the orchestration space -- a modular DIY treadmill nightmare...

Currently it seems Kubernetes is emerging from the fog as a de-facto standard to eliminate all those uncertainty points and provide a common toolset. It's the only orchestration platform that's a turnkey installation and a turnkey managed service on all 3 major clouds. VMWare is about to open k8s up to the Enterprise market too with an integrated & managed on-prem service [btw VMWare employees: your PKS blog and cloud blog say this is happening mid december 2017... status update, maybe?...]...

With a shared deployment pane across the cloud and on-prem, wrapped in drag-and-drop tools from the major VM suppliers and sprinkled with "self-updating" magic and QOL improvements in the next few years, I believe we're at a watershed moment for a shared standard :)

Google is pretty good at tricking the industry into training techs for their internal stack.

> btw VMWare employees: your PKS blog and cloud blog say this is happening mid december 2017... status update, maybe?...

I'm at Pivotal, we're working on it with VMware and Google. If you want to kick some of the tires, start playing with CFCR, because that's a major piece of PKS.

> From an outsider's perspective it feels that, aside from Nix, there has not been any theoretic or standards progress in this field since Mark Burgess' time.

Because there wasn't much progress indeed. IT operations is a field advanced by programmers, and most programmers have very little experience with (or even interest in) system administration, so the progress of the field is mainly governed by clueless outsiders oblivious to all the tools and mechanisms already in use. That's why it's a huge pile of mess.

The thing that kills me is the preconceived "we have to use $TOOL" behaviors. Let's use docker as a deployment tool, where we are putting one container on each VM... Combined with the complete ignorance of packaging. 99% of the problems I hear people trying to solve with docker would be better solved with a 'postinstall' hook in a package called 'my_configuration' (it helps you with configuration versioning and in place upgrades!!!!). Combined with the complete lack of understanding of what something like kickstart/AutoYaST can do.

It seems that in a huge number of cases, it's possible to trim out the vast majority of the layers of deployment/etc crap with just a bit of KISS and using tools which are already installed...

Interesting! Do you have any recommended books or tutorials on this approach?

You mean something like your distribution's documentation on building packages?

More like a "how-to and why for 99% of Docker use cases" tutorial I can point people to next time I hear the words "we need to use Docker."

The short story version of this was written by Arthur C Clarke in 1951, "Superiority" http://www.mayofamily.com/RLM/txt_Clarke_Superiority.html

Fantastic, thanks.

Thanks for that! Good story!

I was just thinking the other day about "finished software". These days, those Unix philosophy tools of doing one thing well and leaving small solved problems alone are becoming seemingly fewer and fewer.

There are definitely client app trends away from the Unix philosophy... I would argue that's a product of the success of the Unix philosophy, though. New apps are developed in a world where 'curl' or 'grep' exist, they can move on to more specific needs.

On the platform front I believe this philosophy has recently 'won the war': Microsoft was forced to create Windows Subsystem for Linux (WSL) as a compatibility layer to access exactly that rich tool ecosystem and its server & production oriented workflow... Cross platform means "not windows", even on windows.

Up a few abstraction levels though, and we can see that philosophy dominating in cloud space... "Micro-services" are API enabled single-use tools focused on doing 'one thing well', yielding coordination responsibility to higher level applications and assuming as little as possible about their end-use. Tool silos to support new unforeseen use-cases.

Even more emblematic of the Unix philosophy in cloud space is the emergence and growing popularity of "serverless" solutions: hyper focused single-use tools directly integrated into the computing environment. A single function, pumping between cloud services or transforming some text, "freed" from infrastructure.

In days of old we had to build up the foundations -- text manipulation, process stats, diff capabilities -- but based on that amazing ecosystem the new generations of those tools are freed to focus on new problems and speak a more abstract and higher level language -- APIs, event queues, NLP services... Just like how some primates started grunting, and then they grunted numbers, and then they made satellites and then conquered mars with robots. Shoulders of giants, and all that :)

I dunno.

WSL seems to be more about the success of the Linux kernel API than any philosophy.

Indeed. The cathedrals won.

I would posit(x !) that it's because it is easier to form a community and 'cost of entrance' around a megalith like Kubernetes, than around individual tools that do 1 or a few very similar things well.

Then again, I would say it's time to start looking away from "handling text streams", to something that can handle data streams of various formats, including transcoders that can convert from one type to another.

I think people have a limit of the number of "things" they can learn too. One big thing is one thing, 10 tiny things are 10 things. 10 things you have to learn, and then figure out the optimal (or tolerable) way for them to all work together.

The original unix utilities had the benefit of (compared to now) being in a relatively simple environment, and a (much more) relatively small developer community, and a clear shared architecture understanding between and among utility and systems designers. They almost end up being more different commands or subsystems of one 'thing' than a 'thing' of their own. This is very hard to do. Unix succeeded because it was successful at it.

Once, for instance, you say "oh yeah, we want just like this, except not just text streams", as you suggest -- it gets even harder. Especially in today's environment which is not that of the unix origin.

I don't necessarily mind relying on unfinished software; but it is tiring to rely on something that doesn't appear to be on the path to being finished. Projects with a complicated upgrade cycle and a rapidly moving upgrade treadmill are definitely a no-go.

> It's like the authority to create goes from the individual first-principles (by necessity) maker, to the control over development being in the hands of an external group, and then all your time is spent keeping up with what they're doing.

This is a fundamental economic question: buy or build? Buying involves search costs ("all your time is spent keeping up"), building involves the cost of ... building.

Often the line shifts because of gains from trade/specialisation and the deepening structure of production. As products become more featuresome, it becomes more economical for producers to specialise in part of the problem. They become better at that part than others are.

It quickly becomes impossible for any single producer to out-produce the combined output of specialists.

As a side note, the tension between the cost of searching and the cost of doing it yourself is believed to be why firms can emerge out of "pure" markets. It also suggests an economic reason for why software projects grow at their margin and another reason for why NIH is so attractive.

Same thing in deep learning. I already had to dump half of my code because of theano, as well as api changes in keras and tensorflow that made it a pain to load some of my old models in newer versions of the frameworks.

Now I'm rewriting a bunch of it in pytorch while anxiously waiting for 0.4 to come out (no idea when) and break all of my models again.

> There aught to be a name ..

I think that "vendor lock-in" comes close to describing the situation. However, it does not describe the way the dependence changes the behaviour of the people locked in. I think that "addiction" fits in this case.

That's not lock-in, it's just a dependency.

It would be lock-in if the vendor made it unreasonably difficult to remove the dependency.

I think a lot of this is that once a tool reaches a certain level of size and complexity, it’s impossible to know all the technical details unless you’re actually a developer on the project (and after another threshold, not even then). So at that point you have to use news and social information to find out what’s going on, and just trust that people know what they’re doing.

It can be frustrating, but it’s also unavoidable. While I strive to know as much as possible about my tools, if I had to know every part of the stack backwards and forwards, I’d never get anything done. You do have to cede to the abstractions at some point.

The important distinction here is the pace. I can use a Ubuntu server LTS version and not have to stay on top of weekly development release announcements from upstream.

With k8s lacking LTS, it forces you to drink from the firehose.

In a sense LTS is provided by vendors such as Gravitational and Red Hat. They introduce hysteresis and smooth out the bumps on the upgrade cycle.

> It can be frustrating, but it’s also unavoidable.

Why unavoidable? It's not that one is forced to use some tech... Even though the hyped crowd chants the names so loudly.

A lot of things that complex technology offers are not really needed (not to mention one has to constantly fight with that new complex technology when it doesn't yet do something). And complex tech can be frequently replaced with a small set of tiny scripts and small and simple standalone services.

Domain and deployment-specific scripts and services that are unique, sure. But still easy to grasp mentally (and much smaller than any mature project's codebase).

(I haven't read the rest of your post, but I think I can still answer this little snippet...)

> Why unavoidable? It's not that one is forced to use some tech... Even though the hyped crowd chants the names so loudly.

Peer pressure. Consultant-hiring-pressure, for example: Are you full-stack? "Full stack", now there's a phrase...

(EDIT: I honestly do feel that it's a sort of an attack by the sheer meta on the hiring side of the equation on the meta of the supply side of the equation. I don't imagine recruiters are doing much of this consciously, but it's definitely happening more and more...)

I think there is a difference between knowing the tech enough to list it in a resume, and actually using it.

I sort of know K8s. I have experimented with it, have set up a few smaller but nonetheless "real" clusters, I have run various apps on them, have broken them with various sorts of fault conditions, have tried to repair them. Without this I wouldn't be able to convince myself my opinion of K8s is even remotely close to being a proper and informed one (okay, it is still not but it's closer), and not solely based on my preferences and beliefs. And this is also why I'm not using K8s in production anywhere I need to design, set up and manage a system that I should be generally able to reliably repair within hours (or faster).

To interviewers who just want to hear some important keywords, I can tell some tales about how I ran that fancy mixed-arch amd64+armhf cluster for my CI. Most likely, omitting the fact that this was a personal project and that in the end I somehow broke it and despite spending a pair of evenings had failed to figure out what went wrong with the networking... so I just scrapped it and switched to Docker Swarm. But if I had to actually design a system that I would maintain, I'd go with whatever I believe is actually technically appropriate and not with a mauve SQL database[1]. And I believe I would be able to justify my choices, explain the trade-offs, requirements and so on. Unless it's on GKE or it's not my responsibility to manage the cluster - then I can go with K8s, no problem here - it wouldn't be a lie to say that I "know it and have some experience running apps on it, but have usually preferred other solutions". ;)

[1] http://dilbert.com/strip/1995-11-17

> I think there is a difference between knowing the tech enough to list it in a resume, and actually using it.

Of course, but the difference doesn't tend to come across on a recruiter's spreadsheet. Nowadays, I only go by personal recommendations/references.

Just as an aside, and as an offer of XP: Tech A is largely irrelevant unless you have to execute in, say, 1 month. Success almost never depends on the tech -- it mostly depends on people.

> Success almost never depends on the tech -- it mostly depends on people.

Of course, that's true. But my understanding is, the technology choice is not about succeeding or failing - a good team with good plans can succeed with just about any tech that can somehow do the job. It's about costs. Complex but featureful tech may drive development costs down, but may also result in significantly higher maintenance costs.

"Yeah, he's a few PDF's shy of a full stack, I tell ya!"

side story:

Yeah, I went through 3 stages of hiring with $company. Interviews went great, and then I hear back that I didn't show up to the 3rd (much to my surprise). When the HR person checked back in, they said it was an error and instead they were no longer interested in me because I lacked Kubernetes experience specifically.

And my resume includes Openstack and Apache Mesos. Not Kuber. But they dragged me around. Still pisses me off that they couldn't read my resume. Then again, their interview process was... shall we say, "interesting".

> Yeah, I went through 3 stages of hiring with $company.

My BS radar goes off when I get asked for a 2nd interview, or any 4 hour group interview. Thanks, but no thanks. They did you a favor.

These interviewers are crazy, and have zero proof their convoluted process produces better hires than a single 30 minute coffee interview. They just don't want to admit they don't know what they are doing. As if the heavens will open up and god himself will shout down "He's the one!", if they just have enough rounds to get there.

My girlfriend went through ten interviews with Stryker and then didn't get the job. Words fail me.

I've also been an interviewee recently in this space. I'm lucky in that in each one the interviewer showed a healthy scepticism or lower enthusiasm for K8s and similar products.

Maybe I'm doing myself a disservice by not shooting for the moon. But I don't think so. I'd rather do real work than spend days reading upgrade notes. On-premises K8s time will come but for most that time is not now.

That's why one must lie on their CV.

Perhaps. But I think it's better this way: if they're willing to play those kinds of games, I wouldn't want to work for them anyway.

All tech companies run their own NIH stack in one way or another. If they aren't willing to admit that to themselves, there's not much I can do. It costs time to train new people and get them up to speed... and given what kind of arduous journey learning OpenStack is, I figured they would have understood and gone "Well, you don't know K8s, but you're trainable in this stack.."

I just call it an ad. It is after all the core competency of the prominent stewards of open source.

The software is the ad, the tangential service is the product; keeping up is the infinite sales call that requires no calls.

This is the price of free.

> There aught to be a name to the tendency that as tools get better and better, the more your time goes from having your mind in technical-space to social and news-space.

It seems like a form of bike shedding mixed with busy work. Nearly all of it takes away from the actual product being built.

It's usually framed as the Build Vs Buy question. When do you stop your search for an adequate ready-made solution and just build it yourself?

Is there anything that can be done about this other than to just accept it I wonder.

This reads like a giant ad for GKE. It emphasizes several times to just use GKE for pretty lame reasons (Google has good SREs and Google started the project).

The people that work on upstream k8s in Google (Tim et al) have a pretty limited overlap with the Google Cloud people that run GKE. Upstream k8s is a full time job so they are most certainly not spending their time also writing internal GKE code.

I don't have an issue with GKE, but this article uses little evidence to recommend it when it seems the conclusion should have been "maintaining a k8s cluster requires a full time sysadmin. If your company has a culture of pretending sysadmins are pointless, then you should pay another company offering k8s sysadmin-as-a-service hosted on their hardware."

> The people that work on upstream k8s in Google (Tim et al) have a pretty limited overlap with the Google Cloud people that run GKE.

I happen to be in the building so I sat in on an engineering review for a proposed new GKE feature today. Tim was there, as was Brian Grant, and pretty much everyone else you've heard of (in the room or on the VC). There were also a good number of PMs, SREs, and the leads/engineers working on the GKE machinery.

Thus I have to respectfully disagree.

Engineering review is not the same thing as reviewing and writing code. I've seen countless projects with sound designs riddled with implementation bugs due to people not fully understanding the intent of certain systems.

Sitting in a room during a high level engineering review falls squarely under "limited overlap".

Well, you can trust me that it doesn't, or you can send me your CV for a referral.

Plus, there's this: https://www.quora.com/What-is-Googles-internal-code-review-p.... But you'll just complain that the OWNERS aren't the right people, and I'll just comment that they are.

Can confirm, this is my experience as well.

Author here. I do have a bias toward GKE but that is after trying to use everything else -- DIY, kubespray, OpenShift, Juju, Tectonic, Azure, Fargate, kops, Rancher v2 preview, kubeadm, etc. In my experience nothing (yet) has been as nice and clean as Google Cloud.

I am not sure if GKE can be directly compared with Openshift/Kops/Tectonic. One is a hosted service and the others are packaged software with DIY setup. Surely you are paying extra money for what you are getting.

Curious why swarm isn't in that list.

How many services / nodes are you managing?

I found the docker swarm learning curve to be very sharp but plateaued super quick, maybe a month of reading docs and bug reports then I was up to speed and don't really run into day to day issues.

Once your prod is set up and you've worked out all the hard decisions around stack and how to do things, getting Devs up to speed is really easy because all they need to know is how to read the docker compose reference.

Even less of an issue learning everything now too because they redesigned their documentation (wording etc as well) and it's much clearer, gotchas are explained etc.

Agreed. I absolutely love Swarm and think they have done a great job of an integrated solution.

However, I'm keen to see if Swarm becomes a layer on top of Kubernetes instead of a separate piece of infrastructure.

CFCR[0], née Kubo, might be worth adding to the list if you're doing rounds.

Disclosure: I work for Pivotal, we work on CFCR with Google and VMware.

[0] https://docs-cfcr.cfapps.io/

Have you tried EKS?

EKS = Amazon Kubernetes.

(I had to look it up)

Have not, was not invited to the preview. Friends behind bootkube hinted that the EKS team seemed to be heading in directions that were odd relative to upstream. Happy to try it and be pleasantly surprised!

The GKE reference is a lot more relevant than you think, it's just a bit deeper below the surface.

Google's relationship with k8s and GKE and in particular their motivation for creating an open-source orchestration service is well documented and frequently retold, but ironically appears to not be well-understood.

With any sufficiently complex compute solution, you're going to need both a platform and some sort of management. Typically vendors sell you their management service (for which you often pay a recurring fee) by locking you into a proprietary platform. What's more, the management service can be dismal but you'll pay anyway because your app is built on their platform. It happens everywhere, the industry is full of examples, and anyone could see that it was going to hit Cloud sooner rather than later. And the result would be ... bad ... for everyone (customers and providers alike) but the dominant platform.

So Google took an exceptionally gutsy gamble. Google excels at platform management. They invented SRE. They posited that in a level playing field, all else being equal, Google would have the best platform management service, and people would pay for it based on its merits alone.

By making k8s open source, they leveled the playing field. It was by far the best tool for the job, and everybody could have it at no cost; neither money nor burden. So there was no business model or justification for building a lock-in sub-par alternative. The platform became a commodity; nobody could own it.

Which is where we are now, and what this article depicts. You still need a management strategy, k8s doesn't change that. But now everyone's management service has to compete on its own merits. Nobody gets to play the lock-in game.

Google's management service is GKE. It's good, but not because Google started Kubernetes. Rather, Google started Kubernetes because Google has the expertise and experience to make GKE good.

I see it, somewhat cynically, as an advanced competitive strategy.

At some point something will become "the standard". Up to that point Google will have to be making huge investment in their own tooling to operate at their scale...

If "the standard" is someone else's, they're playing second fiddle in the big picture. If "the standard" is different than their own tooling they're paying a premium in dollars and a premium in talent-time to train new hires.

Giving away their tooling and supporting it to become a 'best of breed' solution outsources their training costs and 'onboarding' time to the greater industry. Facebook, MS, and VMWare are paying people to get good at Google tech and selling Google's tech in the Enterprise. Their open source strategy also ensures, as you said, a 'level playing field'. A level playing field where they are guaranteed to not get locked out, and where their size, strengths, and deep competencies of product and domain give them a massive competitive advantage.

It's smart on a lot of levels.

>They invented SRE.

Let's not get dramatic and drink too much of the kool-aid. SREs existed well before Google albeit under a different name.

I dunno, if it's an ad for GKE, then "you typically have only about nine months until the latest version goes out of support" is indeed pretty persuasive marketing to me to let someone else deal with it.

This is a showstopper for me.

All our best clients are on 12 or 24 month upgrade cycles. I need to be confident I can deliver them the latest version of their project knowing there's only regular security updates required over the next 1-2 years.

While I'd love to be doing continuous deployment and multiple production feature deliveries a day - that's not how the business I work at makes their money.

I can live with Ubuntu's 5 year LTS policy, I can't work with a "you might have to do significant rework after just 9 months" platform.

After watching the deployment team trying to keep up with Docker and their ridiculous pace of breaking changes, and then their eventual decision to move entirely away from Docker for containers, I really understand their decision not to want to touch k8s with a 10 ft pole.

If the choice is between "building features based around business needs so other teams can make more awesome stuff" and "throwing lots of dev time at just chasing docker/k8s/whatever", why on earth would they want to choose the latter?

Especially cause then you've got to jump 3 significant versions, to get another 9 months of reprieve.
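To put rough numbers on the "9 months" and "jump 3 versions" arithmetic, here is a minimal sketch. It assumes one minor release about every 3 months and that only the 3 newest minor releases receive patches; both constants and both function names are illustrative assumptions, not figures taken from the project's documentation.

```python
# Assumptions (illustrative only): a minor release roughly every
# 3 months, and patches for only the 3 newest minor releases.
RELEASE_INTERVAL_MONTHS = 3
SUPPORTED_MINOR_RELEASES = 3


def support_window_months() -> int:
    """Approximate months a freshly released minor version stays supported."""
    return RELEASE_INTERVAL_MONTHS * SUPPORTED_MINOR_RELEASES


def minors_behind(months_since_upgrade: int) -> int:
    """Minor releases shipped upstream since the last upgrade."""
    return months_since_upgrade // RELEASE_INTERVAL_MONTHS


if __name__ == "__main__":
    print("support window (months):", support_window_months())
    for cycle in (12, 24):
        print(f"{cycle}-month client cycle: {minors_behind(cycle)} minors behind")
```

Under those assumptions, a client on a 12 or 24 month upgrade cycle is several minor releases past the supported window by the time the next upgrade comes around.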

I think it's important to judge the support model in context of what kubernetes is...

As an API based cloud-management platform Kubernetes has a robust, stable core topped with multiple layers of abstractions that build on, and use, one another. There are relatively few dead-ends in the project's history, and those have all tended to be superseded by much nicer, much smarter, much easier abstractions.

An operating system going out of support after 9 months is a total show stopper, naturally. But Kubernetes runs on top of those 'stable' layers. Its stability is in its primitives and the physical API contracts. The primitives being introduced today don't magically impact my production cluster, and because kubernetes is built like a russian nesting doll of APIs there is very little incentive for the project to make any changes, much less hasty changes, much less poorly thought out changes, to the core APIs. Unlike many other dependencies the kubernetes changes impact deployment and management, but will rarely impact application-level concerns. If it worked in old kubernetes, it's generally gonna work in new kubernetes.

I feel absolutely no support-pressure to upgrade my on-prem installations when my deployments are cross-compatible with the updated versions on my cloud providers.

I feel a lot of developer-giddiness-pressure to upgrade my on-prem installations because me and my devs want the cool new things they're baking into the platform...

From my experience with k8s the issues are all related to running a linux cluster that uses docker and iptables -- a pretty unavoidable pain to run docker containers on a linux cluster, IMO. Support in this context broadly means a stable API, and a stable core, and Kubernetes has had those for a while now.

> The people that work on upstream k8s in Google (Tim et al) have a pretty limited overlap with the Google Cloud people that run GKE. Upstream k8s is a full time job so they are most certainly not spending their time also writing internal GKE code.

This statement is absolutely false. There is no upstream team separate from Google Cloud team. There is only the GKE team.

Full disclosure: I do a lot of work with cloud providers and GCloud is one of my partners.

I think the article covers a very real issue with the pace of k8s. GKE is particularly good at staying updated. There are other providers that are good like Stackpoint Cloud.

If I were going to make an ad for GKE I'd point out that they probably have the most robust internal fiber network of any cloud. AWS tries to push packets onto the public internet as fast as possible.

The load balancer you get with GKE is seriously overpowered and under priced (which makes me love it).

Maybe EKS will be amazing (but we don't have evidence yet). Azure is getting pretty good but they're a bit late to the party. IBM isn't bad but they've struggled the most with staying up-to-date.

Haven't more recent versions of k8s (say 1.9+), dropped some self-upgrading capabilities to smooth the pain of keeping klusters kurrent?

I'm a few versions back on my clusters, but my understanding was that some decent work was being done to smooth the particular pain point of keeping pace with the project :)

Every time a new framework or tool comes out and everybody jumps on it, I always wonder if somebody will realize that you are trading one set of problems and work for another.

As engineers we really need to stop supporting this sort of effort and take the time to help each other become better engineers that write and maintain our own code. We need to promote learning and mastering the underlying concepts that things like kubernetes try to hide and shield engineers from.

In most cases tools like kubernetes are so vast and huge so they can be the solution looking for many problems.

It is also curious how, once kubernetes became big, so many small shops needed "Google" level orchestration to manage a handful of systems. And how hard people ripped their software stack apart into many many micro services just to increase the container count.

I think if most engineers took a step back and said "I don't know" and took some time to truly understand the requirements of the project they are working on they would find a truly elegant and maintainable solution that did not require tweaking your problem to fit a given solution.

Every tool, library, or dependency you add to your solution only adds more code and complications that you will not be an expert in, and one day you will find yourself at the whim of the provider.

Far too often we include tens of thousands of lines of somebody else's code for the sake of a handful of lines that we could have implemented and owned ourselves, if somebody had had the confidence, and the support of the engineers around them, to try to truly understand the problem domain.

The general trend I see as I get older is that we are valuing the first to a solution rather than a more correct solution. Only to be stuck with a solution that requires constant care and work around.

So I plead with all engineers, developers, programmers, or whatever you call yourself: please stop, take a moment, and think hard about how you would solve any given problem without the use of external code first. Then compare your solution to the off-the-shelf "solution looking for a problem". You might surprise yourself.

I would also like to point out that if, when solving a problem, your solution looks like a shopping list of third-party tools, libraries, and services, you might not fully understand the problem domain.

-- sorry for the rant --

I don't think your comment applies at all to Kubernetes.

K8s truly simplifies dev-ops and even the smallest team and website can greatly benefit from it.

I speak from 7 years of experience managing the infrastructure behind my company's website:

Before kubernetes, I ran my company's stack on very cheap bare metal from OVH. It was great while it lasted, but as the company grew and my team grew, it became harder and harder to maintain this infrastructure. And I'm not only talking about our production servers. In reality you have to maintain your prod, your staging, and, worst of all, your local development environment.

Over time, your production staging and dev all become entirely heterogeneous. Each environment ends up running totally different/incompatible versions of all your stack's software, and no amount of Ansible/SaltStack/Puppet script will save you from this. All those scripts become a nightmare to maintain and your infrastructure as a whole becomes brittle with, for example, bugs happening only in production, but never on your local dev environment.

K8s came as a savior to all my issues: I burned all my old ansible scripts and rewrote all my infrastructure in k8s. Now my prod, staging and dev env are 99% the same. It saves me a tremendous amount of time and headaches. I taught my team how to use minikube and create a replica of our production with one command line on their local computer.
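
To make the "99% the same" point concrete, here is a minimal sketch, not the commenter's actual setup. It assumes a single base Deployment manifest and a small per-environment override table; the app name, image tag, and replica counts are all hypothetical. The output is JSON, which kubectl accepts alongside YAML.

```python
import copy
import json

# Hypothetical base manifest shared by every environment.
BASE_DEPLOYMENT = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "example-web"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "example-web"}},
        "template": {
            "metadata": {"labels": {"app": "example-web"}},
            "spec": {
                "containers": [
                    {"name": "web", "image": "example/web:1.4.2"}
                ]
            },
        },
    },
}

# The only thing allowed to differ per environment (an assumption
# made for this sketch): how many copies run.
ENV_OVERRIDES = {
    "dev": {"replicas": 1},
    "staging": {"replicas": 2},
    "prod": {"replicas": 6},
}


def render(env):
    """Return the manifest for one environment without mutating the base."""
    manifest = copy.deepcopy(BASE_DEPLOYMENT)
    manifest["spec"].update(ENV_OVERRIDES[env])
    return manifest


def image_of(manifest):
    """Return the container image the manifest would run."""
    return manifest["spec"]["template"]["spec"]["containers"][0]["image"]


rendered = {env: render(env) for env in ENV_OVERRIDES}
for env, manifest in rendered.items():
    print(env, manifest["spec"]["replicas"], image_of(manifest))

# The prod manifest, ready to be written to a file and applied.
prod_json = json.dumps(rendered["prod"], indent=2)
```

Every environment runs the same image; only the replica count changes, which is roughly what "the same with one command" amounts to.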

K8s is far from being just a trendy buzzwordy shiny new cool toy to play with. It solves real world problems that dev ops have. I am so glad this technology exists and I hope to never have to go back to writing ansible/puppet/whatever scripts.

> Over time, your production staging and dev all become entirely heterogeneous.

I've found that this is simply the reality, excepting test-only environments. Developers need to be able to run the applications out of an IDE, compiled with special instrumentation, or whatever else they need to do. It's not possible to support every combination of what they need.

Also, there's some significant developer overhead sometimes. I had another team give me a set of VMs for their product, so I could test a new config. It was great in that it was just like production, but not so great in that it was just like production. How do I log into these VMs? How does authentication work? Where are things located on the VMs? How do I generate an updated VM? You end up having to train everyone in dev ops, because you've handed them a copy of production.

I hadn't considered the benefits for dev/local instances...

I've known about Kubernetes for some time, but my current job never deploys anything that Kubernetes could improve, so I put it aside and hoped to someday get a chance to toy with it.

I've set up Vagrant images pre-loaded with our app for several non-developers to use locally but it sounds like Kubernetes would be a far better way to manage those as well as staging servers.

Unfortunately my current company is 100% against third-party hosting/involvement so I couldn't even use it for staging - our staging servers are Windows-based, ancient, and internal...

For me (small startup) this is the killer feature of k8s. I'm not operating at a scale where "cluster scheduling" is a thing I need to care about, though self-healing and load-balanced services are nice.

To be able to stand up an exact copy of my application on a dev machine, or even better in a review app per-branch (complete with DNS entry and TLS cert) is incredibly valuable. You can run through essentially all of the deploy pipeline before even merging, including SSL config tests etc.

In addition to the dev benefits, there is also a built in cloud scaling story and strategy.

It's not like every app needs that kind of robustness, but there is a certain calming security in knowing that if any part of your Kubernetes deployed app actually needs to go "web scale", or someone asks about 100 times the users you have ever considered, that the answer is straightforward and reasonably pre-configured.

Kubernetes can't run the local images for your devs (Kubernetes must operate over a cluster; minikube spins up a virtual one). You're just thinking of containers.

I saw another comment that did this just a minute ago. Is k8s-hysteria getting so out of control that it's consuming Docker whole? Seems like it may be.


HN's limitation on my post rate has kicked in. Response to gouggoug follows this note, since I'm not allowed to post it.

This rate limit was originally installed to make me stop speaking against the kool-aid Kubernetes hivemind and now it's fulfilling its purpose quite well. See this thread for the original infraction: https://news.ycombinator.com/item?id=14453705 . After the fact, dang has justified the rate limit by saying I was engaging in a flame war. Read the offending thread and judge for yourselves.

Remember, YC doesn't want you to ask if you need Kubernetes or not. They just want you to use it. If someone on HN says otherwise too frequently, they'll rate limit that person's account, as they've done to mine.

Doesn't matter if you have 10 years of history on the site. Doesn't matter if you have 10k+ karma. Only matters that you're counteracting the narrative that Google is paying a lot of money to push.

No matter how frilly and new-age someone makes themselves out to be, people only have so much tolerance for argument when there's money, power, and prestige on the line. HN is no exception. There's an inverse correlation between the credibility of counter-arguments and the urgency of the situation; crazy stuff won't get much retaliatory fire because most people can tell it's crazy, but non-crazy stuff that counteracts their goals will be pushed down, because most people can tell it's not crazy.


He says that he wants Kubernetes to replace a local Vagrant image. Kubernetes doesn't replace Vagrant. To replace Vagrant, he would want Docker, rkt, etc., not Kubernetes. Kubernetes solves a different problem. Yet he says that he wants to try Kubernetes to fix the problem that Kubernetes doesn't fix.

k8s and Vagrant address wholly separate concerns (where to run things rather than how to run things). The poster I replied to is conflating Kubernetes and Docker, the underlying containers that do the actual execution.

> Maybe some less advanced users using k8s don't realize that it heavily uses docker (or rkt, or whatever container runtime you could think of), but how is that an issue?

How is it not an issue? Is it OK for developers to not know the difference between a compiler and an IDE now? A web server and a browser? A computer case and a CPU? A network card and a modem? These things are not mere details, even if they are often used together. Technical professionals who can't differentiate between these aren't just "less advanced users", they're posers.

k8s is a huge chunk of crap to throw in between you and your applications. One should, at the very least, have an accurate high-level idea of what it does before they go around telling everyone that they need it.

I'm not sure what your comment is about. I'll glance over the first 3 sentences since I'm not sure at all what you are trying to say and jump directly to the fourth one:

> Is k8s-hysteria getting so out of control that it's consuming Docker whole? Seems like it may be.

This confuses me the most. K8s and Docker are complementary technologies, not in opposition to each other. Maybe some less advanced users using k8s don't realize that it heavily uses docker (or rkt, or whatever container runtime you could think of), but how is that an issue? That doesn't mean there's a k8s-hysteria going on.

> even the smallest team and website can greatly benefit from it

I don't think that's true. You have to be at least large enough to justify reasonable investment in microservices.

True, but the investment pays off at about the 4th executable running.

> I don't think your comment applies at all to Kubernetes.

It does. I've deployed and maintained our company's infrastructure on Kubernetes for over a year now.

> K8s came as a savior to all my issues: I burned all my old ansible scripts and rewrote all my infrastructure in k8s.

This doesn't make any sense.

Ansible scripts out changes to make to a system. Kubernetes deals only with opaque images and does not change them at all, it simply runs them.

>K8s is far from being just a trendy buzzwordy shiny new cool toy to play with. It solves real world problems that dev ops have. I am so glad this technology exists and I hope to never have to go back to writing ansible/puppet/whatever scripts.

I'm not really sure how to reply to this without being accused of bad faith, but I'll just reiterate: your post does not make sense because it is talking about tools that do different things. It's like saying you love hammers and hope to never use wrenches again.

That said, Kubernetes is a shiny buzzword that is much overvalued.

> Ansible scripts out changes to make to a system. Kubernetes deals only with opaque images and does not change them at all, it simply runs them.

You are exactly right.

Ansible makes the best effort possible to bring your system to a given state (the state you coded in Ansible). All those tools (puppet/salt/ansible) do this exact same thing, and they all manage to do it more or less well.

However, the keyword here is "best effort". That is, it is in practice really hard to consistently bring a system from a random state to a given state X, because of the randomness of your starting state.

Kubernetes doesn't do that, it just manages the lifecycle of your state and makes sure that things run. As an added bonus, it allows you to inject some external configuration to your system, and some other "cool stuff".

You build your state in terms of container images, which, once built, are as a matter of fact set in stone. You then instruct K8s to run all those images.

That, to me, is much more powerful than "scripting out changes to a system".

My scripts are always unreliable and run inconsistently because I make mistakes. In contrast, my container images always run the same way, be they scheduled on my dev machine running macOS, or on my GCE cluster, or my Microsoft Azure nodes.
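
A toy model of that difference, as a rough sketch only: hosts and images are plain dictionaries of package versions, and `converge` and `run_image` are hypothetical stand-ins for a config-management run and a container start, not real Ansible or Docker calls.

```python
import copy

# Hypothetical desired state: package name -> version.
DESIRED = {"nginx": "1.13", "python": "3.6"}


def converge(host, desired):
    """Config-management style: mutate a host toward the desired state.

    Only the packages named in `desired` are touched, so anything else
    already on the host (drift) survives the run.
    """
    host = copy.deepcopy(host)
    host.update(desired)
    return host


def run_image(image):
    """Image style: every run starts from the same frozen snapshot."""
    return copy.deepcopy(image)


# Two hosts with different histories (the "random starting state").
host_a = {"nginx": "1.10"}
host_b = {"nginx": "1.12", "python": "2.7", "leftover-lib": "0.9"}

converged_a = converge(host_a, DESIRED)
converged_b = converge(host_b, DESIRED)

# Build the image once from a known-empty base, then run it anywhere.
image = converge({}, DESIRED)
container_a = run_image(image)
container_b = run_image(image)

print(converged_a == converged_b)  # False: host_b keeps leftover-lib
print(container_a == container_b)  # True: both start from the same image
```

Note that the image itself is still produced by a scripted step (`converge({}, DESIRED)` here); the gain is that the step runs once, from a known base, instead of on every machine.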

> your post does not make sense because it is talking about tools that do different things. It's like saying you love hammers and hope to never use wrenches again.

Those tools, k8s and ansible/puppet/etc., set out to do the _same thing_. That's the nuance here.

They all set out to bring your infrastructure to the state you programmed.

It so happens that k8s (and more precisely container images) is much better at keeping a consistent state, for obvious reasons (container images are state).

K8s is just the cherry on the "container technology cake". It schedules things and makes sure they run, for you.

You weren't too far off with your hammer analogy, but it's more like: ansible is a manual saw and it gives me many blisters, K8s is a chainsaw.

> However, the keyword here is "best effort". That is, it is in practice really hard to consistently bring a system from a random state to a given state X, because of the randomness of your starting state.

This is the logic that BOSH follows. Why try to converge to a desired state if you can just recreate it from a known-good state?

Why do you care about grooming individual servers when your goal is a distributed system?

Configuration management tools make the classical sysadmin's life much easier. "I have a small group of giant expensive machines to run". The constant struggle of impinging chaos on the systems that Must Not Stop Ever.

You still have to define the state to solidify, whether that state is represented by a Docker image or not. That means you still have to script the system before you can crystallize it.

Any mistakes you make are still there, so the fact that your scripts are mistake-prone doesn't change anything either way. k8s doesn't help you there, it just adds a nice thick new layer of stuff to break.

Snapshotting state in opaque binary "images" via Docker layers/images is a different thing than constructing that state. You can't take a picture of something that doesn't exist. You can, and possibly should, use Docker and appropriate configuration management mechanisms together, but you definitely shouldn't pretend that they address the same issues.

Your systems scripted with configuration management will deploy the same way on any VM, local or cloud, and Docker. They all start from a base image, whether it's VDI, AMI, a Docker image, or whatever. It is true that Docker makes it faster to load different states than the other systems, but this has nothing to do with configuration management's role, and it also doesn't come for free; there are tradeoffs to consider here, as in anything else.

There are several on this forum who feel that their paychecks depend on the widespread adoption of k8s and/or Docker, and at least some have the ears of the moderators. I'd say at this point, you've revealed enough of your mindset, experience, and intention to make it clear that there is no point going further into this, so let's leave it here and move on. Especially since I'm not going to be allowed to reply, because YC specifically and intentionally doesn't allow k8s skeptics to post very much. Why is that an issue that HN has to create an artificial consensus around? Hmm...

I agreed with you until the conspiracy theory. Sort of a self-fulfilling thing, that.

There was no theory until my account was rate-limited for posts suggesting that people not use databases on Kubernetes. These were ruled too "tedious" by dang; they rate-limited my account and marked my post at [0] as off-topic. Maybe I just missed the part of the guidelines that said not to be too dry when discussing container orchestration.

It's not really self-fulfilling if it's already happened; I think that's just called "information about an event".

I think my view that humans are imperfect (yes, even the super-fancy humans at YC, who do have a horse in this race as investors in Docker Inc.) and will censure things they dislike is plenty justifiable given the events. That's especially the case if these people are being pressured to take that action by other high-status individuals.

For the record, dang has explicitly disclaimed my theory, rather suggesting that asking people not to run their database in Kubernetes constitutes a flame war. I don't have the link to that handy but I'm sure it wouldn't be hard to find for an interested party.

The reader is free to ascribe motives as they see fit.

[0] https://news.ycombinator.com/item?id=14453705

The thing about k8s is it sort of forces you into best practices: immutable images are built and run instead of ansible or config management duct-taping them together. So he replaced his ansible playbooks that build his apps with Dockerfiles.

How are you deploying and maintaining your k8s clusters? There is some bootstrapping for nodes but k8s is distro and cloud agnostic... Stateless golang binaries plus etcd. It seems like you should be singing its praises in that regard on the provisioning front?

Funnily enough, there has been some work on "self hosted kubernetes" where even the components themselves run inside k8s. Pretty cool, and it will probably be the future of cluster bootstrapping: https://github.com/kubernetes-incubator/bootkube

I'm curious for your reasons for saying k8s is overvalued. You are the first person I have seen that has worked with it and came away with this opinion!

> So he replaced his ansible playbooks that build his apps with Dockerfiles

Docker is not part of Kubernetes. This is what I was talking about. The benefits of Docker/containers are not inherent benefits of Kubernetes. It is important that we attribute things to the correct platform.

For scripting the system, Dockerfiles are obviously inadequate; that shouldn't take much explanation. I discussed this more at [0]. Ansible (and I assume other config management tools) can be used to invoke image building. [1]

You can use and benefit from containers without Kubernetes. Kubernetes is about running software over a cluster of anonymous hardware resources. Some applications are well-suited to that, and some aren't.

> I'm curious for your reasons for saying k8s is overvalued. You are the first person I have seen that has worked with it and came away with this opinion!

First, being overvalued doesn't mean it holds no value. There are use cases for which Kubernetes is well-suited, but it is just not a good solution for many types of applications, and people are (rather dangerously) diving into this system head-first without taking a step back to appreciate its implications.

It's very similar to the way in which everyone pounced on Mongo and ended up regretting it when they had cause to ask, "Wait, what's a transaction?" [2]

In short, efficient use of Kubernetes requires software without masters, controllers, or other stateful components. It needs software that can be vaporized and rematerialized on command and continue humming along happy and safe. While it's true that, in theory, many services have had this as a requirement given the elastic behaviors of web applications, there are still frequently manual steps and/or performance problems associated with bringing nodes up or down.

Very little software has been written in a truly stateless manner, and some things will never really fully assimilate that model, because it doesn't make any sense for them to do so (databases).
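
A small sketch of what "vaporized and rematerialized" means in practice, under loud assumptions: both worker classes are hypothetical, the dictionary stands in for an external database or cache, and "restart" is modeled as simply constructing a new worker.

```python
class InMemoryCounter:
    """Stateful worker: the count lives inside the process."""

    def __init__(self):
        self.count = 0

    def handle(self):
        self.count += 1
        return self.count


class ExternalStoreCounter:
    """Stateless worker: the count lives in a store that outlives it."""

    def __init__(self, store):
        self.store = store

    def handle(self):
        self.store["count"] = self.store.get("count", 0) + 1
        return self.store["count"]


def serve_with_restart(make_worker, requests_before, requests_after):
    """Serve some requests, replace the worker, then serve some more."""
    worker = make_worker()
    for _ in range(requests_before):
        worker.handle()
    worker = make_worker()  # the scheduler kills and recreates the worker
    last = None
    for _ in range(requests_after):
        last = worker.handle()
    return last


store = {}  # stand-in for an external database or cache
stateful_result = serve_with_restart(InMemoryCounter, 3, 2)
stateless_result = serve_with_restart(lambda: ExternalStoreCounter(store), 3, 2)

print(stateful_result)   # 2: the first three requests were forgotten
print(stateless_result)  # 5: the count survived the restart
```

The stateless worker only survives because `store` was never restarted, which is the point about databases: the state has to live somewhere that is not being vaporized.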

New things that are greenfield and written specifically for deployment on Kubernetes shouldn't have this problem (though I would expect many of them do), but that means that Kubernetes is not going to be much of a benefit for the vast majority of existing applications, which is what people are expecting to get from it. k8s would be much less popular if the pitch were "you need to redevelop a lot of your code to really take advantage of most of these features". Thus, it's overvalued.

Again, it looks like people are starting to conflate the underlying container runtimes with Kubernetes. Note that the orchestration and scheduling benefits of Kubernetes are separate from potential benefits gained from containerization.

This once again shows that the group that controls the user interface controls the platform, and ironically, it's why k8s won't be what Google hoped; now that Azure and Amazon are offering k8s-as-a-service, people will remain glued to the Azure and Amazon interface, feel pride in their acquisition of a new buzzword, and Google will have gained no significant market share.

> You are the first person I have seen that has worked with it and came away with this opinion!

Yeah, I wonder if this is truly the case, or if others are just more prudent than me and don't want to say things that will make others dismiss them as philistines. :P

[0] https://news.ycombinator.com/item?id=16240500

[1] http://docs.ansible.com/ansible/latest/docker_image_module.h...

[2] http://hackingdistributed.com/2014/04/06/another-one-bites-t...

Thanks for your feedback, you do have some good points.

> Docker is not part of Kubernetes. This is what I was talking about.

Yes, perhaps I should have said Dockerfiles + k8s Manifests

> In short, efficient use of Kubernetes requires software without masters, controllers, or other stateful components.

StatefulSets + Persistent volumes solve this quite well: https://kubernetes.io/docs/concepts/workloads/controllers/st...
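
For readers who haven't seen one, a minimal sketch of the shape of a StatefulSet with a per-replica volume claim, built as a Python dictionary and dumped to JSON (kubectl accepts JSON as well as YAML). The names, image, mount path, and storage size are hypothetical, and `apps/v1` is assumed; older clusters expose StatefulSets under a beta API version instead.

```python
import json

# Hypothetical three-replica database with one volume claim per replica.
statefulset = {
    "apiVersion": "apps/v1",
    "kind": "StatefulSet",
    "metadata": {"name": "example-db"},
    "spec": {
        # Name of the headless Service that gives each pod a stable DNS name.
        "serviceName": "example-db",
        "replicas": 3,
        "selector": {"matchLabels": {"app": "example-db"}},
        "template": {
            "metadata": {"labels": {"app": "example-db"}},
            "spec": {
                "containers": [
                    {
                        "name": "db",
                        "image": "example/db:1.0",
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/db"}
                        ],
                    }
                ]
            },
        },
        # One PersistentVolumeClaim is created per replica from this template.
        "volumeClaimTemplates": [
            {
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "1Gi"}},
                },
            }
        ],
    },
}

manifest_json = json.dumps(statefulset, indent=2)
print(manifest_json)
```

Whether this answers the objection depends on the software: the manifest gives each replica a stable identity and its own disk, but it does not make the database itself tolerate being rescheduled.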

K8s is becoming (if not already) the default cloud native "OS". Write a manifest for your application and people can easily run it anywhere. The big clouds offering K8S as a service only improve this!

> and no amount of Ansible/SaltStack/Puppet script will save you from this.

And I am afraid you missed my point entirely if you are bringing up yet more tools and frameworks.

Owning your stack from the top to the bottom, with very few exceptions, is what I am suggesting.

I too speak from experience, and have a few years on you -- not that any of that matters.

If your environments diverged then you did not own your stack properly. It is within every engineer's ability to build out a stack that runs the same on each environment. It just means understanding that problem domain and taking the time to write the code that fits.

Far too many times have I smacked the hand of a coworker suggesting any of the baling-wire-and-duct-tape solutions you just mentioned. The end result is we have a wonderful system that we own and can ensure meets our needs. The same binaries, configurations and images are pushed from alpha to staging and finally production with full control of every artifact at the start of the pipeline.

I'm curious. Could you describe the kind of infrastructure you manage and the tools you use? Are all your tools written in house? If so, what will happen when the author leaves the company?

> -- sorry for the rant --

It's not unreasonable, but there is a gradient. Linux was a toy, now it's probably involved with handling more real commerce than any other software ever written. Javascript-land was a riot of colour and decay, but basically React and Webpack have won the dominant position.

Evolutionary explosions don't last forever. Deciding to not pick winners is smart early on. But once the ecosystem settles down and there are clear winners, it's time to accept the new normal.

If I sit down to write a web app, I do not first create my own programming language. I do not write an OS. I do not develop a database, or a web server, or a transport-layer protocol. I take the ones that exist off the shelf and use those.

This is only possible because some options are so dominant that they have driven nearly everything out. By doing so they attract to themselves the overwhelming share of effort and attention. They grow a rich cast of supporting systems, they get bug fixes sooner, they serve as the jumping-off points for the most exciting new possibilities.

We are past the stage where "I will write my own cloud platform" is a defensible position. I believe it passed about 2 years ago, actually.

>We need to promote learning and mastering the underlying concepts

Should we do the same thing for programming languages? Make sure people are learning and mastering assembly?

>Only to be stuck with a solution that requires constant care and work around.

I've seen this attitude from people who didn't want to use ansible/chef/puppet because their shell scripts were "good enough". Like the shell scripts didn't require constant care..

You should master any subject you were hired to work in.

I find it hard to equate deploying and maintaining your production environments to assembly.

> I've seen this attitude from people who didn't want to use ansible/chef/puppet because their shell scripts were "good enough".

That attitude is also quite myopic.

Not to condescend, but if one isn't working on cloud-native apps in federated environments then maybe maybe maybe one should be intellectually humble enough to allow that there are some challenges people face that make "good enough" a non-starter, prohibitively expensive, or require direct competition with AWS/MS/Google in their core competencies.

Everything is a spectrum. Serious work can be done on a single server with no backups... Shell scripts can carry serious work far... Ansible or Docker Swarm rock for what they are... but the multi-cloud orchestration platform we're discussing rocks at oodles of things those solutions don't.

Well it is a good thing I did not suggest shell scripts, ansible or puppet. I suggested owning your stack, which means writing code and tools to suit your needs.

Deployment and upgrades should be built in - as in it is part of your product. Not some afterthought.

It's odd how the two replies I have read have jumped to conclusions about what one might use if they were not using kubernetes.

Perhaps if you shared some more concrete detail about what it means to "own the stack" people would be less inclined to fill in the blank in a way you hadn't intended to communicate?

> Deployment and upgrades should be built in - as in it is part of your product. Not some afterthought.

Kubernetes is one of several products that offer environment agnostic mechanisms to handle deployment and upgrades.

If you're dealing with hundreds of 'products', or a mountain of microservices, or large batch job scheduling, putting the lot of it on Kubernetes is addressing deploying, upgrading, monitoring, securing, discovering, and many other things up front in a shared, consistent, and portable manner.

A forethought, not an afterthought...

I agree that owning your core systems is valuable. Focusing on core IT competencies and business value creation are more valuable, though, and if you're not dealing with multi-cloud-platform resource management I can't imagine how to justify writing tools that compete with established, mature, industry standard solutions. You're always better to focus on your unique advantages and competencies... If you're ok using an RDBMS system in your stack, then using a scheduling/upgrading framework should be pretty ok, too.

> I think if most engineers took a step back and said "I don't know" and took some time to truly understand the requirements of the project they are working on they would find a truly elegant and maintainable solution that did not require tweaking your problem to fit a given solution.

> The general trend I see as I get older is that we are valuing the first to a solution rather than a more correct solution. Only to be stuck with a solution that requires constant care and work around.

This, 1000 times over. Right now I'm trying to figure out how to turn a project around that started without me and has gone down the path of using an inadequate (but mildly popular) open source tool instead of building something that's actually designed to solve our problem, not someone else's.

K8s is different though.

It's an incarnation of a highly effective and efficient infrastructure paradigm, proven over more than a decade of serving all of Google's compute tasks.

It's not really new.

Disclaimer: I work in Google's compute infrastructure team.

There is clearly a lot of unfinished tooling and features, as evidenced by the extremely high commit rate.

Compare that to an old piece of infrastructure like "mount" or "chroot".

Is it too much to ask for base infrastructure to be boring?

Granted, I have nothing against the Kubernetes project and it might be fantastic. I don't think it's wise to encourage people to use/learn it at this point, at least for production projects.

Everything is anything to a certain degree.

As a core compute system, k8s is a better candidate for everyone to learn than anything else in its category.

> The absolute safest place to run Kubernetes application is still Google’s GKE.

Interesting to read given I recently had a GKE cluster auto-upgrade its master version from 1.6.x to 1.7.x at 8pm one night (however foolish it was to not be subscribed to the release notes RSS Feed[1]), which somehow caused a cascading nightmare of things breaking.

Logs stopped appearing in the GCP logging interface and in our own log parsing pipeline, a bug related to a change in the format of the yaml specifications meant that all the containers got stuck in a broken limbo state as we tried to upgrade the nodes (yay for googling the error and finding open github issues!), and then once we'd manually fixed all of our deployments and were finally able to get our nodes rolled over to the new k8s version, all of our load balancers started intermittently timing out until they were deleted and recreated.

Surely we need to keep a closer eye on the release cycle, and we're guilty of bandwagoning onto this cool new tech, but boy does it suck to get auto-upgraded at night only to discover several breaking changes while in emergency mode trying to fix things.
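
One cheap guard against that kind of surprise is to flag any upgrade that changes the minor version before it is allowed to proceed. This is a rough sketch with hypothetical helper names; it only compares version strings and does not talk to GKE or any cluster API.

```python
def major_minor(version):
    """Return (major, minor) from strings like '1.6.x' or 'v1.7.12'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def crosses_minor(current, target):
    """True when an upgrade changes the major or minor version."""
    return major_minor(current) != major_minor(target)


print(crosses_minor("1.6.x", "1.7.x"))    # True: read the release notes first
print(crosses_minor("v1.7.1", "v1.7.4"))  # False: patch-level change only
```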

[1] https://cloud.google.com/kubernetes-engine/release-notes

This article concerns me, especially considering the first thing you see is "There is no such thing as Kubernetes LTS (and that’s fantastic)".

What is so great about running your infrastructure on a platform that has no intention of ensuring long-term stability? Regardless of how well backward-compatibility is maintained, the idea that we should all move our infrastructure to something that lacks the fundamental promise of "updating won't break everything" seems downright irresponsible.

LTS releases aren't really about stability or upgradability; in practice I think they're more often used as an excuse to never upgrade because the risk of applying 2-5 years of changes at once is too high. It sounds like k8s is trying to nudge people into a more continuous deployment model.

The last thing I want from an external supplier of something I depend on is to have them driving my development and release cycle. I'll expend the resources to upgrade to your latest version when it's worth my time and cost.

Edit: I'm speaking to the vendor here, not you, of course...

And you can indeed just sit around on an older version. The vendor in this case just won't bother doing stuff to help you.

> What is so great about running your infrastructure on a platform that has no intention of ensuring long-term stability?

Kubernetes uses API versioning and takes backward compatibility very seriously for stable APIs.
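
The versioning scheme is visible in every manifest's `apiVersion` field: names like `v1alpha1` and `v1beta2` mark alpha and beta APIs, and a bare `v1` marks a stable one. A small sketch of reading that convention; the `stability` helper is hypothetical and not part of any Kubernetes client.

```python
import re

# Matches v1, v2beta1, v1alpha3 and similar version names.
_VERSION_RE = re.compile(r"^v(\d+)(?:(alpha|beta)(\d+))?$")


def stability(api_version):
    """Classify 'group/version' or bare 'version' as stable, beta, or alpha."""
    version = api_version.rsplit("/", 1)[-1]
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized apiVersion: %r" % api_version)
    return match.group(2) or "stable"


print(stability("v1"))               # stable
print(stability("apps/v1beta2"))     # beta
print(stability("batch/v2alpha1"))   # alpha
```

A check like this could flag manifests that still depend on alpha or beta APIs before an upgrade, since those are the ones without the backward-compatibility promise described above.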

Regarding LTS, these discussions are happening and increasing in number as time passes. Example: https://www.youtube.com/watch?v=fXBjA2hH-CQ

That's a really important review that is not showing up in the rest of this lengthy discussion.

The only way frequent upgrades are possibly sane is if you take backwards compat very very very seriously, and succeed at doing so. If users of k8 think it's doing that, that's important info.

(I think many of us are lately getting burned by a lot of open source software that _doesn't_ succeed very well at backwards compat (whether they're trying or not, sometimes it's not clear), _and_ expects you to upgrade all the time).

Author here. The point is that the stable APIs are rock solid, the design of its API versioning and the community's pace of delivery is a huge part of why it's all maturing so quickly.

That's all well and good until it isn't anymore. And to top it off, your recommendation is to either hire someone to manage k8s full-time or be at the mercy of your cloud vendor and hope that one day you don't turn up to work to find that the new update you weren't aware was being rolled out has just broken everything.

You can't ask a large-scale company to take a dice-roll on k8s for their infrastructure when the only promise of long-term reliability is "trust us not to make any breaking changes".

I think you should have leaned on that point harder and more explicitly for the readers that were not already k8s users. :)

Just trying to keep up with the shifting sands of web programming is a headache in itself. And now we have people wanting to build our whole infrastructure on yet more shifting sand. Argh.

No LTS is fantastic -- as long as you don't expect me to put it in production. ;)

k8s is a tale of woeful, hilarious levels of platform immaturity jammed down the public's throat by the Google PR machine. I believe Kubernetes is a centerpiece in Google's cloud play against Amazon (I don't think it will be successful), and that this consideration drives a lot of the particular weirdness and distortion that exists around it.

It'd be a whole different ball game if they discouraged production use and told people "We can't even commit to keeping this up to date for one single year, whereas all other enterprise vendors offer 5 years or more, you'd be silly to think this was ready for prime time." But they don't, because then you don't have starry-eyed brogrammers saying "DUDE, we gotta get on k8s so we can be like my homies in Mountain View, aw man, it's so awesome! It says Google Cloud is the best way to get Kubernetes, call 'em up!"

I feel sorry for anyone who was naive enough to say "Well, if Google does it, so do I!" and jumped headfirst into this bear trap, making themselves collateral damage in a pissing contest between two megacorps.

Source: I work with people who decided to move our entire infrastructure to Kubernetes. We have now run virtually everything on k8s for over a year. I'd feel sorry for them, but they left the company some time later, so now I just feel sorry for me. :)

>"There is no such thing as Kubernetes LTS (and that’s fantastic)"


Technically OpenShift isn’t LTS yet, it’s 1 year and 3 year (the latter is special extended support). Right now we’re planning on 1.9 being our first LTS with the longer support commitment (3yr?). Still being sorted out.

We won’t be doing the 10 year support like for RHEL 4/5/6 yet.

As a guy who likes the stability traditional vendors provide as opposed to the madness that is chasing hip upstream developers, the mere fact that you're working on providing LTS guarantees gives you a significant advantage over your competitors.

Looking forward to what OpenShift will bring to the table!

This is a problem I faced using ansible, webpack + js modules, and more. It's shifting ground, and you always need to keep up to date with the latest changes, which are often breaking.

I tend to design my systems so that they work even if I haven't touched a line in a year but when using such tools it's always a pain.

I wish things were as stable as a bourne shell and unix environment in general. Not that they achieve the same but I just miss the stability I get from raw unix tools.

Github is already littered with the dead carcasses of software using shiny new obsolete tools like gulp, grunt, and browserify to name just a tiny few.

You have to have a strong nose for bitrot and ruthlessly filter garbage out of the firehose, to work in web development today. If that package or tool hasn't been updated in 8 months, do you really want to learn it (and force your team to learn it), since it's practically abandonware already?

It's a self-reinforcing cycle. You don't want to put your organization at existential risk by betting on the wrong horse. But there is no choice. Unless you want to get stuck having to hire Perl guys in an ocean of node ninja rockstars. Your company will be the ugly girl that didn't get asked to prom night. The pariah of Silicon Valley.

> You have to have a strong nose for bitrot and ruthlessly filter garbage out of the firehose, to work in web development today.

This is a big test for the development community, and I don't know how it resolves. Are discerning coders who've seen this fad cycle before just going to have to hold their noses and pretend to be gung-ho about the latest fad, or do they stick to their principles and say "Yeah, I'd use that if we had a good reason, but I just don't know many..."?

Bear in mind, doing the latter just means some idiot who's read six lines of a blog post from TechCrunch is going to come in there repeating "blockchain" and "Kubernetes" non-stop until they get hired on the spot.

We have a lot of people who've been misled into believing "copy Google and magical fairy unicorns will fly out of your butt" crawling all over this community, including a lot of under-experienced developers themselves who apparently don't realize that they're reaching outside of their comfort zone with this.

It's very hard to find a company that isn't trying to get Kubernetes-ified as fast as they can. If you go into an interview and say that you're so-so on Kubernetes, it'd be like saying you don't like Node in 2014 or that you don't like Mongo in 2012 -- you're out the door. Every company needs to be like the cool kids and get the magic of the Great Googly Kubernetes smeared on themselves.

This is of course very silly and bad from a technical standpoint, but it seems to primarily be about social standing. That's why something like EKS is so good. They just fork over some cash to Amazon and then they can tell everyone that they too use Kubernetes, and all of Google's hard-fought marketing effort goes down the tubes, since users are still glued to Amazon's platform.

true but you don’t have to go that far. React, webpack, and sometimes well maintained packages that depend on unmaintained packages suffer from this. I am not picking on web dev, it’s a general problem.

It's a relatively new problem caused by an influx of inexperienced programmers who have to experience these problems first-hand.

They do learn, but they can't teach what they learned because of the ever increasing influx of new, inexperienced programmers.

I highly recommend this talk by Uncle Bob (have a tea and a foot bath, it's long): https://www.youtube.com/watch?v=ecIWPzGEbFc

I don't think it's just the influx of new developers; it's too many rockstars who only work on greenfield applications, so they can't and don't learn the long-term consequences of their decisions. Not all of them are new programmers; they're just the epitome of 5 times 2 years of experience.

I wasn't around for it, but I bet you'd say the same thing about Bourne Shell when it was new.

The tools will stabilize as they mature, they're still very new comparatively.

Or... as they "mature", people will start talking about how kubernetes is for old crusty engineers over 27 who just don't want to learn anything new, why don't they just learn ganymandias which is much more modern.

Just to be clear I am not insinuating that new tools and their developers are inferior. But backward compatibility is also very important.

When I am using the 1.9 stable release of a product, I'd expect more stability.

trigger warning: bitter jaded ops person working in a real company

"[...] users are expected to stay “reasonably up-to-date with versions of Kubernetes they use in production.” [...] the upstream Kubernetes community only aims to support up to three “minor” Kubernetes version at a time. [...] if you deployed a Kubernetes 1.6 soon after it came out last March, you were expected to upgrade to 1.7 within roughly nine months or by the time 1.9 shipped in mid-December."
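To make the quoted window concrete, here is a minimal sketch of the "three minor versions" rule. It assumes the window means "the latest minor release and the two before it" and that versions look like `1.9`; the function names are made up for illustration and are not part of any Kubernetes tooling.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Split a version string such as '1.9' or '1.9.2' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_supported(deployed: str, latest: str, window: int = 3) -> bool:
    """Return True if `deployed` is one of the newest `window` minor releases.

    Assumption: a different major version is simply treated as unsupported.
    """
    d_major, d_minor = parse_minor(deployed)
    l_major, l_minor = parse_minor(latest)
    if d_major != l_major:
        return False
    return 0 <= l_minor - d_minor < window


if __name__ == "__main__":
    for deployed in ("1.6", "1.7", "1.8", "1.9"):
        print(deployed, is_supported(deployed, latest="1.9"))
```

Under that reading, a 1.6 cluster is still inside the window while 1.8 is the newest release and falls out of it when 1.9 ships, which matches the timeline in the quote.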

Jesus christ this is so annoying.

Businesses don't have a couple hundred billion dollars sitting around to spend on engineers to look at release notes, compare changes, write new features, write new test cases, fix bugs, and push to prod, every 3 months, just to keep existing functionality for orchestrating their containers.

We have LTS because businesses (and individuals) don't want to have to do the above. They just want a reliable tool. They want the ability to say that if a bug is found in 3 years, it will be fixed, and they can just keep using the tool.

We don't give a crap about "Kubernetes’ domination of the distributed infrastructure world". We don't want to use Kubernetes. We just want an orchestration tool - commodified tooling. We want to stop caring about what we're running. We just want the fucking thing to work, and to not have to jump through hoops for it to work.

"Moving Kubernetes Workloads to New Clusters instead of Upgrading"

UGH. We only do this for bastardized unholy stupid shit like OpenStack. Not only is this not fun, it takes forever (you try moving 50 different clients off the service they've been using for three years), and you have to have duplicate resources. What the fuck is the point of cloud computing and containers and all this bullshit if I have to have double the infrastructure and juggle how it's all used just to upgrade some fucking software?!??!?!

"The Kubernetes-as-a-Service offerings, particularly Google Cloud’s Kubernetes Engine (GKE), are the well-polished bellwethers of what is currently the most stable and production-worthy version of Kubernetes."

Oh. We're supposed to pay Google to run it for us.

....I'm just going to use AWS.

I was going to reply to the general point, but I got interested on this:

> you try moving 50 different clients off the service they've been using for three years

This sounds like a major red flag.

Are they on-prem air gapped servers? Do you manage the infrastructure? Because if you do and they aren't, they are not supposed to see it. And if it is automated, what does it matter if there are 50 or 50 thousand clients?

The point of the cloud is that the customers don't have to care. And you also shouldn't care about individual servers. Or even clusters.

On K8s, it does a lot of really neat stuff that would keep me awake at night otherwise. Oh a K8s worker VM died in AWS. I don't care. There's a machine that's misbehaving. Kill it, another one will come, containers will spawn there. I don't care. Need more container instances, done. Need to deploy a new service? Push the new yaml to the server, done.

Upgrade twice a year and you'll be ok, unless your dev team insists on new features.

> they are not supposed to see it

There's a whole lot of things that are not supposed to happen in the tech world, that happen all the time. Most people do not run brand new stateless microservices in K8s. They run stateful apps on VMs in three year old badly configured OpenStack clusters.

For what it's worth, moving the client traffic itself is seamless and totally fine. But first you have to get them to stop hard-coding the names of region and zone specific load balancers into their apps. You switch them over... traffic falls over... and it's your fault. Hence, big moves are coordinated with clients, not to mention feature tested, and then tested by sending some real traffic to the new cluster, before you shift all the traffic. The cloud is not magic, and if you treat it that way, it will bite you.
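A rough sketch of the "send some real traffic to the new cluster first" step, in case it helps: route a fixed percentage of clients to the new cluster based on a hash of their ID, so the same client always lands on the same side. The names `pick_cluster`, `"old"` and `"new"` are hypothetical, and in practice this decision usually lives in a load balancer or DNS weighting rather than in application code.

```python
import hashlib


def pick_cluster(client_id: str, new_cluster_percent: int) -> str:
    """Deterministically assign a client to the 'old' or 'new' cluster.

    Roughly `new_cluster_percent` percent of client IDs map to 'new'.
    Hashing keeps the assignment stable across requests.
    """
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100  # bucket in 0..99
    return "new" if bucket < new_cluster_percent else "old"


if __name__ == "__main__":
    clients = [f"client-{i}" for i in range(50)]
    for percent in (0, 10, 50, 100):
        moved = sum(pick_cluster(c, percent) == "new" for c in clients)
        print(f"{percent}% target: {moved} of {len(clients)} clients on new cluster")
```

Raising the percentage only ever moves more clients over; nobody who was already on the new cluster gets bounced back, which is what you want when each move has to be coordinated with the client.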

I also wouldn't be surprised if an external vendor will build LTS releases for a fee. That's a pretty common and successful model that would solve this problem easily.

Sounds like full employment theorem at work. I haven't kicked the tires on kubernetes yet but I don't really see what all the fuss is about. I liked AMPLab and Mesos but I guess that doesn't have the branding power of big G.

100% the case. k8s is not a terrible thing and there are good uses for it, but the vast majority of people who are using it don't understand what it's doing, and don't understand that their application is not equipped to handle that type of execution model.

Just yesterday I had someone tell me "Kubernetes was like magic and it made everything easy". This is absolutely not the thing a serious/honest Kubernetes user would say. Kubernetes is a big heavy thing, and it's a lot of trouble to maintain it; generally, far more trouble than using sane configuration management on VMs would be (you need sane configuration management in k8s too, so you aren't swapping one complexity for another). We're reaching "MongoDB is web scale" levels of hysteria around Kubernetes.

Example of the insanity: my account on HN has been censured for the "tedious" nature in which I would assert that Kubernetes is not a good platform for databases.

There is a serious, concerted effort by Google to put k8s at the forefront, and they are not playing games with it.

I'd also say "kubernetes is magic". It's an insanity to understand and set up once, but once you've done that, and keep up with changes, it makes deploying services much easier.

For example, in the past I first ran several services directly on bare metal, but Ubuntu's LTS versions didn't have the packages yet (and libraries were incompatible) for one service I needed, and while debian had these, it didn't have them for a second service I needed.

So I went with containerization, and properly split these things up, so I could run all services I needed together. But I had two servers, and manually migrating them whenever I needed to restart one server for updates became painful. And it made automated updates impossible, so every ~3 days a crontab would run apt to update, and if a reboot was necessary, email me. So every 3 days I had to manually migrate containers over, and restart. Very annoying. Plus all the ingress rules complications.

Then I started using kubernetes, and despite needing 8 months to actually understand it, and work with it, it actually is magic. I don't have to manually check every 3 days if apt fucked again with the packages. I don't have to check if I need to reboot, and migrate containers. Thanks to k8s and container linux everything migrates automatically and reboots. I don't have to worry about all this stuff anymore, it just works. Yes, I have to worry about k8s updates, and keep my configs up to date — but that's another amazing thing: recently I had to rebuild my cluster from scratch, and thanks to container linux's cloud-init and kubernetes I could simply restart all servers with the new config, and they'd automatically recreate the cluster, and load the storage data back from backups. In 25 minutes the entire cluster was rebuilt from scratch and everything was back up.

Yes, kubernetes is the opposite of easy for setting it up, but that's a constant cost. All additional services you run on it are basically free in terms of maintenance cost.

> There is a serious, concerted effort by Google to put k8s at the forefront, and they are not playing games with it.

Well no, and why would they? Containerisation is one way to crowbar workloads out of VMs and AWS. Since they're coming from behind, their best strategy is to deny everyone else any oxygen by creating an opensource winner. And it worked: Amazon have added EKS to ECS, Azure added AKS to ACI.

Google does a lot of good, but they don't sink millions upon millions of dollars into things just for the hell of it.

In general they have been pretty aloof at product marketing and development. Microsoft came in with a weaker product early on and has really been cleaning up with Azure going head to head with Amazon and GCE.

G as a whole has always been pretty much the opposite of Apple in terms of fit and finish, and in the past it was no mongo in terms of dev swoon. But I agree with cookiecaper, they have somehow cracked the code for mongo level of installing meme based software architecture on the masses. I would _love_ to understand how that works, for selfish entrepreneurial reasons :)

I agree that Google is coming from behind in enterprise sales. AWS have the long headstart and Microsoft have an existing, massive sales org.

Cloud computing fits almost none of their DNA. It requires intensive sales rather than automation. It's a volume business with thin margins (rather than a network-effect business with fat margins).

But it's also a business in which they have the best technology out of the three and the first plausible alternative revenue stream they've ever found. I don't think they are going to stop elbowing their way into this, and neither are Microsoft.

The difference is, Google is the only company that is both trying to sell cloud services, and giving their product away to their competitors.

The company I work for uses multiple cloud providers, but does not use GCE. They don't even think about it. I think the reason is a lack of confidence, along with other business and technical reasons. Google just isn't a serious company when it comes to supporting large businesses. Like it or not, that old adage about never getting fired for buying IBM is still true.

Kubernetes isn't the product, though. GCP is. Google doesn't care what software you're running, as long as you run it with them.

I work for Pivotal, which with IBM wrote a container scheduler (Diego) of similar vintage.

Nobody has the branding power Google has amongst developers.

I think this is a symptom of the "release often" philosophy. With yearly or longer releases you could actually keep up and read the release notes. With stuff being released several times a year it's too much work to keep up unless you are deeply into it at the moment.

I notice the same with my Android apps. I used to read release notes of new versions but now I have it on auto update and am sometimes surprised that an app I have been using all the time has completely changed and I don't know how to use it anymore.

No, it is the symptom of a high velocity project. If Kubernetes were to have yearly releases, the list of changes would be four times as long and the upgrade path would be a major leap instead of four minor steps.

I think a major leap is much easier to handle and to plan for.

I completely disagree. The complexity and risk of a change goes up with the square of the size of the change. Having moved from upgrading when forced to (almost) continuous upgrading, the number of moving parts in any given change is small and the frequency means we become skilled at rolling out changes safely.
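To put numbers on that, here is a toy calculation. It takes the "risk grows with the square of the change size" claim at face value (that is a rule of thumb, not a measured law) and uses "one quarterly release" as the unit of change size.

```python
def relative_risk(step_sizes):
    """Total risk of an upgrade plan, assuming risk per step = size squared."""
    return sum(size ** 2 for size in step_sizes)


# One yearly leap covering four quarters of changes at once.
one_leap = relative_risk([4])

# Four quarterly upgrades, each covering one quarter of changes.
four_steps = relative_risk([1, 1, 1, 1])

print(one_leap, four_steps, one_leap / four_steps)
```

Under that assumption the single yearly leap carries four times the total risk of the four small steps, even though the same changes get applied either way.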

That's fine and dandy if you can afford to put up the resources to keep your deployment up to date. For others who are more constrained, an LTS release can be the difference between using the project and not.

Kubernetes’ governance is becoming like OpenStack's and (I know this is controversial) I hate OpenStack, especially because it tried so hard to be “AWS” compatible and its APIs are so awkward to use.

Cloudfoundry is better in terms of governance and project’s direction. Many of the main developers work full time at Pivotal. But it is hard to run your own CF without significant investment in things like access management and “painless” upgrades (etcd is a pain in the entire CF stack in my experience). Though I have to admit the project has been moving in the right direction over the past year or so.

Kubernetes/CNCF governance is completely different from OpenStack's. There's a reason every major cloud provider is involved in CNCF versus OpenStack. You can see all stats for CNCF projects here, e.g., https://k8s.devstats.cncf.io/dashboard/db/contributing-compa...

CFF was set up in a completely different way, giving Pivotal a lot of control in the beginning by allowing related entities to have votes and then relinquishing that over time. It leads to a more single-vendor-controlled ecosystem IMHO.

There's pros and cons to both approaches.

Disclosure: I help run CNCF.

Pivotal's strong influence over CFF decisions comes from the fact that votes are assigned according to how many fulltime engineers you devote to Foundation work. Pivotal has more fulltime engineers on CFF projects than anyone else.

There are, as you note, pros and cons. It made sense at the time, as I think there were concerns about vendor politics getting in the way of developing the thing.

I guess one of these days we'll smash the CFF and CNCF together and stagger out with some kind of bicameral system, just for laughs.

Disclosure: I work for Pivotal. Trolling is just my hobby.

> Cloudfoundry is better in terms of governance and project’s direction.

The Cloud Foundry Foundation rules are very different from CNCF's and intentionally take DNA from Pivotal Labs. It has strengths and weaknesses.

> etcd is a pain in the entire CF stack in my experience

It's either been removed from CFAR, or is close to it, I lost track. A lot of time was spent before it was decided that etcd doesn't play nicely with BOSH.

It's come back into Cloud Foundry land via CFCR, as a Kubernetes dependency. Very nostalgic.

Combining my brief time trying to put together a proof of concept with Kubernetes and reading articles like this, I'm so glad I chose HashiCorp's Nomad. It's simpler to configure, more versatile (shell scripts, executables, and Docker containers), and has a decent third-party UI, HashiUI. With Consul, configuration is dead simple.

Dr Dobb's articles below reflect a somewhat similar feeling

Just Let Me Code http://www.drdobbs.com/tools/just-let-me-code/240168735

Getting Back To Coding http://www.drdobbs.com/architecture-and-design/getting-back-...
