Having built a bunch of these, I think the article more or less skips past the hard stuff. When you build an internal PaaS you aren't really building the k8s part; you're building and prescribing the entire developer workflow for the team. If you do it well, self-service becomes super easy, but the opinions on how to do everything are baked in. For example, you get opinionated versions of the build pipeline, SPA hosting, deployment models, YAML templating (or some other config language), dependency trees, monitoring and alerting, identity, secrets management, etc. All of those things end up baked into the system if it's going to be successful. It's a tall order, and I'd love to read more about internal PaaS failures than successes; that's where the learning is going to be.
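To make "baked in" concrete, here's a purely hypothetical sketch (Python, every name and default invented for illustration) of the kind of service descriptor these platforms end up exposing; each default is one of those opinions:

    # Hypothetical developer-facing descriptor for an internal PaaS.
    # Every default value is an opinion the platform has made for you.
    from dataclasses import dataclass, field

    @dataclass
    class ServiceSpec:
        name: str
        image: str                        # produced by the platform's build pipeline
        port: int = 8080                  # opinion: services speak HTTP on 8080
        replicas: int = 2                 # opinion: minimum redundancy
        deploy_strategy: str = "rolling"  # opinion: one blessed deployment model
        metrics_path: str = "/metrics"    # opinion: Prometheus-style scraping
        secret_names: list = field(default_factory=list)  # values live in the platform's secret store

    # A product team only writes the first two fields; everything else is prescribed.
    spec = ServiceSpec(name="checkout", image="registry.internal/checkout:1.4.2")
    print(spec)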
I agree that the main challenge is getting the developer workflows right, and being prescriptive here often shouldn't be a problem from the developers' perspective.
And yes, it would be great to hear more about failures in general, but naturally people are less willing to share those stories, I guess.
This seems relatively well written, but I consider the cost analysis worse than a dead-fish handshake. It's just plain facile to explain in one sentence that it took two years of development to bring an internal tool into service, and then in another to suggest that it is cost-effective without any rational assignment of figures, and without any comparison to the same environment without the tool.
I see your point. The problem with exact numbers is that they are rarely (publicly) available: on one side there is the cost of building and running the platform, and on the other there is potentially lost productivity, which is generally hard to measure. So, for the 2-year development example, I could only rely on the KubeCon talk, and it only mentions costs briefly (saying that it was cheaper for dev teams to use the platform).
Another problem is that it can be very different depending on your situation: If you are currently working with local clusters that are free to use, the direct cost benefits will be much lower than if you use individual clusters for each developer, which are quite expensive even for small teams.
So I just wanted to give a rough overview of what drives cost and where you could potentially save money.
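Just to show the shape of the comparison (every number below is made up; plug in your own), it's roughly:

    # Back-of-the-envelope comparison with invented numbers; replace every
    # figure with your own. The point is only where the comparison hinges.
    developers            = 40
    cluster_per_dev_month = 220.0        # hypothetical: one small managed dev cluster per developer
    platform_infra_month  = 3000.0       # hypothetical: shared multi-tenant dev clusters
    platform_team_month   = 2 * 15000.0  # hypothetical: two engineers running the platform

    per_dev_clusters  = developers * cluster_per_dev_month
    internal_platform = platform_infra_month + platform_team_month

    print(f"per-developer clusters: {per_dev_clusters:,.0f} / month")
    print(f"internal platform:      {internal_platform:,.0f} / month")
    # With these inputs the platform only wins once (regained) developer
    # productivity is priced in, and that is exactly the number that's hard to get.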
People who don't write code should not dictate what the dev experience in a company should be. Kludging up some YAML that you run directly in Prod from your laptop is not writing code, I'm sorry to say. The problem is that SREs are tasked with solving these problems, and they have horrible taste and sensibilities around dev experience because they are not devs.
All these remote dev Kubernetes solutions sound good in theory, but suck royally in practice. Remote debugging just isn't as good and the idea that I have to be online to get any work done is such a step back that it's baffling that anyone could even propose this.
Most developers just want to work on tickets and other non-infra-related tasks.
For the majority of those contributors it's really helpful to give them a happy path. This happy path also makes support much easier.
This happy path comes out as a platform and a series of tools on top of the k8s layer.
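As a rough sketch of what such a tool can look like (Python, everything here invented for illustration rather than any real platform's API): the developer supplies a name, image, and port, and the tool emits a full Deployment manifest with the platform's defaults filled in.

    # Hypothetical "happy path" generator: a tiny spec in, a full manifest out,
    # with the platform's monitoring annotations and resource limits baked in.
    import json

    def render_deployment(name, image, port):
        return {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": name, "labels": {"app": name}},
            "spec": {
                "replicas": 2,  # platform default
                "selector": {"matchLabels": {"app": name}},
                "template": {
                    "metadata": {
                        "labels": {"app": name},
                        "annotations": {
                            "prometheus.io/scrape": "true",   # monitoring baked in
                            "prometheus.io/port": str(port),
                        },
                    },
                    "spec": {
                        "containers": [{
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            "resources": {  # platform-enforced defaults
                                "requests": {"cpu": "100m", "memory": "128Mi"},
                                "limits": {"cpu": "500m", "memory": "512Mi"},
                            },
                        }],
                    },
                },
            },
        }

    print(json.dumps(render_deployment("billing", "registry.internal/billing:2.0.1", 8080), indent=2))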
I love working on infra automation, and I love enabling other teams. When I did nothing but dev work, the feedback loop for when I helped someone could be months. As an SRE I help dozens of people every week with near real-time feedback.
I feel like this completely misses the security and compliance reasons: it may in fact be easier to build a Kubernetes platform internally than to do all the work necessary to safely host confidential/HIPAA/mission-critical information in the public cloud.
And of course some organisations do both: public cloud for new stuff, which can be secure from day 1, and internal platforms for the keys to the kingdom or legacy stuff.
Considering how many security controls my FISMA Moderate-classed SaaS inherits from my cloud vendor, I very much doubt it's easier to build any platform internally, if we're only considering regulatory compliance.
There's paperwork compliance, and then there's compliance where the performance of security and risk controls is actually validated.
It may be more difficult if "we signed a contract, trust the compliance report" is not an acceptable answer for a particular risk-management audit or regulator.
If we're in "we can't rely on the 3PAO's assessment or the JAB's (or DISA's) review of this cloud vendor" territory, then we're probably dealing with workloads far more sensitive than FISMA High or Secret, in which case it absolutely makes sense to DIY.
At first I thought Kubernetes was massive overkill. However, as I've learned more about it, I've become a big fan of having a local minikube cluster for dev, jsonnet for template wrangling, and being able to ship the same built images to stage and prod.
Once you get over the learning curve, it makes a ton of sense why k8s is growing like wildfire.
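The jsonnet part is basically one base plus small per-environment patches; here's a rough Python stand-in for that pattern (values invented) that shows why the image reference stays identical from stage to prod:

    # Rough Python stand-in for the jsonnet base-plus-overrides pattern.
    # One base manifest, tiny per-environment patches, same image everywhere.
    import copy, json

    BASE = {
        "image": "registry.internal/web@sha256:abc123",  # built once, promoted as-is
        "replicas": 1,
        "env": {"LOG_LEVEL": "debug"},
    }

    OVERRIDES = {
        "stage": {"replicas": 2},
        "prod":  {"replicas": 6, "env": {"LOG_LEVEL": "info"}},
    }

    def render(env_name):
        out = copy.deepcopy(BASE)
        for key, value in OVERRIDES[env_name].items():
            out[key] = {**out[key], **value} if isinstance(value, dict) else value
        return out

    for env_name in ("stage", "prod"):
        print(env_name, json.dumps(render(env_name)))
    # Only replicas and config differ; the image digest is byte-for-byte the same.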
Because it works? I did that for our business unit to fully leverage our internal network infrastructure (SDN), so I could have pods as first-class citizens working alongside components deployed on bare metal, without all that VXLAN/Calico fuss; and did some other enhancements, all in less than 1000 lines of Python and Go.
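For anyone wondering what skipping the overlay boils down to in the simplest case: each node's pod CIDR gets a plain route via that node's underlay address, so the fabric carries pod traffic natively. A minimal sketch (node data invented; a real setup would learn it from the Kubernetes API or announce routes through the SDN/BGP rather than running a static script):

    # Minimal sketch of overlay-free pod networking: program a plain route to
    # each remote node's pod CIDR via that node's underlay IP, so no VXLAN
    # encapsulation is needed. Node data is invented for illustration.
    import subprocess

    NODES = [
        {"name": "node-a", "node_ip": "10.0.0.11", "pod_cidr": "10.244.1.0/24"},
        {"name": "node-b", "node_ip": "10.0.0.12", "pod_cidr": "10.244.2.0/24"},
    ]

    def install_routes(local_node):
        for node in NODES:
            if node["name"] == local_node:
                continue  # the local pod CIDR is reachable via the node's own bridge
            subprocess.run(
                ["ip", "route", "replace", node["pod_cidr"], "via", node["node_ip"]],
                check=True,
            )

    if __name__ == "__main__":
        install_routes("node-a")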