
The Rise of Platform Engineering - godelmachine
https://softwareengineeringdaily.com/2020/02/13/setting-the-stage-for-platform-engineering/
======
abhayb
All infra teams eventually become platforms. All product teams eventually
become experiences. When viewed negatively this is called scope creep. I don't
know what it's called when viewed positively but I expect the word "holistic"
to be used unironically.

Org charts that ship a platform are stable by default because everybody in a team
or group is doing approximately the same things. Growth is less uncomfortable,
advancement feels more objective, and individual developers are relatively
interchangeable.

But what if a company needs to change? Now the stable org chart resists that
change by rejecting requests from client teams that are responsible for a new
set of objectives. This recurses. One layer of platform can simultaneously be
moving too slowly for the layer above and too quickly for the one below. Shear
forces tear it apart and the organization finds itself with n (3 < n < 6)
fewer platform engineers.

~~~
user5994461
In my experience having a formal platform structure helps tremendously to roll
out new solutions. When the platform officially provides something, everybody in
the company can make use of it right away. Unlike the pet project that's being
pushed by a random manager.

Breakthroughs can be pushed by rolling out a second library/framework/platform.
Like AWS ELB and ALB. Then developers can adopt the latter if it's so much
greater, but they won't because it's 90% of the same and who wants to work on
migrations?

Large organizations are fundamentally split apart. First part of the org wants
A and B. Second part wants B and C. The developer team on the next floor is rolling
their own thing to do C and D. All while features A and C are incompatible so
it's impossible to satisfy everyone. There is no solution to resolve internal
conflicts (except maybe reducing a large company to 20% of its current
workforce).

------
mr_tristan
My mind went immediately to a Netflix presentation about "the role of central
teams in devops" a couple of years ago:
[https://www.youtube.com/watch?v=xA1qPQg3WTc](https://www.youtube.com/watch?v=xA1qPQg3WTc)

The big thing I remember about their approach that surprised everyone: there
was no mandate to use the platform/central team's tools. It made me chuckle
how many times the presenter was grilled by everyone about that. It was like
some of the audience straight up thought he was lying about that.

But basically, if you have a platform team, and a mandate to use that team's
tools, well, the other teams aren't really "customers", in the sense you can
leverage choice as a signal. So you have to make up for that. In my
experience, you need very good management, and constant, multifaceted
communication. Which... might work, might not.

It's probably best to delay any kind of centralized/platform work until you
have a _very_ clear pain that defines a very clear set of roles and
requirements. Unless everyone says "Oh shit this is _amazing_ " ... just say
no.

~~~
closeparen
Mandate makes more sense the lower in the stack you go.

“This is _the_ cluster scheduler you must use to run code in _the_
datacenters” Ok.

“Since part of your feature communicates with end users, it must be
implemented in this visual programming language we created for workflows that
interact with end users.” Not fine. This is how you end up with abominations
of workarounds on workarounds. Let me use my regular tools and give me a damn
API. Your decision to invent a shitty half baked programming environment is
not my problem. When you try to make it my problem you are only creating more
damage.

~~~
user5994461
It doesn't make sense at the datacenter level either.

How are ops/developers supposed to use Ansible or Salt or Docker or
Kubernetes, when the only available way to access/deploy to the servers
is the one centrally approved tool?

------
staysaasy
IMO one of the keys to a platform engineering group's success (alluded to in
this post) is having a mindset in which other engineering teams are the
customer. Once you flip into this mode it becomes a lot more clear how your
platform is really a product, and the platform engineering team fits much more
seamlessly into the overall organization.

~~~
k__
Problem is, the other teams can't go somewhere else if the service sucks.

~~~
zaat
Yes they can. There's a name for that: shadow IT. All of a sudden you realize
that the annoying engineers have stopped annoying you not because they became
better people, but because they got the credit card of some VP and got their
own platform on AWS.

~~~
user5994461
Is it really "shadow IT" if they got approval from the VP with credit card
attached? Seems like they're officially accredited to do the work.

~~~
zaat
Sure is, the VP is VP of Marketing and he puts the expenses under media
research or something, and they put the whole company CRM data into Mongo or
Elastic on AWS with no auth, open to the whole internet to see. And sadly
that's more daily news than fiction.

------
sl1ck731
The author describes Terraform and CloudFormation as "imperative". This
doesn't seem correct to me, although you _can_ force a sort of imperative flow
by manually defining your dependencies in a specific order. I have only a
little experience with Ansible but I would say that is the only major
imperative-ish IaC (at least the way I used it) aside from bash scripts or
working with SDKs directly.
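
To make the distinction concrete, here is a toy sketch in Python. It is not how Terraform or CloudFormation work internally, and the resource names are made up for illustration. In the imperative version the author writes the steps in order; in the declarative version the author only states what depends on what and the tool derives the order.

```python
from graphlib import TopologicalSorter

# Hypothetical resources: each name maps to the set of resources it depends on.
resources = {
    "dns_record": {"instance"},
    "instance": {"subnet"},
    "subnet": {"vpc"},
    "vpc": set(),
}


def imperative_provision():
    """Imperative style: the author hard-codes the sequence of steps."""
    return ["vpc", "subnet", "instance", "dns_record"]


def declarative_provision(declared):
    """Declarative style: only dependencies are declared; the order is derived."""
    return list(TopologicalSorter(declared).static_order())


order = declarative_provision(resources)
print(order)
```

Declaring every dependency explicitly, as in this chain, leaves only one valid order, which is roughly the "imperative flow" you can force out of a declarative tool.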

~~~
halbritt
Terraform in and of itself is declarative, but it behaves in an imperative
sort of way with the various backends that it supports.

These shortcomings all manifest themselves in how state is managed. Terraform
state is declaratively described, and it may or may not match the state of the
backend. Once this state drift exists, it becomes difficult to correct.

This is my primary criticism of Terraform and one of the reasons I prefer
Kubernetes. I know it's an apples-to-oranges comparison, but in Kubernetes there
is both declarative configuration and active reconciliation. You have both
current state and desired state and a set of controllers seeking to make them
match. I'd love to see this implemented with Terraform.
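
A minimal sketch of what I mean by active reconciliation, in Python. This is a toy model, not Kubernetes controller code; the `reconcile` function and the resource names are assumptions made up for the example. A real controller would run this in a loop or on watch events.

```python
def reconcile(desired, current):
    """One reconciliation pass: change `current` in place to match `desired`.

    Returns the list of actions taken so the caller can see what drifted.
    """
    actions = []
    for name, spec in desired.items():
        if name not in current:
            actions.append(("create", name))
            current[name] = dict(spec)
        elif current[name] != spec:
            actions.append(("update", name))
            current[name] = dict(spec)
    for name in list(current):
        if name not in desired:
            actions.append(("delete", name))
            del current[name]
    return actions


# Hypothetical workloads.
desired = {"web": {"replicas": 3}, "cache": {"replicas": 1}}
current = {"web": {"replicas": 2}, "old-job": {"replicas": 1}}

first_pass = reconcile(desired, current)   # fixes everything that differs
second_pass = reconcile(desired, current)  # nothing left to do

current["web"] = {"replicas": 1}           # someone changes things out of band
drift_pass = reconcile(desired, current)   # the next pass corrects the drift
```

The point is that drift gets corrected on the next pass without anyone having to notice it first.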

~~~
mwarkentin
Terraform attempts to refresh its state from the source of truth (e.g. AWS APIs)
before planning. It’s not always possible, but often it should work just fine
even if you’ve modified a resource outside of Terraform.
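
Roughly, the idea looks like this toy Python model. It is only a sketch: `refresh` and `plan` here are made-up functions, not Terraform's API, and the `real` dict stands in for whatever the provider's API would report.

```python
def refresh(recorded, real):
    """Re-read every tracked resource from the source of truth.

    Resources that no longer exist in `real` drop out of the state.
    """
    return {name: dict(real[name]) for name in recorded if name in real}


def plan(config, state):
    """Compare the configuration against the state and list the changes needed."""
    changes = {}
    for name, want in config.items():
        have = state.get(name)
        if have is None:
            changes[name] = "create"
        elif have != want:
            changes[name] = "update"
    for name in state:
        if name not in config:
            changes[name] = "delete"
    return changes


# Hypothetical bucket: someone switched versioning off by hand.
config = {"bucket": {"versioning": True}}
recorded = {"bucket": {"versioning": True}}   # what the state file remembers
real = {"bucket": {"versioning": False}}      # what actually exists

stale_plan = plan(config, recorded)                  # misses the manual change
fresh_plan = plan(config, refresh(recorded, real))   # sees it and plans a fix
```

Without the refresh the plan is computed against a stale record and reports nothing to do; with it, the out-of-band change shows up as an update.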

------
sam_lowry_
I had a really bad experience in an organisation possessed by a platform team.
A small number of individuals were rolling change after change that impacted
hundreds of engineers, halving their productivity and grinding all development
to a halt once a month.

~~~
fizx
You know how we have that "libraries are better than frameworks" discussion
once a month? It sounds like your platform team was a framework.

Ideally, a platform team should give you reliable, self-service components to
build upon, like databases, caches, rate-limiters, api-gateways, etc.

If they're framework-ish, they're mandating you build your app in exactly
their preferred way in order to receive the benefits.
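
A small Python sketch of the difference, using a rate limiter as the component. Everything here (`RateLimiter`, `PlatformApp`, `MyApp`) is a hypothetical name invented for the example, not any real platform's API.

```python
import time


class RateLimiter:
    """Library-style component: your code decides when and where to call it."""

    def __init__(self, max_calls, per_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.per_seconds = per_seconds
        self.clock = clock
        self.calls = []

    def allow(self):
        now = self.clock()
        # Keep only the timestamps that are still inside the window.
        self.calls = [t for t in self.calls if now - t < self.per_seconds]
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False


class PlatformApp:
    """Framework-style: the platform owns the control flow and calls your code."""

    def __init__(self):
        self.limiter = RateLimiter(max_calls=2, per_seconds=60)

    def handle(self, request):
        raise NotImplementedError  # you must fit your app into this hook

    def run(self, requests):
        return [self.handle(r) if self.limiter.allow() else "429" for r in requests]


# Library usage: the app calls the component however it likes.
limiter = RateLimiter(max_calls=2, per_seconds=60)
library_results = [limiter.allow() for _ in range(3)]


# Framework usage: you only get the benefit by subclassing.
class MyApp(PlatformApp):
    def handle(self, request):
        return request.upper()


framework_results = MyApp().run(["a", "b", "c"])
```

In the first case the team can drop the limiter into whatever they already have; in the second, the rate limiting only comes along if the app is shaped the way the platform expects.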

~~~
k__
Ideally 99% of all companies shouldn't even have a platform team, but should buy a
platform from professionals.

~~~
user5994461
99% of companies should have a platform team, whose role is to make services
from AWS/Google/Azure available internally and easy to use.

~~~
k__
Yes, or that.

------
spaetzleesser
I sort of work on platform engineering in the form of a test automation
framework used by more than a hundred testers. In general it works fine but
you have to be ready to scale up your platform team at the same speed or even
higher than the users of the platform.

You also have to be very careful allowing the users to expand the platform to
their needs or the platform team will be a permanent bottleneck. I have seen
this in companies where the SAP people had a backlog of several years.

~~~
andendau
There's a good blog on scaling a platform team and how to keep up with growth.
[https://medium.com/adobetech/why-do-organizations-need-a-pla...](https://medium.com/adobetech/why-do-organizations-need-a-platform-team-910d79893e0a)

------
aogl
I actually wrote about this months ago at
[https://ao.gl/what-it-takes-to-be-a-platform-engineer-in-202...](https://ao.gl/what-it-takes-to-be-a-platform-engineer-in-2020/)
Interestingly, it also got quite a bit of traction on HN, which makes me think
many are moving in this direction.

~~~
user5994461
+1 on this. I started getting contacted on LinkedIn and finding job postings for
platform engineering last year. I have a strong feeling this could be the next
buzzword of the 2020s.

------
dutch3000
Platform engineering = Site reliability engineering? That’s how I see it.

~~~
halbritt
People view SRE in many different ways. If you were to go through the Google
SRE book, there's nothing explicit in there about building "platform" or doing
any infrastructure engineering.

It's a given that SREs build tooling, but most of the SRE-focused work, as
described in that book, is around improving the resilience of a given
application or product: addressing production readiness, defining SLOs, and
handling incident response.

There are platform specific SRE teams at Google, but there's not much
published about how they go about creating a platform.

The book "Seeking SRE" makes it clear that in most places, the notion of
"platform engineering" varies tremendously.

I don't know of any authors that have addressed this explicitly other than
Susan Fowler in "Production Ready Microservices" who writes:

    
    
      Another important part of microservice adoption is the 
      creation of a microservice ecosystem. Typically (or, at 
      least, hopefully), a company running a large monolithic 
      application will have a dedicated infrastructure 
      organization that is responsible for designing, building, 
      and maintaining the infrastructure that the application runs 
      on. When a monolith is split into microservices, the 
      responsibilities of the infrastructure organization for 
      providing a stable platform for microservices to be 
      developed and run on grows drastically in importance. The 
      infrastructure teams must provide microservice teams with 
      stable infrastructure that abstracts away the majority of 
      the complexity of the interactions between microservices.

~~~
SanderKnape
The book Team Topologies discusses Platform Engineering. They consider four
core team types, and the platform team is one of those. Some more information
about those types can be found here:
[https://teamtopologies.com/key-concepts-content/what-are-the...](https://teamtopologies.com/key-concepts-content/what-are-the-core-team-types-in-team-topologies)

I can definitely recommend this book for a (new) perspective on how platform
teams fit into organizations.

~~~
andendau
Great book, highly recommended.

