
Netflix VP of IT on the Future of Infrastructure - dataisfun
http://www.amplifypartners.com/interviews/netflix-vp-of-it-on-the-future-of-infrastructure/
======
teacup50
> _The notion that something needs to remain on-premise is really an Old World
> way of thinking and feels more like someone wanting control as opposed to
> there being a valid argument._

No, it's the business continuity way of thinking. Outsourcing commodities --
such as servers, virtual or otherwise -- is one thing.

Outsourcing your core operational tools, software, _and all your data_ is
another matter entirely. Preferring SaaS at a company large enough to afford
on-premise solutions is just nonsensical, and I expect it'll either blow up in
his face, or just create a never-ending tax on end users who are constantly
dealing with a mishmash of vendors, accounts, disappearing services, broken
software, and instability.

At scale, stability and continuity are worth more than the opex/capex costs of
internal IT.

~~~
qq66
Netflix has been on AWS for a while, and I've never once had any stability or
continuity problems with it. Whatever garbage is going on in the background,
they've done a good job of preventing it from becoming a never-ending tax on
end users.

~~~
barkingcat
They built Chaos Monkey: software agents that go through all their
infrastructure on AWS and randomly take things down. That includes network
jitter, computer hangs, removing drives, and taking down availability zones.

They do that to themselves so that when it actually happens, they are prepared
and the end user (almost) never sees it.
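
To make the idea concrete, here is a minimal sketch of random failure injection against an in-memory "fleet". It is not Netflix's actual tool: the names `chaos_round`, `pick_victims`, and the `web-N` instances are all made up for illustration, and flipping a boolean stands in for a real "terminate instance" call.

```python
import random


def pick_victims(instances, kill_probability, rng):
    """Return the instances that this chaos run would take down."""
    return [name for name in instances if rng.random() < kill_probability]


def chaos_round(fleet, kill_probability=0.2, seed=None):
    """Mark randomly chosen instances as down and return their names.

    `fleet` maps an instance name to True (healthy) or False (down).
    """
    rng = random.Random(seed)
    victims = pick_victims(sorted(fleet), kill_probability, rng)
    for name in victims:
        # Stand-in for a real termination call (hypothetical).
        fleet[name] = False
    return victims


# Hypothetical ten-node fleet; a real run would list live instances instead.
fleet = {f"web-{i}": True for i in range(10)}
downed = chaos_round(fleet, kill_probability=0.3, seed=42)
healthy = [name for name, up in fleet.items() if up]
print("terminated:", downed)
print("still healthy:", healthy)
```

The point of running something like this continuously is that the service has to keep working on whatever is left in `healthy`, so failure handling gets exercised all the time rather than only during a real outage.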

------
numlocked
The jargon and acronyms in this interview are intense. It's pretty clearly an
industry interview so it's my fault that I don't know the phrases, but I'm a
little surprised by how impenetrable it is to me (a software engineer who has
worked in large corp environments).

Anyway care to expand on some of the less Googleable acronyms?

\- MDM/MAM

\- NAC

\- EDW (synonymous with ETL?)

~~~
martinald
MDM/MAM = mobile device management/mobile application management (managing and
provisioning mobile devices and their applications, generally automatically)

NAC = network access control

EDW = enterprise data warehouse (archiving old information while preserving
access)

------
e12e
It'd be great if Netflix (or some other company) manages to do some heavy
lifting in creating a viable, modern, certificate-based authentication and
authorization stack, that's easier to deploy. Essentially an upgraded take on
kerberos (move off shared secrets, perhaps), AFS (I still don't know what a
viable way forward for secure, distributed, locally cacheable network
filesystem is -- maybe DAV+TLS+regular caching?). I suppose LDAP might be fine
as a user/principal/authorization database, but some distribution that uses
internal CA and demands TLS as default would be a good start.

The last "innovation" I'm aware of in this area, is skolelinux/edulinux work
with packaging samba/ldap/kerberos/lts in an easy(ier) to manage package for
Debian:

[https://wiki.debian.org/DebianEdu/Documentation/Wheezy/Archi...](https://wiki.debian.org/DebianEdu/Documentation/Wheezy/Architecture)

~~~
zobzu
kerberos is very good. everyone reinvents kerberos every month. it doesn't
have to be new to be "modern". Kerberos is still modern by today's standards,
in fact.

the problem is having tools that communicate with each other and an easy
setup.

yeah, SAML kinda sucks to use too.. and works like kerberos anyway. OpenID,
Hawk, etc - also in fact work exactly the same.

~~~
eropple
Seriously. At my day job we're investigating better ways to handle single sign
on and authentication and every road leads back to Kerberos.

~~~
e12e
I'm not saying Kerberos isn't good, and obviously it's going to be better than
any "new" system -- after all any new system hasn't seen any real-world
testing. All the fluff around the various (http-centric) SSO-solutions is
partly from wrapping them around SSL/X.509 -- just as IMNHO one of the
problems with setting up (a secure and easily maintained) kerberos deployment
isn't kerberos but LDAP.

As mentioned up-thread MS AD does a great job of enabling in-house CA and
management -- and it's mostly that I want. I want to use certs for auth most
places, and I want it easy! OpenSSH has shown that public key auth doesn't
have to be hard -- but also doesn't have a very compelling story around
managing access. The new cert-system might be an improvement -- but it
absolutely needs some infrastructure around it to be easy to deploy (and
verify).

~~~
zobzu
in fact openssh has support for full-blown certificates but it's also a little
more painful. what makes ssh easy to use is that it doesn't have any central
trust authority by default. you get a fingerprint and you trust it.

if it changes, it warns you.. but in most cases you're going to know why it
changed or just accept the change anyway (which is a problem when you admin
10000 servers of course as the warning might be a real issue)

central trust/revocation is still an issue everywhere to this day, i think.
both technically ("my client trusts this, but do i?") and from the usability
pov.
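
A rough sketch of the "get a fingerprint, trust it, warn if it changes" model described above. This is not how OpenSSH is implemented: the `KnownHosts` class, the host name, and the byte strings standing in for host keys are hypothetical, and a hex SHA-256 digest is used as a simplified fingerprint.

```python
import hashlib


class KnownHosts:
    """Trust-on-first-use store: remember the first key seen for each host."""

    def __init__(self):
        self._fingerprints = {}

    def check(self, host, public_key):
        """Return "new", "ok", or "changed" for the key presented by `host`."""
        fingerprint = hashlib.sha256(public_key).hexdigest()
        seen = self._fingerprints.get(host)
        if seen is None:
            # First contact: nothing to compare against, so just record it.
            self._fingerprints[host] = fingerprint
            return "new"
        return "ok" if seen == fingerprint else "changed"


hosts = KnownHosts()
print(hosts.check("db1.example.internal", b"key-A"))  # new
print(hosts.check("db1.example.internal", b"key-A"))  # ok
print(hosts.check("db1.example.internal", b"key-B"))  # changed
```

Note that nothing here can tell a legitimate re-key from an attacker: "changed" is all the client knows, which is the central trust/revocation gap when you have thousands of hosts.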

~~~
e12e
> openssh has support for full-blown certificates

Well, yes and no. Do you mean the new cert stuff that's in standard openssh?
Which has stuff like:

    The marker is optional, but if it is present then it must be one of
    “@cert-authority”, to indicate that the line contains a certification
    authority (CA) key, or “@revoked”, to indicate that the key contained on
    the line is revoked and must not ever be accepted.  Only one marker should
    be used on a key line.

While certainly simple, it doesn't strike me as very manageable.

Or did you mean the x509 patch?

[http://roumenpetrov.info/openssh/](http://roumenpetrov.info/openssh/)
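
For anyone who hasn't seen the marker format quoted above, here is a simplified sketch of reading such lines. It assumes the usual field order of optional marker, host patterns, key type, base64 key, optional comment; the function name `parse_known_hosts_line` is made up, the key strings are placeholders rather than real keys, and no key validation is attempted.

```python
MARKERS = {"@cert-authority", "@revoked"}


def parse_known_hosts_line(line):
    """Split one known_hosts-style line into its fields.

    Returns None for blank lines and comments.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    marker = None
    if fields[0].startswith("@"):
        marker = fields[0]
        if marker not in MARKERS:
            raise ValueError(f"unknown marker: {marker}")
        fields = fields[1:]
    if len(fields) < 3:
        raise ValueError("expected: hosts keytype key [comment]")
    hosts, keytype, key = fields[:3]
    return {
        "marker": marker,
        "hosts": hosts.split(","),
        "keytype": keytype,
        "key": key,
        "comment": " ".join(fields[3:]),
    }


# Placeholder key material, for illustration only.
ca_line = "@cert-authority *.example.com ssh-ed25519 AAAAplaceholderCA internal-ca"
revoked_line = "@revoked * ssh-rsa AAAAplaceholderOLD leaked key"
for entry in (ca_line, revoked_line):
    print(parse_known_hosts_line(entry))
```

Every trust decision being a line in a flat file that each client carries is what makes it simple, and also what makes distributing and revoking at scale the hard part.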

------
purephase
I thought the 2014 Technology Roadmap [1] was an interesting read. For an
organization as "young" as Netflix, I was surprised by the technology debts
that they've accumulated and the aggressive tone that they've set to
transition.

I think it's amazing the decisions that get made with explosive growth/hiring
that end-up on roadmaps that read similarly to organizations that have been
around much longer.

There's no criticism here. I think Netflix is an amazing company and it is
this sort of strategic vision (and the openness of both it and the organization
overall) that reminds me that we're all on this rocky ship together and it's
amazing that any of it works sometimes.

[1] [http://www.slideshare.net/mdkail/it-ops-2014-technology-road...](http://www.slideshare.net/mdkail/it-ops-2014-technology-roadmap)

------
obblekk
Forgive my ignorance. Is `IT` the same as `engineering` at other companies, or
is this something else?

~~~
bri3d
"IT" is too vaguely defined to mean much anymore, but based on this interview
I suspect it's internal infrastructure at Netflix. Stuff like staff PCs,
internal data warehouses, sales, finance, and marketing software support, WiFi
APs, routers, keeping the backoffice servers and network up, ensuring reliable
WAN and LAN connectivity so engineers can reach production securely, intrusion
detection and analysis, and so on.

Generally in smaller software companies I hear R+D and consumer-facing
applications referred to as "engineering" with external-facing infrastructure
(like the production datacenter) referred to as "operations," with "IT" being
reserved for this internal backoffice kind of stuff.

In other places, especially larger corporations, I've often heard everything
having to do with a computer lumped in as "IT."

~~~
xivzgrev
Yea I was surprised. When I saw the title I thought infrastructure meant
network infrastructure, which seems like a white-hot area of innovation given
how much internet traffic they consume. But no, this interview was on internal
IT.

------
cottonseed
One thing that stuck out to me:

> We are implementing “certificate-based authentication” instead of the
> standard username/password auth against Active Directory.

I wish we were all doing this. How long is it going to take to get a usable
certificate-based client/user authentication mechanism on the web?

edit: Also see e12e's comment.
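
On the server side, requiring a client certificate is already a small amount of configuration. A minimal sketch with Python's standard `ssl` module follows; the interview does not say how Netflix does it, and the helper name `make_server_context` and the idea of passing an internal CA bundle are assumptions for illustration.

```python
import ssl


def make_server_context(ca_file=None):
    """Build a TLS server context that rejects clients without a valid cert.

    `ca_file` would be the PEM bundle of the CA that signs user certificates
    (hypothetical here; None means no CA is loaded).
    """
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH, cafile=ca_file)
    # Make the handshake fail unless the client presents a certificate
    # that chains to a loaded CA.
    context.verify_mode = ssl.CERT_REQUIRED
    return context


ctx = make_server_context()
print(ctx.verify_mode)
```

A real deployment would also call `load_cert_chain` with the server's own certificate and key. The configuration is the easy part; issuing, installing, and revoking the user certificates is where the usability problem lives.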

------
qthrul
TLDR: The approach we take for IT works (for us at this point in time in the
scope defined as IT by me and/or our internal customers).

Netflix talks generally can be fascinating and inspiring. However, when
considering IT it's also important to consider the charter and challenges of
Netflix IT.

i.e. it's no more valid or invalid than the talks of how IT is delivered in
so-called build vs. broker models in other companies in other industries
[http://dilbert.com/strips/comic/2013-07-05/](http://dilbert.com/strips/comic/2013-07-05/)

------
zobzu
Reading the slides makes me think this is full of nothing :| How is 802.11ac
speed making things "more cloud"? Because you get slightly more bandwidth
-maybe- if you have a new laptop and also you don't have everyone using it? I
don't get it.

Requiring VPN everywhere, how is that cloudy?

Finally, using stuff like AWS is nice, but unless they have a specific
contract (which they may since they advertise them a lot), it's a LOT more
expensive when you start having a lot of processing (ie big companies like
netflix)

~~~
cwp
Well, if anybody could get a good deal from AWS, it's Netflix.

------
vespaceballs6
I really wanted to share this article with my friends, but it was so filled
with buzzwords that even my dev friends wouldn't get much out of it.

Keep in mind I'm an idiot, and I have idiot friends.

~~~
dataisfun
haha. Well, I doubt you're an idiot. Buzzwords serve a purpose insofar as
they're a good shorthand for defining categories of product offerings. E.g.,
ETL, Data Warehouse, Mobile Device Management, etc. all have pretty well
understood parameters within which the vendors, buyers, analysts, etc.
operate.

------
nessup
Why was there no discussion of the ethics of the Comcast deal?

~~~
cdcarter
Because this was a conversation about internal infrastructures?

------
welder
> zero-trust network architecture

This makes me think of [http://meldium.com](http://meldium.com)

