Netflix: Container Scheduling, Execution and Integration with AWS (2016) [video] (youtube.com)
44 points by fullung on June 10, 2017 | 19 comments

I get scared every time a big company releases their framework, fearing that it will become the new standard to build even the smallest of websites.

I've used React extensively and you're almost forced to use it if you want to be hired anywhere (I'm talking front-end development).

The trend nowadays seems to be to adopt any framework that huge companies use, thinking that it'll be good for you. Honestly, do we really need all the complexity of what the biggest social network in the world uses to build a project 1/1000 the size?

I'm sticking to PHP/MySQL because I'm productive with them. I build small apps used by 10-20 employees at the most, so I don't need NoSQL to scale horizontally to millions of users, and I don't really need Node since it doesn't matter whether it's real time. PHP is also found on every hosting platform, including the one all my clients have been using for years. I currently use React, but all I do is CRUD operations and some visual effects. I'm thinking about moving to web components to lose Webpack, Redux, Babel, and a 200MB installation just to save 100-200 orders/week to a database.

Personally, I really dislike PHP, but I agree with your sentiment. You do what you have to do to solve your customer's problems, using tools you know will work.

At work I have written a couple of CGI scripts in Perl, because that was the easiest, fastest way to get a working solution to a relatively small problem. I know CGI is totally '90s, and Perl has not been hip for at least a decade, but damn it, it gets the job done quickly without fuss.

I kind of suspect that with PHP the situation is comparable to Perl - you have a well-tuned, mature interpreter, a rich library ecosystem for almost any task you can think of, and a community of experienced developers.


PHP isn't perfect at all, with all its inconsistencies and counterintuitive constructs, but it works great. There are lots of libraries you can install with Composer, and most people know at least some PHP.

On top of that, unless WordPress is rewritten in a new language (which won't happen), PHP won't go anywhere.

Perl is fine, too. We solve problems for clients who pay us to get the job done, not to brag about how we use the latest, coolest technology.

Why should they release it?

This site was mostly created as a joke after I watched a couple of @aspyker's great presentations about Titus and realized that we have a huge amount of exciting work to do to get our own container platform on AWS to the same level.

Netflix should do what makes sense for them. I understand that doing a good job of releasing a very complex piece of software as open source is hard and takes a lot of effort.

Could be a more useful site if it had been on a domain like http://www.isitopensourceyet.org/netflix/titus or similar. (Note: that domain does not exist at the time of writing.)

I'll happily create a redirect once you build that for us. :)

I'm curious, why are you rolling your own?

Not rolling our own exactly: running on ECS now and busy evaluating Kubernetes. These are still quite blunt instruments though.

At some point, as a company grows, it's not just about running some containers. You want batch jobs, complex placement constraints, scheduled tasks, etc., etc.

If you haven't watched through https://youtu.be/4OLlKGT7aVQ, I can recommend it. aspyker does a great job of enumerating many of their complex use-cases.

The Titus team at Netflix is building this on top of EC2/Mesos and ECS.

The rest of us still need to do a bit of work to find/adapt/build our own.

> The rest of us still need to do a bit of work to find/adapt/build our own.

I usually recommend people just take an existing PaaS and run with it. I'm wildly biased in this regard because I work on one as my day job.

Also because rolling your own is hard. Really really hard. We have literally hundreds of engineers in dozens of teams, and that doesn't count the engineers our partners have assigned. We've been cranking for years. It's just a lot of hard engineering for stuff you don't see until it arrives, and arrives unpleasantly.

Kubernetes makes it a lot easier, especially if you need lots of flexibility, but it still leaves you to assemble a lot of parts and management is left very much as an exercise for the reader.

If you build on Kubernetes, may I recommend Kubo[0] to manage it? We built it with Google, but because it's based on BOSH you can deploy to AWS, GCP, Azure, OpenStack, vSphere and even bare metal with RackHD. BOSH by itself is a massive force multiplier for operations.

[0] https://github.com/pivotal-cf-experimental/kubo-deployment

Note: I'm the one in the video, an original engineer on Titus, and now the product/hiring manager of the Titus team at Netflix. Also, I do not actively monitor Hacker News, so it is unlikely that I'll follow up on this conversation (@aspyker on Twitter is an easier way to reach me).

Is Titus Open Source Yet? No.


Reason #1: When we created Titus, we married the back end of a container execution engine (Titan) with the front end of a scheduling system for stream processing (Mantis). We had a goal at the time to stabilize existing usage at Netflix. We didn't have a goal of removing all code from both sides that wasn't needed after the marriage. We wanted to clean this up before releasing in open source, as we would otherwise have to spend more time telling people what to ignore in the codebase that is no longer used in Titus. We also didn't have time to create proper interfaces that separate Titus from other important operational systems at Netflix. We have recently been re-implementing the engine to have clean interfaces with minimal code.

Reason #2: When Titus was created it had a user API that grew during initial usage in batch applications. Given the number of clients using this API, it was hard to change it without impacting users. Compounding this, we added service application support without thinking much about making the API well designed. We have recently been working on a new API that was designed to handle both use cases. This API is currently being adjusted based on internal feedback.

Reason #3: The space of container management platforms is truly a crowded space. There are many other great options out there. When we eventually open source Titus, we hope it is clear why Titus is unique in this space specifically for "all-in" Amazon EC2 customers, for NetflixOSS cloud platform adopters (those who use Spinnaker, Eureka, etc.), and for supporting the level of complexity we do (VPC IP per container, GPUs, complex fleet and capacity management, etc.). Finally, we hope it shows how battle-hardened Titus is in production at scale. We realize that even with these differences, we'll spend a considerable amount of time justifying these points to those who look at the space more generally. We would prefer to not spend any time "marketing" our differences.

Reason #4: Our main focus at Netflix is to serve Netflix developers by providing the easiest-to-use and most reliable service. Open source is something that is valuable to us with regards to potential external contributions as well as hiring and retention of the best engineering talent in the world. This value, while significant, isn't as important as providing our internal Netflix value. As a part-time responsibility I help with open source at Netflix. Therefore, I am also aware that doing open source well (like NetflixOSS Spinnaker as a shining example) requires more investment from teams than posting the code to GitHub. Our team is small today and we do not believe we can make Titus open source as great as we'd want it to be. That said, we're growing (did I mention I'm the hiring manager - https://jobs.netflix.com/jobs/862432) every day and can see the value of changing this equation.

I am truly honored that someone would create a whole website to ask for Titus. https://www.istitusopensourceyet.com/ made me laugh last night and decide to write this response today. We appreciate your excitement for our container management technology and Netflix open source. As we continue to evaluate this space, we'd love to find other AWS users who are also using the Netflix cloud platform and Spinnaker to consider partnership.

Seems there are plans in the works via clouddriver: https://github.com/spinnaker/clouddriver/search?utf8=%E2%9C%...

I'm sure they will eventually, but it probably takes a large amount of effort to prep and polish something for OSS consumption. Thankful for all the Netflix OSS tools that are already in the wild!

There are other container schedulers that run on AWS, after all: Mesos, Kubernetes, Diego, Nomad and probably a dozen others.

Several of these are built into platforms: Mesos in DCOS; Kubernetes appears in Deis, Rancher and OpenShift; Diego appears in Cloud Foundry (for which it was created to replace a previous scheduler).

Disclosure: I work for Pivotal, we contribute to Cloud Foundry and our Spring folk do a bunch of work around Netflix OSS.

This thread has some of the motivation for why Titus is exciting in my opinion: https://news.ycombinator.com/item?id=14526452

Kubernetes, Mesos/DC/OS and the various other options are all valid solutions if you need to run in many diverse environments. Titus runs (ran?) on top of Mesos before they built an ECS integration.

That being said, there's interesting benefits to the approach of not trying to abstract the IaaS and deeply integrating with one or two clouds instead. You get to leverage much more of the cloud(s) you've decided to adopt if you don't immediately try to abstract them away.

> That being said, there's interesting benefits to the approach of not trying to abstract the IaaS and deeply integrating with one or two clouds instead.

I actually thought about mentioning Convox in this vein, as it is intended to be for AWS only.

> You get to leverage much more of the cloud(s) you've decided to adopt if you don't immediately try to abstract them away.

It depends on the services. For the bulk services (blobstores, VMs, networking, queues, databases etc) the big 3 support, you can usually have some degree of migration without too much pain, if you're using one of the platforms correctly.

For example, on migrating a Cloud Foundry platform from AWS to GCP, you can switch your service brokers from AWS to GCP. Apps that ask for a database won't, in general, know or care who's providing it.
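To make that concrete: Cloud Foundry hands bound service credentials to an app through the VCAP_SERVICES environment variable, a JSON document keyed by service label. Below is a minimal sketch of an app looking up its database by tag rather than by provider. The service labels, instance names and URIs are made up for illustration, and the presence of a `uri` key in the credentials is an assumption, since each broker decides what its credentials block contains.

```python
import json


def find_service_uri(vcap_services_json, tag):
    """Return the 'uri' credential of the first bound service carrying `tag`.

    Assumes the broker exposes a 'uri' key in its credentials; that is a
    common convention, not a guarantee.
    """
    services = json.loads(vcap_services_json or "{}")
    # Top-level keys are service labels; the app never inspects them.
    for instances in services.values():
        for instance in instances:
            if tag in instance.get("tags", []):
                return instance.get("credentials", {}).get("uri")
    return None


# Hypothetical bindings before and after a migration (labels and hosts invented).
AWS_BINDING = json.dumps({
    "aws-rds-mysql": [{
        "name": "orders-db",
        "tags": ["mysql"],
        "credentials": {"uri": "mysql://user:pw@rds.example.invalid:3306/orders"},
    }]
})
GCP_BINDING = json.dumps({
    "google-cloudsql-mysql": [{
        "name": "orders-db",
        "tags": ["mysql"],
        "credentials": {"uri": "mysql://user:pw@cloudsql.example.invalid:3306/orders"},
    }]
})
```

In a running app you would call something like `find_service_uri(os.environ.get("VCAP_SERVICES"), "mysql")`; the same call works against either binding, which is the point about the app not knowing who provides the database.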

Another thing is that a fair amount of the specialised services now have portable alternatives. A good platform can integrate these on an equal basis. At the moment there's an effort to extract the Cloud Foundry service broker model so that it can also be used by other platforms[0], so this will become even easier with time.

Where AWS still have a sustained lead is on sheer breadth of offerings. But my hunch is that Google know this is the first real opportunity to produce a second revenue stream comparable with advertising. They're going to brute force their way to near-parity with a lot of engineering and money.

[0] https://www.openservicebrokerapi.org/

IAM Roles for Tasks is one of the better examples on AWS: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/t...
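For anyone who hasn't used it: with a task role, the ECS agent sets AWS_CONTAINER_CREDENTIALS_RELATIVE_URI inside the container, and temporary credentials are served from the link-local address 169.254.170.2 at that path. A rough sketch of how a process could work out where to fetch them follows. The AWS SDKs already do this for you, so this is only to illustrate the mechanism; the function name and the example path are mine, not from AWS.

```python
import os

# Link-local endpoint ECS uses to serve task role credentials.
ECS_CREDENTIALS_HOST = "http://169.254.170.2"


def task_credentials_url(env=None):
    """Build the task-role credentials URL, or return None outside such a task.

    `env` defaults to the process environment; passing a dict makes the
    function easy to exercise without running on ECS.
    """
    env = os.environ if env is None else env
    relative_uri = env.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")
    if not relative_uri:
        return None
    return ECS_CREDENTIALS_HOST + relative_uri
```

The per-task scoping comes from that relative URI being different for each task, so two containers on the same host can hold different roles.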

https://www.slideshare.net/mobile/aspyker/netflix-and-contai... says: "Deeply support AWS (not trying to abstract IaaS)". I'm wondering how much value Titus has outside Netflix compared to, say, HashiCorp's Nomad?

So, disclaimer: I work on Titus.

We have a lot of odd integration points with Netflix OSS and AWS.

It's funny, but providing simple things like a metadata service is valuable. Our internal Service Discovery relies on it (which is OSS). Also, VPC-integration isn't in Nomad, and again, the Netflix ecosystem expects it. These little integrations make your container appear more EC2 VM/Netflixy, and help leverage the ecosystem.

That makes a lot of sense.

Releasing this even without abstracted integration sounds reasonable to me - this is a system you built primarily for internal use, and you're engineering strictly atop AWS.

If the community wants support for other providers, someone else can come along and re-integrate with another provider, and maybe see if/where it's possible to collaborate with you guys on small changes that allow for easy maintenance for everyone.
