Netflix Compute Platform and Titus Public Links

Andrew Spyker
Jun 19, 2019

The Netflix Compute Platform team focuses on the compute primitives required to run both elastic services and fully featured batch workloads. The Compute Platform is part of Netflix’s Cloud Infrastructure Engineering, whose mission is to increase the fleet-wide agility, efficiency, and reliability of our infrastructure while reducing the operational burden on our engineers.

Titus - Netflix’s container management platform

Titus includes a sophisticated scheduler based on Apache Mesos that handles not only advanced resource scheduling but also the key operational aspects of managing clusters of thousands of nodes. Titus also has a container runtime built on top of Docker that provides advanced security; CPU, disk, memory, and network isolation; and features such as GPUs and NFS storage. Titus currently launches over three million containers per week for hundreds of Netflix workloads across three regions in the AWS public cloud.

Compute Platform Links

We will soon be looking for engineers who are as familiar with Linux systems programming as they are with multi-tenancy in containers. If that interests you, @aspyker’s Twitter DMs are open.

Titus Links

Presentation: Netflix’s Kubernetes Journey

Blog: Evolving Container Security with Linux User Namespaces

Blog: Predictive CPU Isolation of containers at Netflix

Blog: Autoscaling Production Services on Titus

Presentation: AWS re:Invent 2018: Another Week, Another Million Containers on Amazon EC2 (CMP376) (video, slides)

Presentation: QCon SF 2018 talk on what we have learned operating Titus in production for multiple years (video, slides)

Presentation: NetflixOSS Meetup — Titus Open Source (video, slides)

Presentation: AWS re:Invent 2017: Elastic Load Balancing Deep Dive and Best Practices (NET402). This talk focused on the Application Load Balancer (ALB) support recently added to Titus.

Blog: Updates on Netflix’s Container Management Platform

Publication: ACM Queue Article: Titus: Introducing Containers to the Netflix Cloud

Presentation: A Series of Unfortunate Container Events (video, slides) - QCon NYC 17. This talk focuses on the operational lessons learned from running a container management platform at scale for over a year.

Blog: The Evolution of Container Usage at Netflix — Netflix Techblog

Presentation: Netflix Container Scheduling, Execution, and integration with AWS (video, slides) — AWS re:Invent 2016. This talk focuses on the architecture of Titus and specifically on how we integrated it with key AWS EC2 features such as VPC, security groups, and IAM.

Presentation: Scheduling a Fuller House: Container Management at Netflix (video, slides) — QCon NYC 16. This talk focuses on our scheduling technology and the open source Fenzo library.

Graphics: Titus logos (logo, head, head square)

Last updated: 2019-06-18

Written by Andrew Spyker

Engineering Manager, Netflix Container Platform
