Netflix Compute Platform and Titus Public Links
The Netflix Compute Platform team focuses on the compute primitives required to run both elastic services and fully featured batch workloads. The Compute Platform is part of Netflix’s Cloud Infrastructure Engineering, whose mission is to increase the fleet-wide agility, efficiency, and reliability of our infrastructure while reducing the operational burden on our engineers.
Titus - Netflix’s container management platform
Titus includes a sophisticated scheduler based on Apache Mesos that handles not only advanced resource scheduling, but also the key operational aspects of managing clusters of thousands of nodes. Titus also has a container runtime built on Docker that provides advanced security; CPU, disk, memory, and network isolation; and features such as GPUs and NFS storage. Titus currently launches over three million containers per week for hundreds of Netflix workloads across three regions in the AWS public cloud.
Compute Platform Links
We will soon be looking for engineers who are as familiar with Linux systems programming as they are with multi-tenancy in containers. If that interests you, @aspyker’s Twitter DMs are open.
Titus Links
Presentation: Netflix’s Kubernetes Journey
Blog: Evolving Container Security with Linux User Namespaces
Blog: Predictive CPU Isolation of containers at Netflix
Blog: Autoscaling Production Services on Titus
Presentation: AWS re:Invent 2018: Another Week, Another Million Containers on Amazon EC2 (CMP376) (video, slides)
Presentation: QCon SF 2018 talk on what we have learned operating Titus in production for multiple years (video, slides)
Presentation: NetflixOSS Meetup — Titus Open Source (video, slides)
Presentation: AWS re:Invent 2017: Elastic Load Balancing Deep Dive and Best Practices (NET402). This talk focused on the Application Load Balancer (ALB) support recently added to Titus.
Blog: Updates on Netflix’s Container Management Platform
Publication: ACM Queue Article: Titus: Introducing Containers to the Netflix Cloud
Presentation: A Series of Unfortunate Container Events (video, slides) — QCon NYC 17. This talk focuses on the operational lessons learned from running a container management platform at scale for over a year.
Blog: The Evolution of Container Usage at Netflix — Netflix Techblog
Presentation: Netflix Container Scheduling, Execution, and integration with AWS (video, slides) — AWS re:Invent 2016. This talk focuses on the architecture of Titus and specifically on how we integrated it with key AWS EC2 features such as VPC, security groups, and IAM.
Presentation: Scheduling a Fuller House: Container Management at Netflix (video, slides) — QCon NYC 16. This talk focuses on our scheduling technology and the open source Fenzo library.
Graphics: Titus logos (logo, head, head square)
Last updated: 2019-06-18