Skip to main content
  1. Tags/

Infrastructure

Case Study: Fleet-Scale Kernel Automation at Twitter

At Twitter, I was responsible for kernel updates across 5,000+ production servers. Updating a kernel is risky on one machine. Doing it across a fleet, without downtime, without data loss, and without breaking the services that millions of people depend on, is a different problem entirely. The Problem # Twitter’s production infrastructure ran on thousands of bare-metal servers across multiple data centers. Each server ran a Linux kernel that needed regular updates for security patches, performance improvements, and hardware compatibility.