Strange Loop

Predictive Load-Balancing: Unfair but Faster & more Robust

Traditional load-balancing strategies like round-robin, random, or least-loaded work well in simple cases but are suboptimal when servers are unevenly loaded. For Internet-scale services, better load-balancing strategies are crucial to delivering highly available, low-latency experiences for customers. Building on technologies and experiments from Netflix, we will present the theory and results of a novel approach called "predictive load balancing".

Our approach includes an efficient way to compute per-server latency statistics over a moving window. We combine concepts from game theory (utility functions) and queuing theory (Erlang's instantaneous traffic) with statistics reported by the servers to predict latency and react before a server becomes slow. This removes the tail-latency issue and mitigates the impact of background tasks like garbage collection.
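To make the idea concrete, here is a minimal Python sketch of the two ingredients mentioned above: a moving-window latency statistic that updates in constant time, and a selection rule that routes to the server with the lowest predicted latency. It is an illustration under stated assumptions, not the algorithm presented in the talk. The names `WindowedLatency`, `Server`, and `pick`, and the cost formula (windowed mean multiplied by the number of in-flight requests plus one), are hypothetical stand-ins for the utility function and Erlang-based traffic estimate.

```python
from collections import deque


class WindowedLatency:
    """Mean latency over the last `size` samples, updated in O(1)."""

    def __init__(self, size):
        self.samples = deque(maxlen=size)
        self.total = 0.0

    def record(self, latency_ms):
        if len(self.samples) == self.samples.maxlen:
            self.total -= self.samples[0]  # oldest sample is about to be evicted
        self.samples.append(latency_ms)
        self.total += latency_ms

    def mean(self):
        return self.total / len(self.samples) if self.samples else 0.0


class Server:
    def __init__(self, name, window=100):
        self.name = name
        self.latency = WindowedLatency(window)
        self.in_flight = 0  # requests sent but not yet answered

    def predicted_latency(self):
        # Assumed cost model: a new request queues behind the in-flight ones.
        return self.latency.mean() * (self.in_flight + 1)


def pick(servers):
    """Route to the server with the lowest predicted latency."""
    return min(servers, key=lambda s: s.predicted_latency())
```

With this cost model, a fast server that already has several requests in flight can lose to a slower idle one, which is the "unfair" behavior the title alludes to: load is assigned by predicted latency rather than split evenly.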

Beyond that, this approach can balance load across sets of heterogeneous servers/containers without noticeably affecting the perceived latency distribution. We are in the early days of this theory, but it could drastically change the way we design large distributed systems.

Steve Gury

Steve Gury works on the Edge-platform team at Netflix, responsible for building the platform of API/Edge, one of the critical pieces of our infrastructure. Before Netflix, he spent four years at Twitter, where he was an engineer and later tech lead of the team responsible for the Finagle library. Steve enjoys using new approaches and designing new algorithms to overcome the limitations of distributed systems. He received two MS degrees, in Computer Science and Control Theory, from École des Mines, France.