© 2020 Strange Loop
Failure in a modern distributed system is a complicated affair. Many distributed systems are deployed into production with multiple bugs and can limp along on one leg for months thanks to the self-healing properties of their highly available architecture. Be that as it may, apply enough load and eventually things will cease to work when you need them the most. This talk presents a taxonomy of distributed systems failures and bugs in the wild, as seen through the lens of the network. By classifying the failures we find, we can come closer to detecting them proactively, before they develop into full-blown outages.
Cliff Moon is Founder and Chief Technical Officer at Boundary. Prior to Boundary, Cliff was a lead engineer for Powerset (a natural language search engine acquired by Microsoft), where he was instrumental in the design, implementation, launch, and operation of many of the company's production services. Cliff is an active contributor to open source projects, developing the first open-source implementation of Amazon Dynamo and originating the Dynamo Framework. He is also a well-regarded member of the NoSQL, Scala, and Erlang communities.