Strange Loop

September 26-28, 2018


Peabody Opera House


St. Louis, MO

Failing (and recovering) asynchronously: a saga

Asynchronous, event-driven programming is quickly becoming ubiquitous, and it comes in many forms: promises, futures, channels, actors, and more. These techniques help us build robust, high-performance applications. But what do we do when something fails? "Let it crash" is perfectly acceptable in many situations. What about when one part of a multi-step process fails and you need to undo the side effects of earlier steps? Are there patterns for reasoning about and resolving this problem?

In 1986, Hector Garcia-Molina and Kenneth Salem published "Sagas", a paper describing how to improve the performance of long-lived transactions in database management systems by breaking them up into "sagas": collections of smaller transactions that can be undone in the case of a failure. It is possible to adapt the saga pattern to solve our asynchronous recovery problem. In particular, we will examine how to use it in both reactive stream-based systems and future-based systems, with examples in Scala.
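
To make the idea concrete, here is a minimal sketch of a future-based saga in Scala. It is not taken from the talk: the names `SagaSketch`, `Step`, and `runSaga` are hypothetical, and the sketch assumes Scala 2.12 or later, that each step has a compensating action, and that compensations themselves do not fail.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object SagaSketch {
  // Hypothetical type: a step pairs an asynchronous action with a
  // compensating action that undoes the action's side effects.
  final case class Step(
      name: String,
      run: () => Future[Unit],
      compensate: () => Future[Unit]
  )

  // Runs the steps one after another. If a step fails, the steps that
  // already completed are compensated in reverse order, and the returned
  // future fails with the original error.
  def runSaga(steps: List[Step])(implicit ec: ExecutionContext): Future[Unit] = {
    // `done` holds completed steps, most recent first, so a left fold
    // compensates them in reverse order of execution.
    def undo(done: List[Step]): Future[Unit] =
      done.foldLeft(Future.unit)((acc, step) => acc.flatMap(_ => step.compensate()))

    def loop(remaining: List[Step], done: List[Step]): Future[Unit] =
      remaining match {
        case Nil => Future.unit
        case step :: rest =>
          step.run().transformWith {
            case Success(_) => loop(rest, step :: done)
            case Failure(e) => undo(done).flatMap(_ => Future.failed(e))
          }
      }

    loop(steps, Nil)
  }
}
```

With three steps where the third fails, this sketch runs the first two, then compensates the second and the first, in that order. A fuller treatment would also have to decide what happens when a compensation fails, which this sketch leaves out.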

Daniel Solano Gómez

Daniel is a long-time software developer and member of the Clojure community. He enjoys creating high-quality software and contributing to open source projects.