© 2020 Strange Loop
In-memory caching is not a single problem but many. At Twitter, one cache client must store myriad tiny objects, another requires fast updates of large values, and a third wants to replace an in-process cache in Java with an equally fast cache outside the JVM to avoid garbage collection. Optimizing for any one of these use cases requires carefully choosing data structures, memory layout, eviction strategy, and transport medium. Existing solutions often bake the details of these low-level components into their overall designs, and it can be messy and costly, or sometimes impossible, to introduce alternative parts into an established codebase.
After years of maintaining forks of the popular Memcached and Redis projects, both relatively monolithic designs, Twitter is starting with a clean slate: a new modular cache framework built out of small composable pieces. The use of common interfaces regulates and limits the surface area of individual components, supporting radically different implementations behind each interface. Production-critical features such as managed resources, logging, and statistical measurements are core priorities rather than bolted-on afterthoughts.
We generate binaries that offer highly predictable tail latencies even at near 100% CPU utilization, a fixed memory footprint (no more out-of-memory shutdowns in a container environment), and readable logs and metrics that are cheap to collect and readily available when you need them for debugging. Moreover, it is now easy to implement new cache backends with a small amount of effort: we added support for the Memcached and Redis protocols, and implemented a new storage engine backed by cuckoo hashing that achieves a 90% reduction in metadata overhead, each in about 1000 LOC on top of the core library. Future users with substantially different needs can easily swap out parts while avoiding duplicating the work already done.
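To make the idea of a swappable storage engine concrete, here is a minimal sketch in Python of a cuckoo-hashed store behind a small common interface. It is not the framework's code: the `Storage` interface, the `CuckooStorage` class, the choice of hash functions, and the parameters `nslots` and `max_displace` are all assumptions made for illustration. The sketch shows two properties mentioned above: the table is allocated once and never grows, and items carry no per-item chain pointers because each key can live in only one of two candidate slots.

```python
import zlib
from typing import Optional, Protocol, Tuple


class Storage(Protocol):
    """Hypothetical storage interface; method names are illustrative only."""

    def get(self, key: bytes) -> Optional[bytes]: ...
    def set(self, key: bytes, value: bytes) -> bool: ...
    def delete(self, key: bytes) -> bool: ...


class CuckooStorage:
    """Fixed-capacity cache: each key may occupy one of two candidate slots."""

    def __init__(self, nslots: int = 1024, max_displace: int = 16) -> None:
        # Allocated once up front; the slot array never grows afterwards.
        self._slots = [None] * nslots
        self._max_displace = max_displace

    def _candidates(self, key: bytes) -> Tuple[int, int]:
        # Two cheap, deterministic hashes (an arbitrary choice for this sketch).
        n = len(self._slots)
        return zlib.crc32(key) % n, zlib.adler32(key) % n

    def get(self, key: bytes) -> Optional[bytes]:
        for i in self._candidates(key):
            item = self._slots[i]
            if item is not None and item[0] == key:
                return item[1]
        return None

    def delete(self, key: bytes) -> bool:
        for i in self._candidates(key):
            item = self._slots[i]
            if item is not None and item[0] == key:
                self._slots[i] = None
                return True
        return False

    def set(self, key: bytes, value: bytes) -> bool:
        first, second = self._candidates(key)
        # Update in place if the key is already stored.
        for i in (first, second):
            item = self._slots[i]
            if item is not None and item[0] == key:
                self._slots[i] = (key, value)
                return True
        # Otherwise use an empty candidate slot if one exists.
        for i in (first, second):
            if self._slots[i] is None:
                self._slots[i] = (key, value)
                return True
        # Both candidates are occupied: displace residents along a bounded chain.
        entry, i = (key, value), first
        for _ in range(self._max_displace):
            victim, self._slots[i] = self._slots[i], entry
            if victim is None:
                return True
            entry = victim
            a, b = self._candidates(entry[0])
            i = b if i == a else a  # move the victim to its other slot
        # Chain too long: the entry still in hand is dropped, i.e. evicted.
        return self.get(key) is not None
```

Because `CuckooStorage` satisfies the `Storage` protocol structurally, a caller written against `Storage` would not need to change if a different engine were substituted. Dropping the last displaced entry when the chain runs out is one simple eviction choice for a cache; a real engine would likely weigh expiry or recency when picking a victim.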
Through this talk, I also want to share the lessons we learned from years of operating large-scale cache clusters in production. Building operations-friendly systems is rarely about a single clever trick; it matters more that engineers and architects acquire and practice insight, discipline, and common sense.
We plan to open source the code and documentation about this project in time for the talk.
Yao Yue is a member of Twitter Platform. She is the tech lead and manager of the Cache Team, and has spent the past few years maintaining, designing, and implementing various cache-related projects. She maintains several open source projects, including Twemcache and Twemproxy. Yao has a particular interest in building and simplifying systems deployed at scale. Over the years she has spent much time studying and playing with computer networks, various distributed systems, databases, and parallel programming. Yao has a BS in Math and Physics and an MS in Computer Science. Before joining Twitter, she interned at Facebook, NVIDIA, and IBM Research. In her spare time, Yao has been introducing programming to middle and high school girls via drawing and games. She is also a baker.