Strange Loop

2009 - 2023 / St. Louis, MO

In-Memory Data Grids - Distributed, Scalable Data Stores

This is a talk about linear scalability: what it is and one way to achieve it. As the data sets used in our software grow ever larger, we need to keep access times low and avoid the slowdown typically associated with huge data sets. We know about different database topologies such as master-master, master-slave and clustering. At some point, typically sooner than we would like, these topologies reach their limits. Adding hardware yields diminishing returns, and the only way to add more capacity is a costly and risky migration to faster, exponentially more expensive hardware.


This is not a talk bashing databases. Instead, it's about a new approach to data stores. An in-memory data grid stores some or all of your data set in memory, distributed across anywhere from one to 2000+ computers you already have. We'll talk about what this means and the trade-offs we make by storing data in memory rather than on disk. What happens if not all of our data fits in memory? How do clients access this data now that it's out of the database? These are the most basic issues to understand when dealing with an in-memory data grid.
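
To make "distributed across many computers" concrete, here is a minimal sketch of hash partitioning, one common way a grid could decide which machine holds a given key. It is an illustration only: the `Grid` class, its methods, and the use of plain dictionaries to stand in for each machine's memory are assumptions made for this example, not the API of any particular data grid product.

```python
import hashlib


class Grid:
    """Toy in-memory grid: each 'node' is a dict standing in for one machine's RAM."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Use a stable hash so every client maps a given key to the same node.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)


grid = Grid(node_count=4)
for i in range(1000):
    grid.put(f"user:{i}", {"id": i})

print(grid.get("user:42"))
# Number of entries held by each node; together they hold all 1000 entries.
print([len(node) for node in grid.nodes])
```

A client only needs the key and the node count to find the data, so no central lookup is involved. A real grid also has to handle nodes joining or leaving and keep replicas, which this sketch leaves out.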


Beyond the basics, an IMDG can act as the primary data store or as a shock absorber in front of your database. Data grids and databases are not mutually exclusive and work quite well together. Each addresses different needs for different customers.
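
One way to picture the shock-absorber role is a read-through cache: reads go to memory first and reach the database only on a miss. The sketch below assumes a hypothetical `ReadThroughCache` class and a stand-in `slow_database_lookup` function. It shows the general pattern and does not describe how any specific grid product implements it.

```python
class ReadThroughCache:
    """Serves reads from memory and falls back to a loader on a miss."""

    def __init__(self, load_from_database):
        self._load = load_from_database
        self._entries = {}
        self.database_reads = 0

    def get(self, key):
        if key not in self._entries:
            # Miss: go to the database once, then keep the result in memory.
            self.database_reads += 1
            self._entries[key] = self._load(key)
        return self._entries[key]


def slow_database_lookup(key):
    # Hypothetical stand-in for a real database query.
    return {"key": key, "source": "database"}


cache = ReadThroughCache(slow_database_lookup)
for _ in range(100):
    cache.get("order:7")

print(cache.database_reads)  # 1: the other 99 reads never reached the database
```

The sketch ignores writes, eviction, and invalidation, which are where most of the trade-offs between the grid and the database come from.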


We'll talk about where using a data grid is and is not appropriate. We'll also look at how IMDGs provide scalability and redundancy across data centers and around the world. One of the most interesting things we'll see is an example of operating on data in the grid. Co-locating data and algorithms lets us reach performance hundreds or thousands of times faster than dragging data between client and server.
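
The idea behind co-locating data and algorithms can be sketched by counting how many items have to cross the network in each approach. In this toy example the nodes are plain dictionaries, and the function names `total_by_pulling` and `total_by_agent` are made up for illustration. It measures items moved, not elapsed time, so it does not reproduce the speedup figures above.

```python
def total_by_pulling(nodes):
    """Client-side sum: every record crosses the 'network' before it is added."""
    records_moved = 0
    total = 0
    for node in nodes:
        for value in node.values():
            records_moved += 1
            total += value
    return total, records_moved


def total_by_agent(nodes):
    """Grid-side sum: each node adds up its own data and returns one number."""
    partials = [sum(node.values()) for node in nodes]
    return sum(partials), len(partials)


# Four nodes, each holding the values 0..999 under node-specific keys.
nodes = [{f"n{n}-k{i}": i for i in range(1000)} for n in range(4)]

print(total_by_pulling(nodes))  # (1998000, 4000)
print(total_by_agent(nodes))    # (1998000, 4)
```

Both approaches produce the same total, but the second moves one partial result per node instead of every record. The gap widens as the data per node grows.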

Anthony Chaves

Anthony writes software for customers of all sizes. He likes building scalable, robust software. Customers have thrown all kinds of different development environments at him: Java, C, Rails, mobile device platforms - but no .NET (yet).


Anthony particularly likes user/device authentication problems and applied scalability practices. Cloud-computing buzzword bingo doesn't fly with him. Anthony wrote Getting Started with WebSphere eXtreme Scale, a tutorial-style book about IBM's in-memory data grid product. He started the Boston Scalability User Group in 2007.