Talks

Durability and Crash Recovery for Distributed In-Memory Storage

Ryan Stutsman - Stanford University

Room 4172 A.V. Williams Building (AVW)

Tuesday, February 19, 2013, 11:00 am-12:00 pm

Abstract

In this talk I will discuss my work on RAMCloud and its novel fast crash recovery system. RAMCloud is a datacenter storage system that stores all data in DRAM. Rather than replicating in DRAM for redundancy, it provides inexpensive durability and availability by recovering quickly after crashes. RAMCloud scatters backup data across thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk; this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Measurements suggest that the approach will scale to recover larger memory sizes in less time with larger clusters.

This talk is organized by Adelaide Findlay