November 30, 2018

Failure and Change: Principles of Reliable Systems

As we construct larger or more complex systems, failure and change are ever-present. We need to accept and even embrace these tensions to build software that works and keeps working. This is a talk on building and operating reliable systems. We will look at how systems fail, particularly in the face of complexity or scale, and build up a set of principles and practices that will help us implement, understand and verify reliable systems.

This talk was presented at YOW! 2018.

[ deck ] [ video ]