Sticking Together while Staying Apart
Resilience in the time of global pandemic
Aaron Aldrich, Developer Advocate
@CrayZeigh

Everything’s a little bit broken all of the time… but it keeps working anyway

Million-to-one chances… crop up nine times out of ten.
- (GNU) Sir Terry Pratchett

Resilience

Resilience
• Graceful Extensibility
• Rebound
• Robustness
• Sustained Adaptability

Rebound
Return to “normal” after a surprise or traumatic incident. Work done ahead of time.

Robustness
The ability to withstand and absorb well-modeled disturbances. Known knowns.

Graceful Extensibility
The ability to stretch with challenges to operational boundaries. Opposed to brittleness.

Sustained Adaptability
Recognizing and managing adaptive capabilities over long timescales.

Bone
• Continuously created and destroyed
• Reconstruction directed by mechanical strain
• Process directed by signals through layered networks at cell-level

• Rebound
• Graceful Extensibility
• Robustness

Socio-Technical Systems

Conway’s Law
Designed systems represent an organization’s communication structure

Blunt end: Removed from experience, upstream decision makers
Sharp end: Closest to the work, practitioners

Sharp end: Closest to the work, practitioners
• Constantly building and destroying systems
• Strong signaling
• Improve systems based on strain
• Will do so naturally given ownership

Teams that do well dealing with impact [surprises/incidents] are those that have a strong common ground
- J. Paul Reed (@jpaulreed), Failover Conf

If we want to improve a team’s resilience, we must build a strong common ground
- Me, just now.

Common Ground
• Basic Compact
• Goal Alignment/Commitment
• Inter-predictability
• Sustain & Repair

Building Common Ground
• Blameless Postmortems
• Chaos Engineering
• Game Days
• Modeling Vulnerability

Resilience is about creating the conditions that maximize everyone’s potential
- Rein Henrichs, Greater Than Code podcast, 174: Resilience

What happens when governments fail?

It’s left to us

Community Building is Resilience Engineering
- Me again, just now again.

Strong Communities
• Diverse
• High Trust & Safety
• Sustain & Repair
• Inter-predictability
• Loosely Coupled, layered networks

https://bit.ly/2Ym7Tp9

Enable potential and get out of the way

Slides & Resources
• speaking.crayzeigh.com
• OSMIhelp.org
• EmotionalAPI.com
• devopsdays.org
Aaron Aldrich, Developer Advocate, @CrayZeigh

I love you. Be safe out there. We’re all in this together.

Further Reading/Watching/Listening
• Four concepts for resilience and the implications for the future of resilience engineering - David Woods https://bit.ly/3bITTdc
• The Marvelous Resilience of Bone - Dr. Richard Cook, REdeploy 2019 https://www.youtube.com/watch?v=8LbePBiOvZ4
• Greater Than Code, 174: Resilience https://www.greaterthancode.com/resilience
• The Worst Year Ever, How to Save your Community When The Government Fails https://ihr.fm/3eVNFbI

Further Reading/Watching/Listening
• Behind Human Error (2nd Edition) - Woods, Dekker, Cook, Johannesen, Sarter
• The Woolworths Experiment https://safetydifferently.com/the-woolworths-experiment/
• The Field Guide to Understanding Human Error - Sidney Dekker
• Literally every video from REdeploy: https://www.youtube.com/channel/UCHbJcI6KfyxflRqdv26b3Qw
• On Borrowing From Yourself - Aaron Aldrich https://dev.to/crayzeigh/a-reflection-on-borrowing-from-yourself-3jhf