A presentation at Seattle DevOps Meetup by Fen Aldrich
Everything is a little bit broken all of the time. Sometimes, like recently, as in right now, the core truths and assumptions about the universe seem to shift out from underneath us without warning. And yet Netflix streams movies, GitHub serves code, and we all keep going, working from home, and surviving. These systems, both technical and social, are resilient. And through their resilience they are able to handle all those one-in-a-million occurrences that crop up nine-times-in-ten. But what does it mean to be resilient? And further, how can we recognize and grow this resilience to better handle the systemic surprises we’re facing right now and will face in the future? What does a resilient team look like, and how do you foster that?
This talk will answer:
The following resources were mentioned during the presentation or are useful additional information.
White paper from David D. Woods
Uprising is an excellent overview of the resilience of a social uprising, especially episode two, “We Do This Every Night,” which examines how the protests have evolved since they first started after George Floyd’s murder by police in Minneapolis.
Video from Dr. Cook at REdeploy 2019
“I don’t care about all this human interaction bit, I just want my applications to be more robust!” Ok, fine, then there are a million and one options out there. Check out OpenShift as one of them and see what it fills in around your container/Kubernetes ecosystem.
“I care about team practices and behavior, but I want practical technical steps to take.” Ok, sure. Check out how SRE practice approaches the concepts of reliable software. If you’re not thinking about SLOs and SLIs yet, this will help you make some technical progress.
Further SRE-style reading.
Background on Safety Differently and how empowering employees at the sharp end and removing bureaucratic top-down policies improved safety overall.
Seriously, every talk from this conference was on point one way or another. Some are more practical, some are more philosophical, all of them are about resilience and worth taking the time to watch and think about. Some more than once.
J. Paul Reed’s talk from Failover Conf earlier this year (2020)
The State of DevOps 2019 Report
Matt Stratton’s talk on collaborating in teams with emphasis on Psychological Safety and practical ways to enable that in our teams.
Greater Than Code Podcast episode regarding resilience
The Worst Year Ever podcast episode specifically around community action in the absence of government resources. You should listen to a number of episodes on this one, though.
My blog post reflecting on the costs and alternatives to borrowing work time from your home time in this moment of crisis. Some shameless self promotion, but words I truly want to share and couldn’t fit in the talk.