A little bit self-serving... at PagerDuty, we have been doing these monthly "video AMAs", where the community asks questions of a guest in advance, and then we get the guest to answer them on video. It's sorta like a podcast, but not.
For March, our guest was my pal Jeff Smith. We talked about how to make on-call more humane, and why that even matters.
John Allspaw digs into some of the fallacies around MTTR and similar metrics. I like the metaphor he uses:
This bunch of grapes is 5.6 inches across at its widest point. The mean diameter of these grapes is 0.73 inches. The median color is 195/211/86 (RGB). How do they taste?
The point is, most of our common metrics give us very little insight into incidents, because every incident has its own unique factors.
I've been on a bit of a road show giving a talk called "Incidents & Accidents" (all props to Bridget Kromhout for the title).
It was recorded recently at the DevOps Minneapolis Meetup, where I presented it to a packed house. If you'd like to learn a little bit about the philosophy of how we look at Incident Response here at PagerDuty, this talk is for you.