Meet to review learnings
The incident review meeting is a facilitated opportunity to discuss, for the first time and as a group, how things happened, what unfolded, what was surprising, themes identified, and what is unclear.
This meeting itself is data that will be used for the incident review report. Everyone who participated in or was impacted by the incident should have a calendar invite for it, and anyone else should be able to request one.
Preparing for the meeting
To help the meeting go smoothly and to maximize learning in the time you’ll have, you’ll want to do a little prep work.
First, ensure any documents, screenshots, or materials you will use during the meeting have been reviewed by key participants and their feedback has been acknowledged and integrated.
Identify participants you’d like to speak with during the meeting. Remember that your role is to facilitate the sharing of knowledge and expertise among the group, so use the experts themselves to describe what you learned. Reach out to them before the meeting and ask if you can call on them to explain something or describe their experience. Try not to catch people by surprise in the meeting, as this can lead to defensiveness and limit learning.
Now you can lay out your agenda and circulate materials to all participants with sufficient time to review. This will allow everyone to come prepared, with an understanding of how the meeting will flow and a preview of what they’ll see.
After the meeting, you can generate a brief three-to-four-question survey to circulate in order to capture the participants’ experiences and suggestions for improvement.
Who should come to this meeting?
It’s important for those who were involved in the incident to attend so they can share their perspective and recount their experiences where possible. Schedule the meeting to include as many key persons as you can.
That doesn’t mean that those are the only people who should attend, though. You can also invite other participants; for example:
- Dependent service teams
- Engineers from related (but unimpacted) parts of the business
- Impacted users
- Customer success/advocate roles
- Key stakeholders
Ultimately, you should invite whoever is interested! Multiple diverse perspectives are crucial to making the learning review as comprehensive as possible.
Folks attending an incident review for the first time may be unsure of their role. As you will be the facilitator, you’re in a unique position to be able to guide them! You should remind participants to be ready to:
- Communicate what happened from their point of view, how they experienced the incident, and how it impacted them.
- Listen to other parties; their experiences may differ from one another. That’s OK because it’s how we all learn.
- Be open to discussion and collaboration. Don’t assume any one answer or solution is the right one.
- Ask questions. If they don’t know something, odds are somebody else on the call feels the same way.
- Think of how this incident can tie in with other events at the organization; feel free to share these insights.
Suggested agenda for the meeting
Here’s a sample agenda you can use as a guide to circulate prior to the meeting.
- Intro/opening remarks to set context
- Structure of the process
- Narrative—this should be an interactive summary of what happened (calibrate)
- Themes from the interviews/questions we still have (keep these to two to five themes per incident, even if the calibration document goes through more)
- What’s been done so far?
- Next steps (or scheduling an action items meeting)
Running the meeting
So far we’ve talked about getting ready: you’ve figured out who to invite and sent them an agenda with materials in time for them to review. Now you’re ready to actually facilitate the meeting.
We’ll walk you through the process step by step, starting with the introduction and some ways you’ll want to open the meeting.
Intro and opening remarks
You’ll want to set expectations. If you’re recording the meeting, make that clear and explain why. As part of those expectations, you’ll want to explain the ground rules. Here are some we recommend; feel free to take or leave them as you see fit. We’ve included our recommendation as a script to make it easy; this way, if you need a reminder or are unsure, you can start by reading the following out loud:
“The meeting is intended to be blame-aware. We recognize that everyone works with constraints, and sometimes those don’t appear until after an incident. We acknowledge our tendency to blame and name names, and we move past it in order to be productive.[10] We will avoid counterfactuals[11] and remain respectful of each other’s knowledge and experience.
The objective remains how to constructively help us get better at working together to solve the often-challenging problems we face.
One of the key discoveries in learning reviews is that different responders have different knowledge. A big part of this meeting is to have an open discussion that lets us share that knowledge and identify other potential knowledge gaps to increase learning.
We don’t want to get too into the weeds. If a very specific technical conversation is better suited for another meeting or time, I’ll gently take note of it and set it aside (also known as “steer it to the parking lot”).
This meeting is also not about corrective actions; again, these are important but will go to the parking lot for tracking, to be dealt with in the action items section.
Although this meeting is for sharing the findings, we will be asking different participants to recount their experiences or share the knowledge they contributed to the meeting. Because of this, the meeting may not follow the same structure you are used to.”
You may wish to have participants write down the questions they have about the event and things they want addressed (and circle back to see if they got answered).
Start with an overview of artifacts studied (prior incident tickets, design review docs, meeting minutes) and methods (overview of the analysis process), number and scope of interviews, and how your analysis has been conducted.
Provide a narrative overview of the event. This is a short description of what happened and who was involved; include your interviewees in the narrative and have them explain from their perspective.
Next, you can address background knowledge gained during the investigation by having a subject matter expert provide some context on how the tech works so participants can learn about parts of the system they may be unfamiliar with and update their mental models.
After that, provide an overview of the themes, then ask for commentary from folks involved. Some themes may generate more commentary than others—that’s perfectly fine! Once you’ve gotten some commentary, you can choose some of the themes to take a closer look at.
As you wrap up, ask what questions may still be unresolved. If there are none from the group, you may wish to share some from the investigation. Ask the “naive” questions yourself; while we encourage participants to ask them, they may not always feel comfortable doing so.
Next, to help everyone get on the same page, state what has been done so far.
Finally, discuss the next steps. This may include scheduling an action-items meeting or moving to that section of the meeting.
Priming the discussion
The aim in the meeting is to have participants talk more than facilitators. To do this you may need to prompt different participants (as noted previously, be sure they are aware in advance that you’ll call on them!).
Some ways to generate discussion are:
- Ask some of the responders to describe the incident narrative; you may use the timeline you created to prompt discussion.
- Call on some of the subject-matter experts to explain how different parts of the system work.
- Have some of the impacted stakeholders talk about what went well and what was difficult for them (where constructive).
- Get people excited about seeing the final write-up. Let them know when to expect it and how they can help make it a living document!
- Invite participants who may not have previously seen the document to provide their input.
- Let them know what other events are being investigated with the Howie process.
- Solicit feedback to encourage participants to give their comments on the process. You may wish to reach out to individuals for feedback. Getting feedback from readers helps improve future write-ups, clarify any concerns, and engage participants. Some questions you may wish to ask are:
  - Was there something in this that strikes you as different from the “expected” way you’ve seen post-incident write-ups?
  - Is there any information we should add?
  - Will we get pushback on things in there?
  - Will we get pushback on things not in there?
  - Is there anything in there that you didn’t know before about (team, component, business unit)? If so, what?
  - Is it easy enough to read that it keeps your interest after the summary bits at the top? (If not, what might make it more compelling?)
  - Does this write-up lead you to ask more/different questions about various bits that are involved? (If so, what are they?)
An alternative to a 1:1 follow-up is to circulate a brief two-to-three-question survey.
Action items/countermeasures meeting
Some guides recommend splitting out the action items meeting from the learning review. The purpose of this second meeting is to review any next steps or follow-up items raised during the incident review or during the investigation itself. This meeting is where you generate your work tickets, assign that work to responsible parties, and track progress to help improve coordination of your teams and the components in your system. While we agree in theory that this is a useful practice, we’ve found that many organizations struggle to hold even just one meeting. Therefore, we offer a compromise for those who will only have one review meeting and need to include action items in it.
An effective approach we’ve seen is when teams make the action items discussion the last section of their meeting. To avoid discussions of remediation creeping into your learning review, an important part of your role as a facilitator will be to keep each part of the meeting distinct by reminding participants that the primary goal is learning. When an action item is raised, thank the speaker, quickly make a note of it, and let them know you will refer back to it at the end of the meeting.
The keys to quality action items are collaboration, ownership, and reflection. Sharing the calibration document in advance allows participants to consider the right next steps and who would be the right group of people to work on them. This approach allows time for greater forethought and more pertinent outcomes instead of shallow action items that will sit in a queue. Teams owning the work should be part of this meeting. We recommend inviting product owners, engineering managers, and project managers so that together you can all agree on the work required and the optimal time frame to achieve it.
Folks often think of action items as the end of an incident. However, your system is always changing, and the action items of today can be contributing factors to the incidents of tomorrow. Therefore, refinements and learning are ongoing. This idea may be a tough sell for some organizations that just want closure from an incident. However, we believe a shift to continuous learning, along with integrating steps from this guide, will better prepare you to adapt and handle future surprises.
Retro your retro!
Incident analysis is iterative; both you and your organization will get better with time and practice. Keep in mind that what works for other companies may not work for yours, and that is OK!
We recommend you spend a few minutes reflecting on what went well, what could be improved, and how to handle next steps from the meeting, including follow-ups and integrating the feedback from the meeting.
If you circulated a survey, the responses can be a great starting point for this. Your own experience as a facilitator is important too! Was there a part of the meeting you felt went particularly well? Perhaps an area you felt could have gone better?
This is also when you’ll want to go over any follow-up items that you may have generated and take care of those now.