Our Approach to AI Incident Management

Published on August 10, 2023
Author: Michael Donlin, Engineering Manager
We're building AI into our platform to assist humans, not displace them.

We recently released multiple “AI” (large language model) powered incident response and analysis features and began collaborating with our customers to define the future of AI for incident management.

During an incident, as new stakeholders or responders join, Jeli will automatically and succinctly summarize the incident based on status updates previously provided by responders. After the incident, Jeli will parse and classify the incident data to create a draft incident story and timeline to prompt discussion and further investigation. As users review and assemble evidence that has been automatically ingested into Jeli, they can now automatically summarize collections of related evidence to help communicate to others what occurred.
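To make that concrete, here is a minimal sketch of how an in-incident catch-up summary might be generated from prior status updates. The function name, prompt wording, model choice, and use of the OpenAI chat completions API (pre-1.0 Python client) are illustrative assumptions, not Jeli's actual implementation.

```python
# Hypothetical sketch: build a catch-up summary for a newly joined responder
# from the status updates posted so far. Prompt and model are illustrative.
import openai  # assumes OPENAI_API_KEY is set; pre-1.0 openai-python interface

def summarize_status_updates(updates: list[str]) -> str:
    """Return a succinct incident summary based only on prior status updates."""
    joined = "\n".join(f"- {u}" for u in updates)
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,  # keep summaries stable rather than creative
        messages=[
            {"role": "system",
             "content": ("You summarize ongoing software incidents for responders "
                         "who just joined. Be succinct and stick to the updates given.")},
            {"role": "user",
             "content": f"Status updates so far:\n{joined}\n\nSummarize the incident."},
        ],
    )
    return response.choices[0].message.content

# Example usage (made-up updates):
# summarize_status_updates([
#     "14:02 Checkout error rate spiked to 12%",
#     "14:10 Rolled back deploy 4821; errors still elevated",
# ])
```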

Our AI features will not resolve your incidents for you or replace the learning (and competitive advantage) that results from a practice of incident analysis. These features, like the rest of our product, are driven by our vision of Jeli as a helpful teammate working alongside human responders.

The prevailing perspective I’ve seen treats incidents as a chore, one we should spend as little time and effort thinking about as possible. The goal, then, is to automate away as much as we can. I’ve seen “AIOps” marketing material claiming to automate incident analysis in as little as 5-10 seconds, for example: the anti-learning approach. As your incident assistant, we make managing incidents easier and more efficient, so you can focus on the difficult work of diagnosis and repair. After the incident, we start from the premise that post-incident investigations are worth the investment, not toil to be (or that can be) automated away.

We want to help you respond to incidents and to support the generation of insights from them. We recognize we can’t do these things without you, and despite the hype, no product out there can.

But should we even AI?
"Never trust anything that can think for itself if you can't see where it keeps its brain." - Arthur Weasley

When starting on these features, I asked our engineers to rate themselves on an AI skepticism scale from 1 to 10, where 1 is something like “AI will end all suffering and usher in Utopia” and 10 is “actively harmful or a useless distraction at best”.

Most of us placed ourselves around a 7.

This isn’t just Luddite naysaying. The existence and use of LLMs carry an environmental cost, reinforce and propagate biases, and can be used to generate misleading or harmful information [1]. The sales and marketing of “AI” often devalues humanity and reinforces a white racial frame [2]. Further, LLMs are essentially spicy autocomplete: they are not capable of understanding language or meaning [3], and they frequently spout falsehoods [4].

The accuracy of LLM responses is not, and likely never will be, 100%. If you are using AI in your organization for incident response and it is right 95% of the time, you will likely begin to trust it. What’s the impact of the 5% of the time it’s wrong?

We are excited about how these new LLM-powered features can help our customers save time and prompt a deeper investigation into incidents. We have more powerful features on the horizon to help surface organizational pain points across the vast quantities of incident data we process, and we are careful to think about the implications of such features given the limits of the technology we’re working with.

Short notes on our experience building with LLMs

Jeli does not have a team of data scientists (yet), and we were able to spin up multiple AI features quickly by leveraging generative models from OpenAI. Easy to implement, frustrating to work with.

As we learned how to craft our prompts, we found ourselves fighting with the model: we thought we were giving clear direction, only to have it return responses that completely ignored our prompts.

Generative AI models are also designed to give an answer, and given limited inputs they will respond with content unrelated to the task at hand, producing the kind of incident “catch-up” you should hopefully never see when joining a software incident response channel.

We did find ways to improve the accuracy of responses and make our features workable, however. For example, adding chain-of-thought reasoning [5, 6] to the prompts we use to classify incident transcripts resulted in a clear improvement, as in the sketch below.
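As a hedged illustration of that technique (the categories, prompt wording, and parsing here are assumptions, not our production prompts), a chain-of-thought classification call might look like this: ask the model to reason about the message first, then emit a single label line we can parse.

```python
# Hypothetical chain-of-thought classification of a single incident-channel
# message. Only the technique (reason first, then label) mirrors the post;
# the categories and prompt text are illustrative.
import openai  # assumes OPENAI_API_KEY is set; pre-1.0 openai-python interface

CATEGORIES = ["status update", "diagnosis", "action taken", "question", "other"]

def classify_message(message: str) -> str:
    prompt = (
        "Classify the incident-channel message below into one of: "
        + ", ".join(CATEGORIES) + ".\n"
        "First think step by step about what the author is doing, then give "
        "your answer on a final line formatted exactly as 'Label: <category>'.\n\n"
        f"Message: {message}"
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.choices[0].message.content
    # Parse the final "Label:" line; fall back to "other" if the model
    # ignored the requested format.
    for line in reversed(text.splitlines()):
        if line.lower().startswith("label:"):
            return line.split(":", 1)[1].strip().lower()
    return "other"
```

Asking for the reasoning before the label is the chain-of-thought step; parsing only the final line keeps the output usable downstream even though the model writes prose first.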

Our friends at Honeycomb.io wrote a blog post [7] that was a favorite internally as we discussed building these AI features. They cover challenges we also encountered, such as working within context windows, along with more discussion of the new wild west of prompt engineering.
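For the context-window problem specifically, one common workaround (a sketch under our own assumptions, not something prescribed by that post) is to chunk long transcripts, summarize each chunk, and then summarize the summaries:

```python
# Hypothetical map-reduce summarization to stay within a model's context
# window: chunk a long transcript, summarize each chunk, then combine.
# Character-based chunking is a crude stand-in for real token counting.
import openai  # assumes OPENAI_API_KEY is set; pre-1.0 openai-python interface

def _ask(instruction: str, text: str) -> str:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[{"role": "user", "content": f"{instruction}\n\n{text}"}],
    )
    return response.choices[0].message.content

def summarize_long_transcript(transcript: str, chunk_chars: int = 8000) -> str:
    chunks = [transcript[i:i + chunk_chars]
              for i in range(0, len(transcript), chunk_chars)]
    partials = [_ask("Summarize this portion of an incident transcript.", c)
                for c in chunks]
    return _ask("Combine these partial summaries into one incident summary.",
                "\n\n".join(partials))
```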

Partnering with customers

Ahead of the release of these features, we assembled a Customer Advisory Board (CAB) to discuss the use of LLMs and other AI-type tools, potential use-cases for these tools in incident management, and Jeli’s approach to AI that we’ve outlined in part here.

Many of our customers are as excited as we are about the possibilities for these features in the product. Over the next six months we’ll be co-creating Jeli’s AI roadmap together, through new ideas raised in discussion and feedback on our features as they are released.

These discussions have also surfaced the security and compliance concerns our customers have for all of their vendors as AI features roll out across their products. Jeli’s AI features require customers to opt in before they are available for use in the product. Additionally, we are clear about precisely what type of data is sent and where we send it. Jeli does not use customer data for training models, and the data we send to OpenAI is also not used for training.

AI is affecting us whether we like it or not
“During my lunch walk just after our AI chat, I got to a crosswalk at the same time as a self-driving car and held my breath as I crossed in front of it, trying to be the most predictable pedestrian I could possibly be and hoping my behavior matched up to whatever model/algorithm the car was running on (!!!)” -Jeli Engineer

A coworker made an observation that in our use of AI features internally, the model had started training us to give it better data.

There are quirks with LLMs, and these were especially evident as we started out building these features. We found ourselves thinking, “What can I give this machine to make it more predictable?”

We need to understand “AI”: it’s in use around us and will impact us in one way or another. We can leverage these tools to help organizations build better practices of continuous improvement and more effectively respond to their incidents. We also have to be responsible: be thoughtful with our use cases, think about the implications of customers relying on these features, and be grounded in how we communicate their benefits and their shortcomings.

Sign Up for the Private Beta Waitlist

https://www.jeli.io/ai-waitlist

References

1. Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? 🦜. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623.

2. Bender, E. M. (2022). Resisting Dehumanization in the Age of “AI.” 

3. Bender, E. M., & Koller, A. (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5185–5198.

4. O’Brien, M. (2023, August 1). Tech experts are starting to doubt that ChatGPT and A.I. ‘hallucinations’ will ever go away: ‘This isn’t fixable.’ Fortune. https://fortune.com/2023/08/01/can-ai-chatgpt-hallucinations-be-fixed-experts-doubt-altman-openai/

5. 5 learnings from classifying 500k customer messages with LLMs vs traditional ML. (n.d.). Retrieved August 7, 2023, from https://www.trygloo.com/blog/classify-text-llms-learnings

6. Sun, X., Li, X., Li, J., Wu, F., Guo, S., Zhang, T., & Wang, G. (2023). Text classification via large Language Models. In arXiv [cs.CL]. arXiv. http://arxiv.org/abs/2305.08377

7. Carter, P. (2023, May 26). All the Hard Stuff Nobody Talks About when Building Products with LLMs. Honeycomb. https://www.honeycomb.io/blog/hard-stuff-nobody-talks-about-llm