On-call Slack channels are inundated with noisy alerts, often more than even a dedicated on-call engineer can review. Shouldn’t AI be dealing with this stuff already?
With tenure, engineers develop a sense of what’s important, and it’s not uncommon for them to pay attention to only about 1 in 10 of the alerts they receive. In reality, most alerts are not worth anybody’s time, not even an AI’s.
We’ve been working on this problem at Relvy. Our debugging investigation agent now comes with a quick initial assessment module, which looks at an alert’s history, seasonality and correlation to see if it’s worth a deeper look. With this capability, Relvy’s Slack bot can now:
Users can, of course, continue to invoke RelvyAI manually for specific alerts. We are building a future in which Relvy is a dependable first line of defense for the on-call engineer, reducing alert fatigue and improving incident response hygiene. We expect to use what we learn from this first step to help customers manage their alert configurations, improving alerting at the source.
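To make the idea of a quick initial assessment concrete, here is a minimal sketch in Python of how an alert's history, seasonality and correlation could be combined into a "worth a deeper look" decision. This is not Relvy's implementation: the `assess_alert` function, the `Assessment` fields, the thresholds and the 14-day lookback are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Assessment:
    fires_per_day: float       # history: how often this alert fired recently
    same_hour_fraction: float  # seasonality: share of past firings in the same hour of day
    correlated_alerts: int     # correlation: other alerts firing close to this one
    worth_a_look: bool


def assess_alert(
    fired_at: datetime,
    past_firings: List[datetime],
    other_alert_times: List[datetime],
    lookback_days: int = 14,                       # hypothetical default
    window: timedelta = timedelta(minutes=10),     # hypothetical default
) -> Assessment:
    """Illustrative triage rule; all thresholds are assumptions."""
    cutoff = fired_at - timedelta(days=lookback_days)
    recent = [t for t in past_firings if cutoff <= t < fired_at]

    # History: average number of firings per day over the lookback period.
    fires_per_day = len(recent) / lookback_days

    # Seasonality: how many past firings fall in the same hour of day.
    same_hour = sum(1 for t in recent if t.hour == fired_at.hour)
    same_hour_fraction = same_hour / len(recent) if recent else 0.0

    # Correlation: other alerts that fired within the window around this one.
    correlated = sum(1 for t in other_alert_times if abs(t - fired_at) <= window)

    # Assumed rule of thumb: an alert that fires at least daily, mostly at the
    # same hour, and with nothing else firing nearby is treated as routine noise.
    routine = fires_per_day >= 1.0 and same_hour_fraction >= 0.5
    worth_a_look = correlated > 0 or not routine

    return Assessment(fires_per_day, same_hour_fraction, correlated, worth_a_look)
```

Under these assumed rules, an alert that has fired at roughly the same time every night for two weeks is deprioritized unless another alert fires close to it, while an alert with no history is always surfaced.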
This feature is in early access preview, with support for alerts from the Datadog and PagerDuty Slack bots. Please reach out to us at hello at Relvy dot ai if you’d like to test it out.