More and more observability companies are introducing GenAI to reduce the workload of on-call engineers maintaining production software. Relvy released its own troubleshooting platform to discover and cooperatively debug software incidents. We see the industry moving beyond saving engineers time on debugging and towards preventing software downtime altogether. As of 2024, that downtime costs Global 2000 companies $400 billion per year.
Troubleshooting today.
Existing observability companies like Datadog, along with newcomers built on top of GenAI, are seeking to reduce DevOps spend and the fatigue engineers face debugging and maintaining production software. The big goals are savings on engineering personnel and a shorter time to find the root cause of incidents. Site reliability and on-call engineers spend on average 30 minutes debugging a production incident; these GenAI Ops systems can cut the investigation into an incident’s root cause down to a matter of minutes. Engineers can instead spend more time innovating on their companies’ infrastructure and building new software.
Enter automated, 24/7 discovery and debugging of software incidents. Full troubleshooting.
Relvy released its own agentic AI troubleshooting platform, which can monitor, discover and automatically debug production incidents 24/7; a new paradigm in troubleshooting is here. Relvy doesn’t just benefit SREs and on-call engineers by automating debugging. It also greatly reduces escalations and downtime of the software systems themselves by sorting through the alert noise and uncovering real incidents so they can be dealt with in a timely fashion. In addition to automatically querying and sorting alerts, Relvy includes a co-pilot that lets engineers ask questions and direct Relvy in the debugging process as it’s happening.
According to a recent study from Oxford Economics and Splunk, a growing problem facing Global 2000 companies is the amount they spend as a result of software downtime: $400B a year. The survey was conducted with CFOs and CMOs, who could speak to the financial and brand fallout of such outages. The $400B figure represents 8.89% of the average annual profit, or just under 1% (0.774%) of the annual run rate, of Global 2000 companies.
What effect could automated, directed debugging and 24/7 discovery of software incidents have on this downtime cost? Now? And in the future?
As the role of the software engineer changes and AI tools like GitHub Copilot or Cursor make writing code more efficient, it’s generally believed that the sheer amount of software will grow, as will its complexity. We believe it will become necessary to use AI not only to write code but also to identify and debug potential issues.
The Oxford Economics study breaks down the different costs that make up this total $400B.
48% Lost revenue
11% Regulatory fees
8% SLA Penalties
7.5% Settlement / legal costs
7% Brand trust campaign
6.5% Lost productivity
5.5% Ransomware payouts
5.5% Additional infrastructure capacity
5.5% Overtime wages
5% Cyberinsurance premiums
4.5% Recovering from backups
4% Extortion payments
Though “5.5% Ransomware payouts” and “4% Extortion payments” may result from compromised employees or phishing attacks, that leaves 90.5% of these outage costs that could be greatly curtailed if issues are automatically discovered and debugged. That share represents roughly $362B every year, a huge potential savings for the bulk of revenue-earning companies.
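For readers who want to check the arithmetic behind that estimate, here is a minimal sketch. It only uses the figures quoted above ($400B total, with the ransomware and extortion shares set aside); the variable and function names are our own illustrative choices, not part of the study or of Relvy's product.

```python
# Figures quoted in the text above; names are illustrative assumptions.
TOTAL_DOWNTIME_COST_B = 400.0  # annual downtime cost, in billions of USD

# Categories the text sets aside as security-driven rather than
# something automated discovery and debugging would address.
EXCLUDED_SHARES_PCT = {
    "Ransomware payouts": 5.5,
    "Extortion payments": 4.0,
}


def addressable_share_pct(excluded):
    """Percentage of the total left after removing the excluded shares."""
    return 100.0 - sum(excluded.values())


def addressable_cost_b(total_b, excluded):
    """Cost in billions of USD that corresponds to the remaining share."""
    return total_b * addressable_share_pct(excluded) / 100.0


share = addressable_share_pct(EXCLUDED_SHARES_PCT)
cost = addressable_cost_b(TOTAL_DOWNTIME_COST_B, EXCLUDED_SHARES_PCT)
print(f"{share:.1f}% of ${TOTAL_DOWNTIME_COST_B:.0f}B is about ${cost:.0f}B")
```

This is an upper bound on what could be affected, under the assumption that every remaining category is sensitive to faster discovery and debugging; it says nothing about how much of that cost would actually be avoided.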
About Relvy.
Our cost-effective, custom-tuned language models operate at 1/200th the cost of existing foundation models, making 24/7 agentic AI monitoring and debugging a reality. Get started instantly and see how Relvy can drastically reduce debugging time and costs, transforming your engineering processes today.
https://www.relvy.ai/get-started