Nearly every company has embarked on one or more LLM (Large Language Model) deployments. These range from content generation and customer support to code assistance and document processing, to name a few. AI Agents can now autonomously conduct multi-turn queries and actions to iteratively solve a given problem. But the big question remains: how should you start using them?
We see three main approaches companies are taking:
The first approach is to build your own AI agents in-house on open-source LLMs. The main cost to consider with agents is external LLM API calls, since commercial providers charge for every query. AI agents that need multiple turns of queries to reason their way to a result can get expensive: running an AI agent that conducts root cause analysis of a reported software issue can cost more than 10X what a single-turn chatbot costs. From our own experience, these agents need to perform 10 or more queries and actions before reaching a solution (a back-of-the-envelope cost sketch follows the lists below).

Open-source LLMs such as Meta's Llama or Mistral's models are great choices if you want to completely control and secure an LLM project. Since the model runs on your own infrastructure, you also avoid the API markup, though there is still a notable infrastructure cost while the models run. The bigger challenge to building AI agents internally is the expertise required to fine-tune and maintain the LLMs. Smaller language models with fewer than 8 billion parameters bring big operating-cost savings, but they are unlikely to work well out of the box, so they demand highly specialized ML expertise.
Advantages:
- Complete control over security and data, since the models run on your own infrastructure
- No API markup on every query
Disadvantages:
- Notable infrastructure cost while the models run
- Requires in-house expertise to fine-tune and maintain the LLMs, especially smaller models that rarely work well out of the box
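To make that cost gap concrete, here is a rough back-of-the-envelope sketch in Python. The per-token prices and token counts are illustrative assumptions, not any provider's actual price list; the point is the structure of the math, not the exact numbers.

```python
# Back-of-the-envelope comparison: single-turn chatbot vs. multi-turn agent.
# All prices and token counts are illustrative assumptions.

PRICE_PER_INPUT_TOKEN = 5.00 / 1_000_000    # USD, assumed API rate
PRICE_PER_OUTPUT_TOKEN = 15.00 / 1_000_000  # USD, assumed API rate

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one LLM API call."""
    return (input_tokens * PRICE_PER_INPUT_TOKEN
            + output_tokens * PRICE_PER_OUTPUT_TOKEN)

# Single-turn chatbot: one prompt in, one answer out.
chatbot = call_cost(input_tokens=1_000, output_tokens=500)

# Multi-turn agent: ~10 calls, and the input grows each turn because the
# whole conversation history (plus tool outputs) is resent on every call.
agent = sum(
    call_cost(input_tokens=1_000 + 1_000 * turn, output_tokens=500)
    for turn in range(10)
)

print(f"Chatbot task: ${chatbot:.4f}")   # ~$0.01
print(f"Agent task:   ${agent:.4f}")     # ~$0.35
print(f"Ratio:        {agent / chatbot:.0f}x")
```

Note that the ratio comes out well above a naive 10x: because the full history is resent on every turn, input tokens grow faster than linearly with the number of turns, which is why multi-turn agents get expensive so quickly.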
The second approach is to buy a general-purpose AI agent. Lately, a number of recently funded AI Agent vendors have focused on replicating many of the tasks of a particular employee's job. They typically tune and test on commercial LLMs such as OpenAI's GPT or Anthropic's Claude, and their products are accessed as SaaS applications. Models of that scale are needed to replicate much of what an employee can do, such as the skills and competencies of a software engineer. You can generally try these out as a proof of concept (POC) to confirm the potential ROI, and since the vendor's own expert personnel maintain the product, that ROI can be substantial. The main challenges are that these vendors need a great deal of data and access to many of your systems to achieve results, and that the operating cost of any LLM is commensurate with its parameter count: an application like this may need hundreds of billions of parameters or more to be effective (a sketch of the agent loop these products run follows the lists below).
Advantages:
- Maintained by the vendor's own expert personnel
- Easy to trial as a POC, with potentially substantial ROI
Disadvantages:
- Requires a great deal of your data and access to many of your systems
- High operating cost, since an effective general-purpose agent may need a model with hundreds of billions of parameters
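For context, here is the kind of multi-turn loop these products run against a commercial LLM API. This is a generic sketch of the pattern, not any specific vendor's implementation; the model name and the lookup_logs tool are placeholders.

```python
# Generic sketch of a multi-turn agent loop on a commercial LLM API.
# Illustrative only; the model and the lookup_logs tool are placeholders.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_logs",  # hypothetical tool the agent can call
        "description": "Fetch recent error logs for a service.",
        "parameters": {
            "type": "object",
            "properties": {"service": {"type": "string"}},
            "required": ["service"],
        },
    },
}]

messages = [{"role": "user", "content": "Why is checkout returning 500s?"}]

# Each loop iteration is a separately billed API call; in our experience,
# real agent runs often take 10 or more before converging on an answer.
for _ in range(10):
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    message = response.choices[0].message
    if not message.tool_calls:
        print(message.content)  # the agent has a final answer
        break
    messages.append(message)
    for call in message.tool_calls:
        args = json.loads(call.function.arguments)
        result = f"stub logs for {args['service']}"  # run the real tool here
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": result,
        })
```

Every pass through that loop is a separately billed call against a frontier-scale model, which is where the high operating cost comes from.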
The third approach is to buy a specialized AI agent. Some vendors focus on replicating only one small aspect of a specific employee's job, for example automating root cause analysis when debugging software. These vendors use Small Language Models (SLMs), generally 8 billion parameters or fewer, which can run on as little as a single g2-standard-8 instance on Google Cloud Platform (GCP) at roughly $0.80 per hour of operation, a fraction of the operating cost of larger models (a minimal self-hosting sketch follows the lists below). Fine-tuning and data collection are also simpler, because the narrower use case needs a smaller dataset. That said, these models still require a great deal of initial training and fine-tuning by skilled personnel to be effective and consistently reliable.
Advantages:
- A fraction of the operating cost of larger models
- Simpler fine-tuning and data collection, thanks to the narrower use case and smaller dataset requirements
Disadvantage:
- Still requires substantial initial training and fine-tuning by skilled personnel to be effective and consistently reliable
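As a rough illustration of why the hosting cost is so much lower, here is a minimal sketch of serving an ~8B open model with vLLM on a single-GPU instance. The model choice, settings, and prompt are assumptions for illustration, not Relvy's actual stack.

```python
# Minimal sketch: self-hosting an ~8B SLM with vLLM on one GPU, e.g. a
# GCP g2-standard-8 (one 24 GB NVIDIA L4). Model and settings are assumed
# for illustration only.
from vllm import LLM, SamplingParams

# ~8B parameters in 16-bit weights is roughly 16 GB, so it fits on one L4.
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct", dtype="float16")

params = SamplingParams(temperature=0.2, max_tokens=256)

prompt = (
    "Suggest a likely root cause for these log lines:\n"
    "ERROR checkout-svc: connection pool exhausted\n"
    "WARN  checkout-svc: retrying payment gateway (attempt 5)\n"
)

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```

At roughly $0.80/hour for the instance, cost tracks hardware utilization rather than per-token API pricing, so an agent that makes 10 or more calls per task costs no more to run than one that makes a single call, as long as the machine keeps up with the load.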
We would love to hear your thoughts and experiences deploying AI Agents in your company. There are a lot of changes to come, so we'll update this post as we see new capabilities and new types of vendors emerge.
Our mission at Relvy is to automate software reliability and empower software engineers to spend more time innovating. If your team is spending too much time and resources troubleshooting software, let our AI agent do the work.