Nearly every company has embarked on one or more LLM (Large Language Model) deployments. These range from content generation and customer support to code assistance and document processing, to name a few. AI Agents can now autonomously conduct multi-turn queries and actions to iteratively solve a given problem. But the big question remains: how should you start using them?
We see three main approaches companies are taking:
The first approach is to build your own AI agents in-house on open-source LLMs. The main cost to consider with agents is external LLM API calls, since commercial providers charge for every query. AI agents that need multiple turns of queries to reason their way to a result can get expensive: running an AI agent that conducts root cause analysis of a reported software issue can cost more than 10X what a single-turn chatbot costs. From our own experience, these agents need to perform 10 or more queries and actions before reaching a solution (a back-of-the-envelope cost sketch follows the lists below).

Open-source LLMs such as Meta's Llama or Mistral's models are great choices if you want to completely control and secure an LLM project. Since the model runs on your own infrastructure, you also avoid the API markup, though there is still a notable infrastructure cost while the models run. The bigger challenge to building AI agents internally is the expertise required to fine-tune and maintain the LLMs. Smaller language models with fewer than 8 billion parameters bring big operating-cost savings, but they are unlikely to work well out of the box, so they demand highly specialized ML expertise.
Advantages:
- Complete control over security and data, since the models run on your own infrastructure
- No API markup on every query
Disadvantages:
- Notable infrastructure cost while the models run
- Requires in-house expertise to fine-tune and maintain the LLMs, especially smaller models that rarely work well out of the box
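To make that cost gap concrete, here is a rough back-of-the-envelope sketch in Python. The per-token prices and token counts are illustrative assumptions, not any provider's actual price list; the point is the structure of the math, not the exact numbers.

```python
# Back-of-the-envelope comparison: single-turn chatbot vs. multi-turn agent.
# All prices and token counts are illustrative assumptions.

PRICE_PER_INPUT_TOKEN = 5.00 / 1_000_000    # USD, assumed API rate
PRICE_PER_OUTPUT_TOKEN = 15.00 / 1_000_000  # USD, assumed API rate

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one LLM API call."""
    return (input_tokens * PRICE_PER_INPUT_TOKEN
            + output_tokens * PRICE_PER_OUTPUT_TOKEN)

# Single-turn chatbot: one prompt in, one answer out.
chatbot = call_cost(input_tokens=1_000, output_tokens=500)

# Multi-turn agent: ~10 calls, and the input grows each turn because the
# whole conversation history (plus tool outputs) is resent on every call.
agent = sum(
    call_cost(input_tokens=1_000 + 1_000 * turn, output_tokens=500)
    for turn in range(10)
)

print(f"Chatbot task: ${chatbot:.4f}")   # ~$0.01
print(f"Agent task:   ${agent:.4f}")     # ~$0.35
print(f"Ratio:        {agent / chatbot:.0f}x")
```

Note that the ratio comes out well above a naive 10x: because the full history is resent on every turn, input tokens grow faster than linearly with the number of turns, which is why multi-turn agents get expensive so quickly.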
The second approach is to buy a general-purpose AI agent. Lately, a number of recently funded AI Agent vendors have focused on replicating many of the tasks of a particular employee's job. They typically tune and test on commercial LLMs such as OpenAI's GPT or Anthropic's Claude, and their products are accessed as SaaS applications. Models of that scale are needed to replicate much of what an employee can do, such as the skills and competencies of a software engineer. You can generally try these out as a proof of concept (POC) to confirm the potential ROI, and since the vendor's own expert personnel maintain the product, that ROI can be substantial. The main challenges are that these vendors need a great deal of data and access to many of your systems to achieve results, and that the operating cost of any LLM is commensurate with its parameter count: an application like this may need hundreds of billions of parameters or more to be effective (a sketch of the agent loop these products run follows the lists below).
Advantages:
- Maintained by the vendor's own expert personnel
- Easy to trial as a POC, with potentially substantial ROI
Disadvantages:
- Requires a great deal of your data and access to many of your systems
- High operating cost, since an effective general-purpose agent may need a model with hundreds of billions of parameters
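For context, here is the kind of multi-turn loop these products run against a commercial LLM API. This is a generic sketch of the pattern, not any specific vendor's implementation; the model name and the lookup_logs tool are placeholders.

```python
# Generic sketch of a multi-turn agent loop on a commercial LLM API.
# Illustrative only; the model and the lookup_logs tool are placeholders.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_logs",  # hypothetical tool the agent can call
        "description": "Fetch recent error logs for a service.",
        "parameters": {
            "type": "object",
            "properties": {"service": {"type": "string"}},
            "required": ["service"],
        },
    },
}]

messages = [{"role": "user", "content": "Why is checkout returning 500s?"}]

# Each loop iteration is a separately billed API call; in our experience,
# real agent runs often take 10 or more before converging on an answer.
for _ in range(10):
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    message = response.choices[0].message
    if not message.tool_calls:
        print(message.content)  # the agent has a final answer
        break
    messages.append(message)
    for call in message.tool_calls:
        args = json.loads(call.function.arguments)
        result = f"stub logs for {args['service']}"  # run the real tool here
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": result,
        })
```

Every pass through that loop is a separately billed call against a frontier-scale model, which is where the high operating cost comes from.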
The third approach is to buy a specialized AI agent. Some vendors focus on replicating only one small aspect of a specific employee's job, for example automating root cause analysis when debugging software. These vendors use Small Language Models (SLMs), generally 8 billion parameters or fewer, which can run on as little as a single g2-standard-8 instance on Google Cloud Platform (GCP) at roughly $0.80 per hour of operation, a fraction of the operating cost of larger models (a minimal self-hosting sketch follows the lists below). Fine-tuning and data collection are also simpler, because the narrower use case needs a smaller dataset. That said, these models still require a great deal of initial training and fine-tuning by skilled personnel to be effective and consistently reliable.
Advantages:
- A fraction of the operating cost of larger models
- Simpler fine-tuning and data collection, thanks to the narrower use case and smaller dataset requirements
Disadvantage:
- Still requires substantial initial training and fine-tuning by skilled personnel to be effective and consistently reliable
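As a rough illustration of why the hosting cost is so much lower, here is a minimal sketch of serving an ~8B open model with vLLM on a single-GPU instance. The model choice, settings, and prompt are assumptions for illustration, not Relvy's actual stack.

```python
# Minimal sketch: self-hosting an ~8B SLM with vLLM on one GPU, e.g. a
# GCP g2-standard-8 (one 24 GB NVIDIA L4). Model and settings are assumed
# for illustration only.
from vllm import LLM, SamplingParams

# ~8B parameters in 16-bit weights is roughly 16 GB, so it fits on one L4.
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct", dtype="float16")

params = SamplingParams(temperature=0.2, max_tokens=256)

prompt = (
    "Suggest a likely root cause for these log lines:\n"
    "ERROR checkout-svc: connection pool exhausted\n"
    "WARN  checkout-svc: retrying payment gateway (attempt 5)\n"
)

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```

At roughly $0.80/hour for the instance, cost tracks hardware utilization rather than per-token API pricing, so an agent that makes 10 or more calls per task costs no more to run than one that makes a single call, as long as the machine keeps up with the load.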
We would love to hear your thoughts and experiences deploying AI Agents in your company. There are a lot of changes to come, so we'll update this post as we see new capabilities and new types of vendors emerge.
Our mission at Relvy is to automate software reliability and empower software engineers to spend more time innovating. If your team is spending too much time and resources troubleshooting software, let our AI agent do the work.