Two ways to make AI agents suck less
ai
Feb 5, 2025
Despite widespread hype, AI agents still struggle with reliability, but integration frameworks and API toolkits are emerging as practical ways to make them dependable for everyday use.
No one will stop talking about it. Billions of dollars are being poured into it. Ground is being broken worldwide to build enormous data centers just to keep it running. AI has long been expected to transform industry forever. And, to some extent, it already has—although not at the pace that some have hoped (or feared).
For software engineers in particular, AI agents have been positioned as a cure-all for tedious coding tasks, debugging, and even decision-making. But there’s one tiny problem: they don’t always work. They’re bogged down by hallucinations, inconsistent function execution, integration issues, and other problems we’ll get into shortly. Does this mean that AI agents are a flop? Of course not.
Keep in mind, when a few Frenchmen had the idea to attach pedals to the bicycle in the 1860s, they didn’t revolutionize transportation overnight. Their initial design was known as the “boneshaker” because its wheels were made of wood and metal—which delegated the task of suspension to the rider’s skeleton. Only decades later would the pedal bike be refined into more or less what we ride today.
Fortunately, the pace of technological development is more rapid today than it was in the 1860s. Developers don’t need to wait decades before AI agents are refined. In fact, there are already two ways to make AI agents more reliable:
#1: Use AI agent integration frameworks
Say you hire an intern for a development job and, on day one, you tell them to go improve your company’s proprietary code. Now, instead of a working product, you have a big, beautiful 404 screen for your clients to enjoy. AI agents are a bit like interns. And, like interns, they need some onboarding before they can be trusted with anything important.
AI agent integration frameworks are that onboarding. These platforms provide parameters, execution flows, and validation layers—think of them as rules, in plain English—that make AI agents more predictable.
Take an AI agent tasked with interacting with Gmail. If you were to just set it loose and say “hey, do email!”, it would succeed only 40-50% of the time at best. Otherwise, it’ll send emails to the wrong people, land you in spam folders, or, God forbid, reply all. An AI integration framework would fix this by establishing clear rules for the agent to follow: who to email, when to email them, what information shouldn’t go into an email, and so on.
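To make that concrete, here’s a minimal sketch of what such a validation layer could look like. Everything in it (the EmailAction shape, the rule values, the stand-in send step) is hypothetical and invented for illustration; real integration frameworks express the same idea through their own schemas and policies.

```python
from dataclasses import dataclass

# Hypothetical shape for an action the agent proposes; real frameworks
# define their own schemas for this.
@dataclass
class EmailAction:
    recipients: list[str]
    subject: str
    body: str
    reply_all: bool = False

ALLOWED_DOMAINS = {"example.com"}          # who the agent may email
FORBIDDEN_TERMS = {"api_key", "password"}  # what must never leave the building

def validate(action: EmailAction) -> list[str]:
    """Check a proposed email against plain-English rules; return violations."""
    violations = []
    if action.reply_all:
        violations.append("reply-all is never allowed")
    for recipient in action.recipients:
        if recipient.split("@")[-1] not in ALLOWED_DOMAINS:
            violations.append(f"{recipient} is outside the allowed domains")
    text = (action.subject + " " + action.body).lower()
    for term in FORBIDDEN_TERMS:
        if term in text:
            violations.append(f"email contains a forbidden term: {term}")
    return violations

def execute(action: EmailAction) -> None:
    violations = validate(action)
    if violations:
        # Instead of letting a bad email out the door, hand the violations
        # back to the agent so it can revise and retry.
        raise ValueError("; ".join(violations))
    print(f"Sending to {action.recipients}: {action.subject}")  # stand-in for a real mail client

# An action that breaks three rules at once never reaches an inbox:
bad = EmailAction(["ceo@rival.com"], "Q3 numbers", "the password is hunter2",
                  reply_all=True)
print(validate(bad))
```

The specific rules don’t matter; the point is that the agent’s proposed action is checked deterministically before anything irreversible happens.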
Integration frameworks exist because AI agents can’t consistently execute tasks in real-world environments without guidance. They may be able to generate code, fetch data, or interact with APIs in isolation—but without clear instructions, it’s all just educated guessing. The agent might do too little, too much, or something completely unrelated, like responding to a calendar invite with a grocery list.
To sum up, integration frameworks set rules that keep AI agents on-task. Because, let’s face it, nobody wants an assistant that replies all.
#2: Use AI agent toolkits from API providers
As we’ve established, AI agents need rules to navigate your infrastructure. Otherwise, you’re asking for a bull-in-a-china-shop scenario. But what happens when these same agents need to interact with external infrastructures—specifically, APIs? The bad news is that they need to do this all the time. The even worse news is that they’re terrible at it.
Left to their own devices, AI agents are often forced to guess how an API works. And guessing is not good enough. That’s where AI agent toolkits come into play, making API interactions more reliable. Let’s break it down.
Think of APIs as languages that allow AI agents to communicate with external applications, such as payment gateways, databases, and so on. Since APIs are essentially a foreign language to AI agents, they may not understand key rules. Or, worse yet, they may think they know them when they actually don’t. AI agent toolkits eliminate this uncertainty, acting as a phrasebook that ensures the agent “speaks” the API’s language correctly. So, instead of gibberish, it delivers structured requests that the API can actually understand and act on.
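As a sketch of what one entry in that phrasebook can look like, here’s a tool schema in the JSON Schema, function-calling style that most LLM providers support. The payment operation and its fields are hypothetical; a real API’s toolkit would publish the exact names, types, and constraints.

```python
# One hypothetical "phrasebook entry": a tool schema in the JSON Schema
# style that most LLM function-calling APIs accept. The model can only
# fill in arguments that fit this shape, so it can't hallucinate fields
# or invent endpoints.
create_payment_tool = {
    "name": "create_payment",
    "description": "Charge a customer a fixed amount in a supported currency.",
    "parameters": {
        "type": "object",
        "properties": {
            "amount_cents": {"type": "integer", "minimum": 1},
            "currency": {"type": "string", "enum": ["usd", "eur"]},
            "customer_id": {"type": "string"},
        },
        "required": ["amount_cents", "currency", "customer_id"],
    },
}
```

The agent’s job shrinks from ‘figure out this API’ to ‘fill in three well-typed blanks’, which is a much easier thing to get right.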
Some major API providers have already figured this out. For instance, Stripe offers a toolkit for its payments API that tells AI agents exactly which functions to call (rather than leaving them to generate raw requests or puzzle out documentation on their own). Coinbase similarly offers AgentKit, which lets agents execute crypto transactions reliably. The idea behind such AI agent toolkits is that they’re purpose-built for specific APIs, which makes integration far more feasible.
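To show the general shape (and only the shape: none of the names below come from Stripe’s or Coinbase’s actual SDKs), here’s a toy toolkit that maps the handful of tool names a model is allowed to use onto exact, pre-built calls, and refuses everything else.

```python
from typing import Any, Callable

# A toy toolkit: every tool name maps to one exact, pre-built call.
# All names here are hypothetical stand-ins, not any provider's real API.
class PaymentsToolkit:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self._tools: dict[str, Callable[..., Any]] = {
            "create_payment_link": self._create_payment_link,
        }

    def _create_payment_link(self, amount_cents: int, currency: str) -> str:
        # A real toolkit would call the provider's API here.
        return f"https://pay.example.com/link?amt={amount_cents}&cur={currency}"

    def dispatch(self, tool_name: str, arguments: dict[str, Any]) -> Any:
        if tool_name not in self._tools:
            raise ValueError(f"unknown tool: {tool_name}")  # no improvising
        return self._tools[tool_name](**arguments)

# The model's output reduces to a tool name plus structured arguments:
toolkit = PaymentsToolkit(api_key="sk_test_xxx")
print(toolkit.dispatch("create_payment_link",
                       {"amount_cents": 2000, "currency": "usd"}))
```

Because the toolkit, not the model, owns the request format, making sense of the documentation stops being the agent’s problem.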
AI agents: Less hype, more practical execution
We know there’s a lot of hype around the transformational potential of AI. Some of it is disturbingly utopian, promising infinite productivity and a world where no one ever works again. Some of it is equally apocalyptic, warning that humanity is doomed to a mindless WALL-E-like purgatory where machines take over.

But even a cursory look at where AI agents stand so far—and the sheer amount of manual intervention they require to function—suggests that the hype is overblown. When bicycles first became popular, doctors warned that excessive riding would permanently disfigure one’s face (for dubious and deeply problematic reasons). New technology doesn’t just have practical limits; it stirs moral panic. Even today, a proposed bike lane can trigger visceral political reactions.
For AI agents, the issue isn’t binary—whether they’ll take over the world or be completely useless. It’s how to make them actually work in practice. And, right now, AI agents fail because they hallucinate, misfire, and integrate poorly. Are integration frameworks and agent toolkits the be-all and end-all? Not necessarily, but they’re certainly a step in the right direction.
AI may seem inhuman, but it won’t improve without the inherently human process of iteration—something that’s no less important today than it was in the 1860s. So, if humans could figure out that bicycle tires should be made of rubber instead of wood, we’ll find a way to stop AI agents from hitting ‘reply all.’