The journey to fully autonomous AI agents and the venture capitalists funding them

The term “agentic AI,” or “artificial intelligence agents,” is rapidly becoming commonplace, so much so that those invested in the technology see a need to draw distinctions. 

In a series of blog posts published last week, partners at venture capital firm Menlo Ventures (which has bankrolled artificial intelligence startups such as Anthropic) define “the next wave of agents” and explain how they surpass the agents introduced so far.

Tomorrow’s agents, they write, have four distinct capabilities.

“Fully autonomous agents are defined by four elements that, in combination, ladder up to full agentic capability: reasoning, external memory, execution, and planning,” write the authors. 

“To be clear, the fully autonomous agents of tomorrow might possess all four building blocks, but today’s LLM apps and agents do not,” they declare.

The authors, Tim Tully, Joff Redfern, Deedy Das, and Derek Xiao, explore in their first blog post what it means for something to be “agentic.” The software, they write, must ultimately gain greater and greater autonomy in selecting between possible steps to take to solve a problem. 

“Agents emerge when you place the LLM in the control flow of your application and let it dynamically decide which actions to take, which tools to use, and how to interpret and respond to inputs,” the authors write.

A conventional large language model can have access to “tools,” such as external programs that let the LLM perform a task. Anthropic has already done this with its Tool Use feature, and OpenAI has something similar.

However, the authors explain that invoking a tool merely gives an LLM means to solve a problem, not the control to decide the way a problem should be solved. 

As the authors write, “Tool use is powerful, but by itself, [it] cannot be considered ‘agentic.’ The logical control flows remain pre-defined by the application.” Rather, the agent must have a broad ability to choose which tool will be used, a decision logic. 
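The distinction the authors draw can be sketched in a few lines of Python. This is a minimal illustration, not anything taken from the Menlo Ventures posts: the two tools, the `TOOLS` registry, and `stub_model_choose_tool` are all hypothetical names, and the stub is a keyword check standing in for a real LLM call so that the example runs offline.

```python
from typing import Callable, Dict


# Hypothetical tools; names and behavior are illustrative only.
def lookup_exchange_rate(query: str) -> str:
    return f"rate lookup for: {query}"


def search_ledger(query: str) -> str:
    return f"ledger search for: {query}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_exchange_rate": lookup_exchange_rate,
    "search_ledger": search_ledger,
}


def fixed_pipeline(query: str) -> str:
    """Tool use without agency: the application hard-codes which tool runs."""
    return search_ledger(query)


def stub_model_choose_tool(query: str) -> str:
    """Stand-in for an LLM call that returns a tool name (an assumption).

    A keyword check is used here only so the sketch runs without a model.
    """
    return "lookup_exchange_rate" if "rate" in query.lower() else "search_ledger"


def agentic_step(
    query: str, choose: Callable[[str], str] = stub_model_choose_tool
) -> str:
    """The chooser's output, not the application, decides which tool runs."""
    tool_name = choose(query)
    if tool_name not in TOOLS:
        raise ValueError(f"model chose unknown tool: {tool_name}")
    return TOOLS[tool_name](query)
```

In `fixed_pipeline` the control flow is pre-defined by the application, which is the case the authors say cannot be considered agentic. In `agentic_step` the choice of tool is delegated to whatever is passed as `choose`, which in a real system would be the model.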

A few versions of software come closer to being true agents, the authors explain. One is a “decisioning agent,” which uses the large language model to pick from among a suite of rules that in turn decide which tool should be used. They cite healthcare software startup Anterior as an example of such a decisioning system.

[Figure: decisioning agent. Credit: Menlo Ventures]

Next, a higher-order agent, called an “agent on rails,” is “given higher-order goals to achieve (e.g., ‘reconcile this invoice with the general ledger’),” they write. The program is granted more latitude in matching the high-level request to the sets of rules it should follow.

Multiple startups are pursuing this “agent on rails” approach, the authors note, including customer service firm Sierra and software development firm All Hands AI.

[Figure: agent on rails. Credit: Menlo Ventures]

The third, highest level of agentic AI, the holy grail, as they put it, has “dynamic reasoning” and a “custom code generation” that allows the large language model to “subsume” the rulebook of the company. This kind of approach, known as a “general AI agent,” is still in the research phase, the authors note. Examples include Devin, the “first AI software engineer,” created by startup Cognition.

In the second blog post, “Beyond Bots: How AI Agents Are Driving the Next Wave of Enterprise Automation,” the authors reflect on how agentic AI will be applied in enterprises. 

The immediate impact, they write, is to move beyond “robotic process automation,” or RPA, meaning tools sold by firms such as UiPath and Zapier that replace some basic human tasks with software.

The decisioning agents and agents on rails explored in the first post find practical applications in business tasks, such as reconciling supplier invoices to a general ledger:

Let’s say a company needs to reconcile an invoice from an international supplier against its ledger. This process involves multiple considerations, including invoice currency, ledger currency, transaction date, exchange rate fluctuations, cross-border fees, and bank fees, all of which must be retrieved and calculated together to reconcile payments. Agents are capable of this type of intelligence, whereas an RPA agent might just escalate the case to a human.
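The arithmetic behind that reconciliation can be sketched as follows. This is only an illustration under stated assumptions: the `Invoice` and `LedgerPayment` types, the field names, the one-cent tolerance, and the convention that the ledger records the converted amount plus both fees are all hypothetical, and the figures in the usage note are made up for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Invoice:
    amount: Decimal  # in the invoice currency
    currency: str


@dataclass
class LedgerPayment:
    amount: Decimal  # total recorded in the ledger currency
    currency: str
    fx_rate: Decimal  # ledger units per one invoice unit on the transaction date
    cross_border_fee: Decimal  # in the ledger currency
    bank_fee: Decimal  # in the ledger currency


def expected_ledger_amount(invoice: Invoice, payment: LedgerPayment) -> Decimal:
    """Convert the invoice at the transaction-date rate, then add both fees.

    Assumes the ledger entry records the full outflow, fees included.
    """
    converted = invoice.amount * payment.fx_rate
    return converted + payment.cross_border_fee + payment.bank_fee


def reconcile(
    invoice: Invoice,
    payment: LedgerPayment,
    tolerance: Decimal = Decimal("0.01"),
) -> bool:
    """True when the ledger entry matches the expected amount within tolerance."""
    difference = abs(expected_ledger_amount(invoice, payment) - payment.amount)
    return difference <= tolerance
```

With a hypothetical invoice of 1,000 EUR, a rate of 1.10 USD per EUR, a cross-border fee of 15 USD, and a bank fee of 5 USD, `expected_ledger_amount` comes to 1,120.00 USD, so a ledger entry of that amount reconciles. The calculation itself is simple; the harder part, which the authors attribute to agents, is retrieving each of these inputs from the right system before computing.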

The main thrust of the blog post is that numerous startups are already selling things that approach such higher agentic functions. They “aren’t just science fiction, either,” they write. “Although the category is still emerging, enterprises from startups to Fortune 500 companies are already buying and leveraging these systems at scale.”

The authors offer a handy chart of the numerous offerings, organized by the degree of autonomy of the agent programs along one axis and by the degree of vertical or horizontal market focus along the other:

[Figure: AI agent marketplace chart. Credit: Menlo Ventures]

Not covered in the two blog posts are two key limitations that have cropped up in existing generative AI (gen AI) systems and threaten to stymie the progress of agents. 

First, there is no substantial discussion by the authors on how to deal with hallucinations, confidently asserted false output. Whatever the reasoning process used by gen AI, and however formidable the tools, there is no reason to suppose that AI agents won’t still generate erroneous outputs like conventional chatbots. 

At the very least, whether decisioning agents and agents on rails diminish hallucinations remains an open research question.

Second, while agentic AI can conceivably automate a number of corporate processes, there is to date very little data on the effect of that automation and whether it is truly an improvement. That is partly connected to the first point about hallucinations, but not entirely. An agent that is not wrong in its reasoning or actions can still lead to outcomes that are suboptimal versus what a person would do. 

A prominent example is discussed in the book “AI Snake Oil” by Princeton computer science scholars Arvind Narayanan and Sayash Kapoor, published this month by Princeton University Press. An AI model tracked the history of patients with asthma who presented with symptoms of pneumonia when entering the hospital. The AI model found they were among the patients with the lowest risk in the hospital population. Using that “reasoning,” such patients could be discharged.

Yet the model missed the causal connection: patients with asthma and symptoms of pneumonia were least risky because they received emergency care. Simply discharging them would have bypassed such care, and the results could have been “catastrophic,” Narayanan and Kapoor declare.

It’s that kind of reliance on correlation instead of causality that can lead to vastly suboptimal results in real-world settings with complex causal relationships.

Also left out of the authors’ scope of discussion are agents that collaborate. As HubSpot CTO Dharmesh Shah told ZDNET recently, the future work of agentic AI will not be done by a single agent but likely by networks of AI agents collaborating with one another.

Given those omissions, it’s pretty clear that despite the sweep of the venture capitalists’ research, they have only scratched the surface of what will be achieved in a world of increasingly powerful AI agents.




