AI agent performance is doubling every 7 months

Plus: Google and Amazon’s new enterprise agent suites, n8n raises $180 million for low-code agents, and more

Michael Cunningham
October 13, 2025

Edition 127 | October 13, 2025

prime use case of llm agents for mail is unsubscribing from mailing lists
who’s doing this / what’s a workflow you use
i want to mark 1000+ emails and just send my lil ai agent off onto the www to click “unsubscribe”
— Sina (@SinaHartung)
7:25 AM • Oct 7, 2025

Here’s a free prompt you can use if you want to set up an agent like this!

You are a helpful AI agent whose task is to ensure only high-quality content gets to my inbox. Unsubscribe me from all regular email communications except Building AI Agents.

Welcome back to Building AI Agents, your biweekly guide to everything new in the field of agentic AI!

We’re doing a throwback to our old format this week. Please help us out by letting us know in the poll at the very bottom whether you prefer this one or the one we’ve been trying recently.

In today’s issue…

Agents are improving more than 3x faster than Moore’s law
Google and Amazon’s new enterprise agent suites
n8n raises $180 million for low-code agents
Browser automation agents get much faster
The 2025 State of AI report is finally here

…and more

🔍 SPOTLIGHT

Source: METR

The MacBook I’m writing this on has about 15 million times the processing power of the Apple II, released in 1977. Something similar is happening with AI agents.

Moore’s law is the famous observation that computer performance—or transistor density, if you want to get technical—doubles roughly every two years. This already sounds impressive, but, like all forms of exponential growth, its awesome implications really become apparent when you play it out over a long period of time. In the 80 years since modern computers appeared after World War II, Moore’s law would predict that their power has grown by a factor of about 1.1 trillion—and it has. This mind-blowing increase in computation made possible the internet, personal computers, smartphones, and now the massive language models that power AI agents.
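As a quick check on that figure, here is a minimal Python sketch of the arithmetic. It assumes a clean two-year doubling period held for exactly 80 years, and the function name is made up for illustration.

```python
def growth_factor(years: float, doubling_years: float) -> float:
    """Total multiplicative growth after `years` of steady doubling."""
    doublings = years / doubling_years
    return 2.0 ** doublings

# 80 years at one doubling every 2 years is 40 doublings.
moore = growth_factor(80, 2)
print(f"{moore:,.0f}")  # 1,099,511,627,776
```

Forty doublings is 2 to the 40th power, which is where the "about 1.1 trillion" comes from.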

It’s fitting, then, that a similar exponential increase in power is occurring with agents themselves. In March, the AI safety organization Model Evaluation and Threat Research (METR) released a report showing that the length of tasks that agents could carry out, as measured by how long it would take a human to do them, was doubling every 7 months. At that time, the most powerful agent LLM in the world was Anthropic’s Claude 3.7 Sonnet, which could, with a 50% success rate, perform tasks that would take a human 54 minutes.

Now, 7 months later, the latest updates to METR’s report show agents’ power keeping pace almost perfectly. OpenAI’s new GPT-5 clocks in at 2 hours 17 minutes, more than twice Claude 3.7 Sonnet’s 54 minutes and in line with what the report predicted. Claude Sonnet 4.5 and xAI’s Grok 4 are not far behind.

Some simple math tells us that, at that rate, we get agents that can do tasks that would take humans a month in August 2030, a year in September 2032, and 50 years—a human working lifetime—in January 2036. Exponential growth works fast.
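If you want to run that math yourself, here is a rough Python sketch. It rests on a few assumptions that are mine rather than METR’s: the clock starts on this edition’s date with GPT-5’s 2 hours 17 minutes, the 7-month doubling holds indefinitely, and "a month" or "a year" of human work means continuous calendar time rather than 40-hour weeks. The function and constant names are hypothetical.

```python
import math
from datetime import date, timedelta

DOUBLING_MONTHS = 7               # doubling time reported by METR
BASE_MINUTES = 2 * 60 + 17        # GPT-5's task length at 50% success
BASE_DATE = date(2025, 10, 13)    # assumed starting point: this edition's date

DAYS_PER_YEAR = 365.25            # assumption: calendar time, not working hours
DAYS_PER_MONTH = DAYS_PER_YEAR / 12
MINUTES_PER_DAY = 24 * 60

def date_reached(target_minutes: float) -> date:
    """Date when the task length hits `target_minutes` if the trend holds."""
    doublings = math.log2(target_minutes / BASE_MINUTES)
    months = doublings * DOUBLING_MONTHS
    return BASE_DATE + timedelta(days=months * DAYS_PER_MONTH)

targets = {
    "one month": DAYS_PER_MONTH * MINUTES_PER_DAY,
    "one year": DAYS_PER_YEAR * MINUTES_PER_DAY,
    "50 years": 50 * DAYS_PER_YEAR * MINUTES_PER_DAY,
}

for label, minutes in targets.items():
    print(f"{label}: {date_reached(minutes):%B %Y}")
```

Under those assumptions the sketch lands on the same months quoted above. Counting only working hours instead would shrink each target and pull every date earlier, which is one reason to treat the exact dates loosely.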

So that’s it? AI agents will put us all out of a job in the next decade so we should all just start saving up for when we’re irrelevant?

Not exactly. Running these kinds of calculations can be fun, but there are a ton of caveats. First of all, this is the estimate for a 50% success rate, which is fine for some jobs but not most. I wouldn’t trust an AI doctor with that level of reliability to tell me whether or not I have cancer.

Second, the tasks METR evaluates agents on are entirely virtual, since so far, AI is much better at digital work than physical (though that’s starting to change). Agents can write your computer code, but not unclog your toilet.

Third, and most importantly, “laws” like Moore’s—or METR’s—are observations about the past, not the future. Just as Moore’s law is starting to slow, METR’s could break down at any time, or fail to generalize to much longer jobs. Doing an entire career’s worth of research in physics may be fundamentally harder than calculating how long it takes a lump of uranium-235 to decay, and not just because it takes longer.

Still, the fact that agent performance is doubling so quickly with no obvious end in sight is nothing to scoff at. Nobody who lived through the digital revolution would say it felt slow. One that runs roughly three and a half times faster (a doubling every 7 months instead of every 24), made of autonomous entities that can do human work, hints at a world coming very soon that will feel profoundly different from the one we live in today.

It may not be long before we look back on agents that could only do part of our jobs as quaint.

Always keep learning and building!

—Michael
