OpenAI to release "Ph.D.-level super-agents"
Plus: 5 agents thousands of customers need, how agentic AI is transforming SaaS, and more

Welcome back to Building AI Agents, your biweekly guide to everything new in the AI agent field!
I put together a comedian agent
that tells jokes to a room full of 1,000 other agents
he's been going on for 9 hours straight
they're all having a blast, kind of jealous ngl
— Sam Woods (@samuelwoods_)
8:00 PM • Jan 14, 2025
Someone get this guy a Netflix special ASAP. He might become the second AI agent millionaire, but only if he has a good…agent.
In today’s issue…
Is a major agent release coming?
5 agents thousands of customers need
Service as software: how agentic AI is transforming SaaS
Microsoft, Salesforce, and NVIDIA CEOs on agents
A deep-dive on agentic RAG
…and more
🔍 SPOTLIGHT

Source: Flickr
In November, inside sources hinted at the arrival of a powerful new agent to be released by OpenAI in January. Now, that agent may be almost here—and its capabilities could be considerable.
A new report by Axios suggests that a major AI lab is preparing to release what the report dubs a “Ph.D.-level super-agent”, with intelligence significantly exceeding that of existing models, in the next few weeks. While the identity of the company is not certain, with the piece only citing rumors among top AI researchers, it suggests that it is likely OpenAI. The timeframe given for the release would fit with an article published in November by Bloomberg, which claimed OpenAI was preparing to release a new computer-controlling agent called Operator in January. According to Axios, OpenAI CEO Sam Altman has scheduled a closed-door briefing with U.S. government officials on January 30.
The internet rumor mill was quick to speculate that Altman’s meeting indicated that the coming announcement had national importance, but the timing could be more coincidental—Altman is in Washington, D.C. for Donald Trump’s inauguration, and a meeting with top government figures would be entirely in keeping with a tech CEO seeking to build a relationship with the new administration.
Nevertheless, there are other reasons to believe the announcement could be momentous. OpenAI recently announced its powerful new o3 reasoning model, which achieved impressive results across a range of benchmarks, particularly on ARC-AGI, a challenging measure of general intelligence previously intractable to AI. According to the Axios article, OpenAI staff have been “both jazzed and spooked by recent progress”, and national security adviser Jake Sullivan, privy to many inside discussions about AI capabilities, warned of the “catastrophic” risks upcoming models could create if not managed carefully. That said, Sullivan also described himself as “not an AI doomer”, suggesting that his concerns were closer to job displacement than existential risk.
The picture that emerges from insiders’ impressions of coming AI capabilities is one of radical disruption of the existing economy, as systems capable of operating at Ph.D. level rapidly automate some jobs, while creating others, and leaving behind those who fail to adapt quickly. It remains to be seen to what extent the hype pans out, but agent builders and business leaders should monitor these developments carefully. The risks—and opportunities—are too great to ignore.
If you find Building AI Agents valuable, forward this email to a friend or colleague!
🤝 WITH GAMMA
The future of presentations, powered by AI
Gamma is a modern alternative to slides, powered by AI. Create beautiful and engaging presentations in minutes. Try it free today.
📰 NEWS

Source: NVIDIA
The new NVIDIA inference microservices (NIMs) aim to protect enterprise agents against generating “harmful” outputs, wandering off topic, or falling victim to jailbreak attacks, three common failure modes for AI agents.
YC-backed startup Skyvern achieved state-of-the-art results on the WebVoyager benchmark for web browser agents, dethroning Google’s Mariner.
🛠️ USEFUL STUFF

Source: Created by the author using Dall-E 3
This GitHub repo by OpenAI provides an example of an agentic app built on top of the company’s Realtime API, enabling live voice conversations between the agent and a user.
AgentOps CEO Alex Reibman provides a list of the most in-demand agentic apps, identified by a survey the startup conducted of thousands of customers.
Dria-Agent-α is a new large language model purpose-built to act as the core of an agent which acts by writing custom Python code, rather than simply calling pre-specified functions, vastly increasing its action space.
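To make the distinction in the Dria-Agent-α item concrete, here is a minimal sketch contrasting the two action styles. Everything in it is a hypothetical illustration: the tools `get_price` and `add`, the helpers `run_function_call` and `run_code_action`, and the hard-coded "model output" string are assumptions, not part of Dria-Agent-α or any real library. It only shows why letting the model write code widens the action space: one code action can loop over and combine tools, while a function call invokes exactly one.

```python
# Hypothetical tools an agent might be given (illustrative only).
def get_price(item: str) -> float:
    prices = {"apple": 0.5, "pear": 0.75, "kiwi": 1.25}
    return prices[item]


def add(a: float, b: float) -> float:
    return a + b


TOOLS = {"get_price": get_price, "add": add}


def run_function_call(call: dict):
    """Function-calling style: the model picks one pre-specified tool per step."""
    return TOOLS[call["name"]](**call["args"])


def run_code_action(source: str):
    """Code-as-action style: run model-written Python that may compose tools.

    Assumption: the snippet stores its answer in a variable named `result`.
    Trimming builtins like this is NOT a security boundary; a real system
    needs proper sandboxing.
    """
    namespace = {"__builtins__": {"sum": sum}, **TOOLS}
    exec(source, namespace)
    return namespace.get("result")


# One tool, one call.
single = run_function_call({"name": "get_price", "args": {"item": "apple"}})

# One action that loops over several items and aggregates the results.
code = "result = sum(get_price(i) for i in ['apple', 'pear', 'kiwi'])"
total = run_code_action(code)

print(single, total)
```

With function calling, pricing three items would take three separate steps plus an aggregation step; the code action does it in one.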
💡 ANALYSIS

Source: The New Stack
This piece describes how agents are turning the traditional SaaS business and pricing model on its head: instead of companies paying for software that allows their employees to achieve outcomes, the outcomes themselves are the service provided by agents.
Major tech CEOs (Microsoft’s Satya Nadella, Salesforce’s Marc Benioff, and NVIDIA’s Jensen Huang) have all weighed in on the AI agent boom, offering various predictions on the future of the technology and the ways in which their respective companies will shape it.
The author of this article speaks with companies that have already achieved considerable cost and time savings by implementing agentic AI.
An exposition on the implications of Jensen Huang’s “100 million [agents] in every group” claim about the future of enterprise AI, exploring what the world could look like when agents are a ubiquitous part of every business.
🧪 RESEARCH

Source: arXiv
Agent-powered retrieval-augmented generation (RAG) is displacing its “naïve” predecessor due to its superior ability to identify relevant documents. This paper gives an overview of both methodologies, as well as a deep dive into the various emerging subtypes and their advantages and disadvantages.
The authors of this paper review efforts to endow LLM agents with “lifelong learning”—the ability to continuously update their knowledge and behavior based on their experiences, a crucial feature of artificial general intelligence.
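As a rough illustration of the naive versus agentic RAG contrast in the first paper above, the toy sketch below retrieves once in the naive case and, in the agentic case, checks the result and retries with a reformulated query. It is a simplification under stated assumptions: the corpus `DOCS`, the word-overlap `score`, the `min_score` threshold, and the hard-coded `rewrites` list (standing in for an LLM that rewrites queries) are all hypothetical and not taken from the paper.

```python
# Tiny hypothetical corpus (illustrative only).
DOCS = {
    "d1": "agents call tools to browse the web",
    "d2": "retrieval augmented generation fetches documents for a language model",
    "d3": "lifelong learning lets agents update knowledge over time",
}


def score(query: str, text: str) -> int:
    """Crude relevance: number of distinct words shared by query and document."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str):
    """Return the best-matching document id and its score."""
    best = max(DOCS, key=lambda d: score(query, DOCS[d]))
    return best, score(query, DOCS[best])


def naive_rag(query: str) -> str:
    """Naive RAG: retrieve once and use whatever comes back."""
    return retrieve(query)[0]


def agentic_rag(query: str, rewrites: list, min_score: int = 2) -> str:
    """Agentic RAG (simplified): judge the retrieval and retry with rewrites.

    Assumption: `rewrites` stands in for query reformulations that an LLM
    would generate on the fly.
    """
    doc = None
    for candidate in [query, *rewrites]:
        doc, relevance = retrieve(candidate)
        if relevance >= min_score:
            return doc
    return doc  # fall back to the last attempt


question = "what is RAG"
naive_doc = naive_rag(question)
agentic_doc = agentic_rag(question, ["retrieval augmented generation for a model"])
print(naive_doc, agentic_doc)
```

Because the original question shares no words with any document, the single-shot retrieval falls back to an arbitrary document, while the retry loop lands on the RAG document once the query is rewritten.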
Thanks for reading! Until next time, keep learning and building!
If you have any specific feedback, just reply to this email. We’d love to hear from you.
Follow us on X (Twitter), LinkedIn, and Instagram