Building AI Agents

Is Strawberry OpenAI's new super-agent?

Reports of a secret breakthrough by the company, cloud providers add new agent capabilities, and more

July 15, 2024

🔍 Spotlight

OpenAI is developing an advanced agentic AI system code-named Strawberry, according to a new article by Reuters. An internal source and documents seen by the news agency indicate that the project is an evolution of the mysterious Q* (pronounced “Q-star”) reported last year, and may include the ability to directly control users’ computers to accomplish agentic tasks.

In November of last year, during the fallout from Sam Altman’s brief firing from OpenAI by its board, a set of reports by The Information and Reuters hinted that his ouster may have been due to concerns about the dangers of a new AI system developed internally called Q*. The board, it was theorized, had fired Altman out of fear that the technology was powerful enough to pose a danger to humanity and should not be in his control. This unleashed a flurry of guesswork as to the nature of Q*, with some hypothesizing that it was a fusion of the reinforcement learning method Q-learning and the graph traversal algorithm A* search. While more sober heads downplayed the idea of Q* as a potential species-destroyer, pointing out that it was thought at the time to be capable only of solving grade-school math problems, it remained the subject of fevered speculation.

Now, an exclusive report published by Reuters on Friday has reignited discussion over Q*, and provided further hints as to its identity. According to the article, Q*, now renamed Strawberry, is similar to a method published by Stanford researchers in 2022 called Self-Taught Reasoner, or STaR, which allows an LLM to iteratively self-improve by generating synthetic training data. It reportedly provides the backing for a “computer-using agent” which can browse and interact with the web—seemingly an OpenAI synonym for computer control agents, an area of intense research in the field of autonomous agents.
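
For readers who have not met STaR, the loop it describes can be sketched in a few lines of Python. This is a toy under loud assumptions: ToyModel, its skill parameter, and its finetune rule are hypothetical stand-ins for a real LLM and a real training step, and nothing here reflects how Strawberry itself works. The sketch also omits the rationalization step from the STaR paper, in which the model is shown the correct answer for problems it failed.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for an LLM: solves addition with probability `skill`."""
    skill: float = 0.3
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def generate(self, question):
        # Produce a (rationale, answer) pair; the answer is wrong when the draw fails.
        a, b = question
        correct = self.rng.random() < self.skill
        answer = a + b if correct else a + b + 1
        rationale = f"{a} plus {b} is {answer}"
        return rationale, answer

    def finetune(self, examples):
        # Assumed update rule: each kept example nudges skill upward, capped at 1.0.
        self.skill = min(1.0, self.skill + 0.01 * len(examples))


def star_round(model, problems):
    """One STaR-style round: sample rationales, keep the correct ones, train on them."""
    kept = []
    for question, gold in problems:
        rationale, answer = model.generate(question)
        if answer == gold:  # filter: only self-generated data that reaches the right answer
            kept.append((question, rationale, answer))
    model.finetune(kept)
    return kept


def run_star(model, problems, rounds=3):
    """Repeat the round and record (examples kept, skill after training) each time."""
    history = []
    for _ in range(rounds):
        kept = star_round(model, problems)
        history.append((len(kept), model.skill))
    return history
```

The point of the filter is that the model's own output becomes training data only when it can be checked against a known answer, which is what lets the loop improve the model without human-written rationales.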

If Strawberry lives up to its promise, it could represent a quantum leap in agent capabilities. OpenAI has been signaling in private pitches to outside companies that it will soon be releasing technology with significantly improved reasoning abilities, per the article, potentially presaging a major advance over current LLMs, which can produce fluid text but struggle with long-term reasoning and planning. Developers of agentic systems may soon find a powerful new tool in their arsenal, provided, of course, that it falls short of the wildest rumors and human life on earth remains safe.

📰 News

Major cloud providers double down on agents

Amazon Web Services (AWS) announced upgrades to the memory and code interpretation capabilities of its Bedrock agents this week, bringing them into line with similar offerings from Microsoft’s Azure and Google Cloud. The new features are the latest move in the three cloud providers’ race to offer agentic services to customers.

An agent hackathon by LangChain and others

LangChain is teaming up with Fireworks AI and Factory AI to host an agent hackathon on August 11 in San Francisco, featuring a $3,000 cash prize pool and over $6,000 in credits for Fireworks and LangSmith.

Enso launches with $6 million to build agents for smaller businesses

AI agents have incredible potential to enhance business workflows, but their use has largely been restricted to large enterprises. The startup Enso, which aims to build preprogrammed agents for small and medium-sized businesses (SMBs), emerged from stealth this week with $6 million in seed funding.

Marc Andreessen sends $50,000 to an agentic bot

An X account called Truth Terminal, reportedly run by an AI agent, convinced Silicon Valley founder and venture capitalist Marc Andreessen to send it $50,000 in Bitcoin to fund its activities, which seem to consist largely of trolling. Though the joke donation was trivial for Andreessen, who is worth around $1.7 billion, it represents an early example of agents engaging in persuasion with tangible financial consequences.

🧪 Research

WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks

Possibly the most economically significant use case for AI agents is the automation of monotonous business processes for knowledge workers. WorkArena++ is a newly proposed benchmark of 682 knowledge economy tasks for evaluating AI agents, which considerably expands upon the 33 tasks in the original WorkArena evaluation.

Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

AI agents capable of browsing and interacting with the internet autonomously are the subject of significant interest, but they are hamstrung by the need to work with systems built from the ground up for humans. The authors propose Internet of Agents (IoA), a framework modeled on the internet that provides protocols for agents to interact and collaborate with each other.

AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents

The ability to recall prior experiences and information in order to inform future actions, as humans do, is a critical capability for effective AI agents. This article puts forward AriGraph, a knowledge graph designed to allow agents to learn from their past experiences, and Ariadne, an agent using AriGraph which outperforms those based on traditional memory approaches such as RAG.

AgentInstruct: Toward Generative Teaching with Agentic Flows

Although synthetic data has the promise of overcoming the shortage of high-quality training data for LLMs, there are concerns about its quality and originality. The authors of this paper introduce AgentInstruct, an agentic method of converting raw data into high-quality training sets for fine-tuning LLMs, and use it to post-train a model with substantially higher performance than its base.

LLM-Based Open-Domain Integrated Task and Knowledge Assistants with Programmable Policies

KITA is a novel framework for creating agents which can decompose a user query into specific tasks using a stateful worksheet, then execute them in accordance with user-defined instructions in order to answer the query.

🛠️ Useful stuff

A tutorial on agentic RAG using LangChain

Agentic RAG has been the subject of a great deal of excitement in the past few months, with its promise of enhancing vanilla RAG with agentic reasoning capabilities. The linked tutorial is a cookbook for building an agentic RAG system using HuggingFace Transformers and LangChain.
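
To make the difference from vanilla RAG concrete, here is a minimal sketch in plain Python. It is not the cookbook's code and uses no LangChain or Transformers calls; the retrieve function is a toy keyword matcher, and toy_policy is a hypothetical stand-in for the LLM that, in a real system, would decide at each step whether to search again or to answer.

```python
import re


def tokens(text):
    """Lowercased word set, with punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, corpus, k=2):
    """Toy retriever: rank documents by words shared with the query, drop non-matches."""
    q = tokens(query)
    ranked = sorted(corpus, key=lambda doc: len(q & tokens(doc)), reverse=True)
    return [doc for doc in ranked[:k] if q & tokens(doc)]


def agentic_rag(question, corpus, policy, max_steps=3):
    """Let the policy choose each step: ("search", query) or ("answer", text)."""
    notes = []
    for _ in range(max_steps):
        action, arg = policy(question, notes)
        if action == "answer":
            return arg, notes
        # Retrieval is a tool the agent calls, possibly several times with new queries.
        notes.extend(doc for doc in retrieve(arg, corpus) if doc not in notes)
    return "I could not find an answer.", notes


def toy_policy(question, notes):
    """Assumed decision rule: answer once anything has been retrieved, else search."""
    if notes:
        return ("answer", notes[0])
    return ("search", question)
```

Vanilla RAG retrieves once and then generates; the loop above is what makes the pipeline agentic, since the policy can inspect what came back and issue a different query before committing to an answer.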

A low-code agent framework in LlamaIndex

Lyzr is the latest entrant in the increasingly crowded low-code agent framework space. Built on top of LlamaIndex, it aims to provide pre-built autonomous agents for business workflow automation, as well as allow users to construct their own.

💡 Analysis

What is an AI agent, and what is its future?

A handful of takes on the hot question “what exactly is an AI agent?” from several players in the space, along with their thoughts on the direction of the field and what will be needed to realize agents’ full potential.