How did agent startup Casetext sell for $650 million?

A case study in building an AI agent company, realtime voice agents are set to transform B2B and B2C applications, and more

In partnership with Writer

Welcome back to Building AI Agents, your biweekly guide to everything new in the AI agent field!

With Building AI Agents’ readership growing rapidly, we’re excited to announce our first sponsor, Writer, a secure platform for integrating generative AI into your business processes. Their support enables us to continue bringing you the latest updates on the tremendously important field of AI agents.

In today’s issue…

  • How Casetext’s valuation grew 550% in under six months

  • OpenAI catches—or caches—up to Anthropic

  • a16z on how voice agents will impact businesses

  • Red-teaming LLM applications with an agentic workflow

…and more

🔍 SPOTLIGHT

Source: Casetext

Betting the entire company on AI agents took one startup from a $100 million valuation to a $650 million exit in less than six months.

In an interview on Y Combinator’s Lightcone podcast, Casetext CEO Jake Heller details the startup’s decade of sluggish growth prior to the launch of GPT-4, as well as how the company transformed its business overnight by putting all of its chips on a new AI agent platform. Casetext was founded in 2013 to automate tedious legal tasks such as document review and grew at a slow but steady pace, achieving a $100 million valuation by 2023. Nevertheless, the company had not achieved true product-market fit, because the natural language processing tools of the time were insufficient for its ambitious mission. When GPT-3.5 was released, Casetext reviewed its capabilities and found them lacking, crippled by hallucinations and reasoning errors.

However, the arrival of GPT-4 in March 2023 changed everything. Heller, given access to a pre-release demo, immediately recognized it as a quantum leap over 3.5, coded up a prototype of a new legal agent himself, and insisted that the company shift the efforts of all 120 of its employees to developing it. The resulting product, CoCounsel, assists lawyers by breaking down legal tasks into micro steps, each with its own carefully crafted prompt. Despite the difficulties of working around LLMs’ known unreliability and the high stakes of failure given the domain, an extremely rigorous process of test-driven development allowed the company to create a product dependable enough for lawyers to use.

Just two months after GPT-4’s release, Casetext began negotiations with Thomson Reuters for an acquisition, which was consummated in August 2023 at a valuation of $650 million, a 6.5-fold increase in less than half a year. Casetext represents the most successful example (so far) of a new type of startup: an agent company devoted to automating a specific vertical. Recent months have seen a surge of launches of such startups, with dozens in the latest Y Combinator batch alone. Together, they constitute the first wave of a deluge that promises to profoundly upend virtually every sector of the economy.
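For readers who want to see what "micro steps, each with its own carefully crafted prompt" plus test-driven development might look like in practice, here is a minimal Python sketch. It is not Casetext's code: the `Step` class, the `run_pipeline` function, the two prompts, and the stand-in model are all hypothetical names invented for illustration. The idea is simply that each step's output feeds the next step's prompt, and that the chain can be tested offline by swapping in a deterministic fake model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One micro step: a name plus a prompt template containing "{input}"."""
    name: str
    prompt_template: str


def run_pipeline(steps: List[Step], call_llm: Callable[[str], str], task_input: str) -> str:
    """Run the steps in order, feeding each step's output into the next prompt."""
    result = task_input
    for step in steps:
        prompt = step.prompt_template.format(input=result)
        result = call_llm(prompt)
    return result


# Hypothetical steps; a real product would have many more, each tuned separately.
STEPS = [
    Step("extract_issues", "List the legal issues raised in this document:\n{input}"),
    Step("summarize", "Summarize these issues for a supervising attorney:\n{input}"),
]


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model call, so tests run offline."""
    return f"[model output for a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    print(run_pipeline(STEPS, fake_llm, "Sample contract text"))
```

Because the model is passed in as a plain function, each step can be checked in isolation against known inputs before a real model is wired in.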

If you find Building AI Agents valuable, forward this email to a friend or colleague!

Writer RAG tool: build production-ready RAG apps in minutes

RAG in just a few lines of code? We’ve launched a predefined RAG tool on our developer platform, making it easy to bring your data into a Knowledge Graph and interact with it using AI. With a single API call, Writer LLMs will intelligently call the RAG tool to chat with your data.

Integrated into Writer’s full-stack platform, it eliminates the need for complex vendor RAG setups, making it quick to build scalable, highly accurate AI workflows just by passing a graph ID of your data as a parameter to your RAG tool.

📰 NEWS

Source: Wikipedia

OpenAI’s API now performs prompt caching by default, so users pay less for input tokens the API has already processed. Prompt caching, which OpenAI’s rival Anthropic released in August, can yield significant cost and latency savings on agentic workflows, which often resend the same tokens repeatedly.
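To get a feel for why caching matters for agents, here is a rough back-of-the-envelope sketch in Python. Every number in it is an assumption chosen for illustration, not a quoted rate: the price per million tokens, the cached-token discount, and the shape of the agent loop are all hypothetical, and `input_cost` is a helper written for this example. It also simplifies by treating the shared prefix as a fixed size, whereas a real agent's context usually grows with each step.

```python
def input_cost(total_input_tokens: int, cached_tokens: int,
               price_per_million: float, cached_discount: float) -> float:
    """Estimate input-token cost in dollars when some tokens are billed at a discount."""
    uncached_tokens = total_input_tokens - cached_tokens
    cached_price = price_per_million * (1 - cached_discount)
    return (uncached_tokens * price_per_million
            + cached_tokens * cached_price) / 1_000_000


# Hypothetical agent loop: 10 calls, each resending a 4,000-token shared prefix
# (system prompt, tool definitions) plus 500 new tokens.
CALLS = 10
PREFIX_TOKENS = 4_000
NEW_TOKENS_PER_CALL = 500
PRICE_PER_MILLION = 2.50   # assumed price in dollars, for illustration only
CACHED_DISCOUNT = 0.5      # assumed discount on cached tokens

total_tokens = CALLS * (PREFIX_TOKENS + NEW_TOKENS_PER_CALL)
# The first call cannot hit the cache; the remaining calls reuse the prefix.
cached_tokens = (CALLS - 1) * PREFIX_TOKENS

without_cache = input_cost(total_tokens, 0, PRICE_PER_MILLION, CACHED_DISCOUNT)
with_cache = input_cost(total_tokens, cached_tokens, PRICE_PER_MILLION, CACHED_DISCOUNT)

if __name__ == "__main__":
    print(f"Without caching: ${without_cache:.4f}")
    print(f"With caching:    ${with_cache:.4f}")
```

Under these assumptions, most of the input in a multi-step run is the repeated prefix, so the discount applies to the bulk of the bill.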

AI agent SDK provider Lyzr released its low-code Agent API Studio at PyCon 2024, allowing users to easily build and deploy agents.

🛠️ USEFUL STUFF

Source: Wikimedia Commons

A tutorial by LangChain demonstrating how to use OpenAI’s new Realtime API and the LangChain framework to build a ReAct voice agent for realtime conversation.

AgentTorch is a new framework by MIT Media Lab to simulate the dynamics of human populations, such as COVID isolation behavior, by using millions of LLM agents.

A framework by Google DeepMind similar to AgentTorch but intended for small-scale, more detailed social simulations.

💡 ANALYSIS

Source: Andreessen Horowitz

A comprehensive overview of the emerging AI voice agents space, split into business-to-business (B2B) and business-to-consumer (B2C) categories, with a market map and unique thesis for each.

A report by Georgetown’s Center for Security and Emerging Technology headlined by former OpenAI board member Helen Toner detailing the potential challenges of a new world full of AI agents, as well as some potential interventions.

In his appearance on the All-In podcast, Mark Cuban theorized that AI agents will become a feature rather than a product, as they will soon be able to instantiate their own agents to address customer challenges.

🧪 RESEARCH

Source: arXiv

GOAT is an agentic system designed to automate red-teaming of LLM applications by reasoning through the creation of effective adversarial prompts to hack the target application.

The authors of this paper introduce AgentPrune, which enhances multi-agent systems’ cost effectiveness and safety by removing redundant or potentially malicious information from messages between the agents, and can integrate into existing multi-agent systems.

Thanks for reading! Until next time, keep learning and building!

If you have any specific feedback, just reply to this email. We’d love to hear from you!

Follow us on X (Twitter), LinkedIn, and Instagram