After a long week of coding, you might think that San Francisco builders would retreat to the mountains, beaches, or the Bay Area’s lively club scene. But in reality, as the week winds down, AI hackathons begin.
Over the past few years, San Francisco has exploded with AI hackathons. Every Saturday or Sunday, technologists give talks on the latest advances in AI, network, and, most importantly, turn ideas into working demos. Sometimes hackathons offer cash or cloud credits as prizes, but the real winners walk away with the beginnings of a startup.
“There is no better place in the world to realize the most ambitious project of your life than San Francisco,” says Agency co-founder Alex Reibman. “You often see a lot of competitions, like hackathons, but they’re not competitive. They’re as much collaboration as they are competition.”
At a hackathon in San Francisco last summer, Reibman decided to try his hand at building AI agents that could crawl the web. Agents are a hot topic in Silicon Valley as the AI boom reaches its peak. The term is not precisely defined, but it broadly describes AI bots that can perform tasks automatically using interfaces and services that weren’t originally designed for automation, taking over mundane tasks that once required human intervention.
But Reibman immediately ran into a problem. “They sucked,” Reibman said in an interview. “The agents failed 30 to 40 percent of the time, and often in unexpected ways.”
To fix this, Reibman’s team built internal debugging tools to see where their agents were going wrong. They eventually managed to get the agents to work a little better, but the debugging tools themselves ultimately stole the show and won the hackathon.
“I started showing the tools at a lot of hackathons and events in San Francisco, and people started asking for access to them,” Reibman said. “That was basically the confirmation I needed: instead of building an agent ourselves, we should build tools that make it easier to build agents.”
So Reibman founded Agency with co-founders Adam Silverman and Shawn Qiu, offering tools to observe what AI agents are actually doing and catch where they’re going wrong. A year later, those tools have become Agency’s core product, the AgentOps platform, which thousands of teams use each month, Reibman tells TechCrunch. The startup has already raised $2.6 million in pre-seed funding, led by 645 Ventures and Afore Capital.
COO Adam Silverman tells TechCrunch that AgentOps is like “multiple device management for agents,” analyzing all agent actions to ensure they don’t go down a rogue path.
“You want to understand whether your agent is going to act dishonestly and determine what limitations you can put in place,” Silverman said in an interview. “A lot of the work is being able to visually see where your guardrails are and whether the agent is abiding by them before you put them into production.”
The startup is partnering with Cohere and Mistral, AI model developers that also offer agent creation services, so customers can use the AgentOps dashboard to see how agents interact with the world and how much each one costs. Agency is model-agnostic, meaning it works with several different AI agent frameworks, and it integrates with popular tools like Microsoft’s AutoGen, crewAI, and AutoGPT.
In addition to the AgentOps dashboard, Agency also offers consulting services (Reibman previously worked at consulting firm EY) to help companies get started building agents. Agency wouldn’t disclose any clients by name, but said hedge funds, consultants, and marketing firms use its tools.
For example, Reibman says Agency helped create an AI agent that writes blog posts about the companies a client does business with. Now, that same client uses the AgentOps dashboard to track agent performance and costs.
Big players like OpenAI and Google are likely to ramp up their agent products in the coming months, and AI startups like Agency need to find a way to work with these advances, not against them.
“There are so many layers in the stack that it’s unlikely that an LLM vendor would try to cover all of them,” Reibman said. “OpenAI and Anthropic are building tools to create agents, but there are a lot of layers around them that make sure you have a production-ready code base.”