Despite all the hype about generative AI turning the world upside down, the technology has yet to significantly change white-collar work. Employees are using chatbots for tasks like writing emails, and companies are running countless experiments, but office work has not undergone a major AI reboot.
Perhaps this is just because we haven’t yet given chatbots like Google’s Gemini and OpenAI’s ChatGPT the right tools for the job; they are usually limited to ingesting and spitting out text via a chat interface. Things could get more interesting in a business context when AI companies start deploying so-called “AI agents” that can take actions by operating other software on a computer or over the internet.
Anthropic, an OpenAI competitor, announced a major new product today that tries to prove the thesis that tool use is needed for the next leap in the usefulness of artificial intelligence. The startup allows developers to direct its Claude chatbot to access external services and software in order to perform more useful tasks. For example, Claude can use a calculator to solve the kinds of math problems that vex large language models; be asked to access a database containing customer information; or be made to use other programs on the user’s computer when that would help.
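To make the calculator example concrete, here is a minimal sketch of the application side of tool use: the model emits a structured request, and the developer's code runs the matching function and hands the result back. The request shape `{"name": ..., "input": {...}}`, the `TOOLS` registry, and the `run_tool_call` helper are all assumptions made for illustration, not Anthropic's actual API.

```python
import ast
import operator

# Arithmetic operators the hypothetical calculator tool is allowed to apply.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str):
    """Evaluate a basic arithmetic expression (+, -, *, /) without eval()."""

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# Hypothetical registry mapping tool names to local functions.
TOOLS = {"calculator": calculator}


def run_tool_call(call: dict) -> dict:
    """Dispatch a model-issued request (assumed shape) to a registered tool."""
    tool = TOOLS.get(call["name"])
    if tool is None:
        return {"error": f"unknown tool: {call['name']}"}
    try:
        return {"result": tool(**call["input"])}
    except Exception as exc:
        # Report the failure back instead of crashing the conversation loop.
        return {"error": str(exc)}
```

In this sketch, a request such as `{"name": "calculator", "input": {"expression": "12345 * 6789"}}` would be answered by ordinary code rather than by the language model guessing at the digits, which is the point of giving a chatbot tools.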
I’ve written before about how important AI agents that can take action may be, both in the quest to make AI more useful and in the quest to create more intelligent machines. Claude’s tool use is a small step toward developing the more useful AI helpers that are now being released into the world.
Anthropic is working with several companies to help them build Claude-based helpers for their employees. Online tutoring company Study Fetch, for example, has developed a way for Claude to use various features of its platform to modify the user interface and curriculum content displayed to students.
Other companies are also entering the AI Stone Age. Earlier this month, Google demonstrated a handful of prototype AI agents at its I/O developer conference, along with a host of other new AI gadgets. One agent was designed to handle returns from online purchases by searching for the receipt in your Gmail account, filling out a returns form, and arranging pickup of the package.
Google has yet to launch its returns bot for general use, and other companies are also treading cautiously. This is likely in part because getting AI agents to behave appropriately is tricky. LLMs do not always correctly identify what is being asked of them and may make incorrect assumptions that break the chain of steps necessary to successfully complete the task.
Limiting early AI agents to a specific task or role in a company’s workflow can be a good way to make the technology useful. Just as physical robots are typically deployed in carefully controlled environments that minimize the risk of them breaking something, keeping AI agents on a short leash can reduce the risk of mishaps.
Even these early use cases can prove extremely profitable. Some large companies are already automating common office tasks using so-called robotic process automation (RPA). This often involves recording employee actions on a screen and dividing them into steps that can be repeated by software. AI agents built on the broad capabilities of LLMs could allow much more work to be automated. IDC, an analyst firm, says the RPA market is already worth about $29 billion, but expects the infusion of AI to more than double that to about $65 billion by 2027.
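The record-and-replay idea behind RPA can be sketched in a few lines. This is a simplified illustration, not how any particular RPA product works: the `Step` record, the `replay` function, and the handler names are all hypothetical, and real tools drive an actual user interface rather than returning strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One recorded user action (hypothetical format)."""
    action: str       # e.g. "click" or "type"
    target: str       # illustrative identifier for a UI element
    value: str = ""   # text to enter, if any


def replay(steps, handlers):
    """Run recorded steps in order, stopping at the first unsupported action."""
    log = []
    for i, step in enumerate(steps):
        handler = handlers.get(step.action)
        if handler is None:
            # A rigid script cannot improvise around a step it does not know.
            log.append(f"step {i}: no handler for {step.action!r}, stopping")
            break
        log.append(f"step {i}: {handler(step)}")
    return log
```

A script like this repeats exactly what was recorded and halts when it meets anything unfamiliar, which is the rigidity that agents built on LLMs are expected to loosen.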
