Despite all the hype about generative artificial intelligence turning the world upside down, the technology has yet to significantly transform white-collar work. Workers are toying with chatbots for tasks like composing emails, and companies are launching countless experiments, but office work hasn’t seen a major AI reboot.
Maybe that’s just because we haven’t given chatbots like Google’s Gemini and OpenAI’s ChatGPT the right tools yet; they are generally restricted to receiving and spitting out text through a chat interface. Things may get more interesting in business environments as AI companies begin to deploy so-called “AI agents,” which can act by using other software on a computer or over the internet.
Anthropic, an OpenAI competitor, today announced a major new product that attempts to prove the thesis that tool use is necessary for the next leap in AI utility. The company now lets developers direct its Claude chatbot to access external services and software to perform more useful tasks. Claude can, for example, use a calculator to solve the kinds of mathematical problems that trouble large language models; be required to access a database containing customer information; or be made to use other programs on a user’s computer when that would help.
I’ve written before about how important AI agents that can take action may be, both in terms of the drive to make AI more useful and the quest to create smarter machines. Claude’s tool use is a small step toward the goal of developing the more useful AI helpers that are now being released into the world.
Anthropic has been working with several companies to help them build Claude-based assistants for their workers. Online tutoring company Study Fetch, for example, has developed a way for Claude to use different features of its platform to modify the user interface and curriculum content displayed to the student.
Other companies are also entering the stone age of AI. Google demonstrated a handful of prototype AI agents at its I/O developer conference earlier this month, among many other new AI doodads. One of the agents was designed to handle returns for online purchases, looking up the receipt in a person’s Gmail account, filling out the return form and scheduling package pickup.
Google has yet to release its return bot for mass use, and other companies are also moving cautiously. This is probably partly because it’s tricky to get AI agents to behave. LLMs do not always correctly identify what they are being asked to achieve and may make incorrect assumptions that break the chain of steps required to successfully complete a task.