Anthropic’s new model excels at reasoning and planning, and has the Pokémon skills to prove it

Anthropic announced two new models, Claude 4 Opus and Claude Sonnet 4, during its first developer conference in San Francisco on Thursday. The pair will be immediately available to paying Claude subscribers.

The new models, which jump the naming convention from 3.7 straight to 4, have a number of strengths, including their ability to reason, plan, and remember the context of conversations over extended periods of time, according to the company. Claude 4 Opus is also even better at playing Pokémon than its predecessor.

“It was able to work on Pokémon for 24 hours,” says Anthropic’s chief product officer, Mike Krieger, in an interview with Wired. Previously, the longest the model could play was just 45 minutes, a company spokesperson added.

A few months ago, Anthropic launched a Twitch stream called “Claude Plays Pokémon,” which shows Claude 3.7 Sonnet playing Pokémon Red live. The demo is meant to show how Claude can analyze the game and make decisions step by step, with minimal direction.

The lead on the Pokémon research is David Hershey, a member of Anthropic’s technical staff. In an interview with Wired, Hershey says he chose Pokémon Red because it is “a simple playground,” meaning the game is turn-based and does not require real-time reactions, which Anthropic’s current models struggle with. It was also the first video game he ever played, on the original Game Boy, after getting it for Christmas in 1997. “It has a very special place in my heart,” says Hershey.

Hershey’s overarching goal with this research was to study how Claude could be used as an agent, working independently to do complex tasks on behalf of a user. While it is unclear what prior knowledge Claude has about Pokémon from its training data, its system prompt is minimal by design: You are Claude, you are playing Pokémon, here are the tools you have, and you can press buttons on the screen.

“Over time, I have been going through and deleting all of the Pokémon-specific stuff I can, just because I think it’s really interesting to see how much the model can figure out on its own,” says Hershey, who adds that he hopes to build a game that Claude has never seen before to truly test its limits.

When Claude 3.7 Sonnet played the game, it ran into some challenges: it spent “dozens of hours” stuck in one city and had trouble identifying non-player characters, which drastically stunted its progress through the game. With Claude 4 Opus, Hershey noticed an improvement in Claude’s long-term memory and skills when he watched it navigate a complex Pokémon quest. After realizing that it needed a certain power to move forward, the AI spent two days improving its skills before it continued playing. Hershey believes that this kind of multistep reasoning, with no immediate feedback, shows a new level of coherence, meaning the model has a better ability to stay on track.

“This is one of my favorite ways to get to know a model. It’s how I understand what its strengths are, what its weaknesses are,” says Hershey. “It’s my way of coming to grips with this new model that we are about to release, and how to work with it.”

Everyone wants an agent

Anthropic’s Pokémon research is a novel approach to a preexisting problem: How do we understand what decisions an AI is making when it approaches complex tasks, and nudge it in the right direction?

The answer to that question is integral to advancing the industry’s much-touted AI agents, which can tackle complex tasks with relative independence. In Pokémon, it is important that the model does not lose context or “forget” the task at hand. That also applies to AI agents asked to automate a workflow, even one that takes hundreds of hours.
