Anthropic announced two new models, Claude 4 Opus and Claude Sonnet 4, during its first developer conference in San Francisco on Thursday. Claude 4 Opus will be available immediately to Claude subscribers, and Claude Sonnet 4 will be available to both free and paid users.
The new models, which jump the naming convention from 3.7 straight to 4, have a number of strengths, including their ability to reason, plan, and remember the context of conversations over long periods, the company says. Claude 4 Opus is also even better at playing Pokémon than its predecessor.
“It was able to work agentically on Pokémon for 24 hours,” says Anthropic’s chief product officer, Mike Krieger, in an interview with Wired. Previously, the longest the model could play was just 45 minutes, a company spokesperson added.
A few months ago, Anthropic launched a Twitch stream called “Claude Plays Pokémon,” which showcases Claude 3.7 Sonnet’s abilities at Pokémon Red live. The demo is meant to show how Claude is able to analyze the game and make decisions step by step, with minimal direction.
The lead on the Pokémon research is David Hershey, a member of the technical staff at Anthropic. In an interview with Wired, Hershey says he chose Pokémon Red because it is a “simple playground,” meaning the game is turn-based and does not require real-time reactions, which Anthropic’s current models struggle with. It was also the first video game he ever played, on the original Game Boy, after getting it for Christmas in 1997. “It has a pretty special place in my heart,” says Hershey.
Hershey’s primary goal in this research was to examine how Claude can be used as an agent, working independently to perform complex tasks on behalf of a user. Although it is unclear how much knowledge of Pokémon Claude has from its training data, its system prompt is minimal by design: “You are Claude, you [...] on the screen.”
“Over time, I have been going through and removing all the Pokémon-specific things I can, just because I think it is really interesting how much the model can come up with on its own,” says Hershey, adding that he hopes to build a game the model has never seen before to really test its limits.
When Claude 3.7 Sonnet played the game, it ran into several challenges: it spent “dozens of hours” stuck in one city and had trouble identifying non-player characters, which drastically stalled its progress in the game. With Claude 4 Opus, Hershey noticed an improvement in Claude’s long-term memory and planning when he watched it work through a complex Pokémon task. After realizing that it needed a certain power to move forward, the model spent two days improving its skills before it continued the game.
“This is one of my favorite ways to get to know a model. This is how I understand what its strengths are, what its weaknesses are,” says Hershey. “It’s my way of just coming to grips with this new model that we are about to release, and how to work with it.”
Everyone wants an agent
Anthropic’s Pokémon research is a novel approach to an existing problem: how do we understand what decisions an AI is making when it approaches complex tasks, and how do we nudge it in the right direction?
The answer to this question is integral to advancing much-hyped AI agents, AI that can solve complex tasks with relative independence. In Pokémon, it is important that the model keeps track of the task at hand. The same holds for agents asked to automate work, even work that takes hundreds of hours.