It has been just over a week since DeepSeek shook the AI world. The debut of its open-weight model, reportedly trained on a fraction of the specialized computing chips that power industry leaders, set off shock waves inside OpenAI. Not only did employees claim to see hints that DeepSeek had "improperly distilled" OpenAI's models, but the startup's success had Wall Street questioning whether companies such as OpenAI are overspending on compute.
"DeepSeek R1 is AI's Sputnik moment," wrote Marc Andreessen, one of Silicon Valley's most influential and provocative investors, on X.
In response, OpenAI is preparing to launch a new model ahead of its originally planned schedule. The model, o3-mini, will debut in both the API and chat. Sources say it has o1-level reasoning with 4o-level speed. In other words, it is fast, cheap, smart, and designed to crush DeepSeek.
The moment has galvanized OpenAI staff. Inside the company, there is a feeling that, particularly as DeepSeek dominates the conversation, OpenAI must become more efficient or risk falling behind its newest competitor.
Part of the problem stems from OpenAI's origins as a nonprofit research organization before it became a profit-seeking power. Employees say an ongoing power struggle between the research and product groups has produced a rift between the teams working on advanced reasoning and those working on chat. (OpenAI spokesperson Niko Felix says this is "inaccurate" and notes that the leaders of these teams, chief product officer Kevin Weil and chief research officer Mark Chen, "meet every week and work closely to align on product and research priorities.")
Some inside OpenAI want the company to build a unified chat product, one model that can determine on its own whether a question requires advanced reasoning. So far, that hasn't happened. Instead, a drop-down menu in ChatGPT prompts users to decide whether they want to use GPT-4o ("great for most questions") or o1 ("uses advanced reasoning").
Some employees say that although chat brings in the lion's share of OpenAI's revenue, o1 gets more attention, and more computing resources, from leadership. "Leadership doesn't care about chat," says a former employee who worked on (you guessed it) chat. "Everyone wants to work on o1 because it's sexy, but the code base wasn't built for experimentation, so there's no momentum." The former employee asked to remain anonymous, citing a nondisclosure agreement.
OpenAI spent years experimenting with reinforcement learning to fine-tune the model that eventually became the advanced reasoning system called o1. (Reinforcement learning is a process that trains AI models with a system of penalties and rewards.) DeepSeek built on the reinforcement learning work that OpenAI had pioneered in order to create its own advanced reasoning system, called R1. "They benefited from knowing that reinforcement learning, applied to language models, works," says a former OpenAI researcher who is not authorized to speak publicly about the company.
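To make the parenthetical concrete: the reward-and-penalty idea can be shown with a toy policy-gradient loop on a two-armed bandit. This is a generic textbook sketch of reinforcement learning, not OpenAI's or DeepSeek's actual training setup; the payoff probabilities and learning rate below are arbitrary illustrative choices.

```python
import math
import random

def softmax(prefs):
    """Turn raw preference scores into action probabilities."""
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=5000, lr=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0, 0.0]    # learnable preference for each of two actions
    payoff = [0.2, 0.8]   # chance each action earns reward +1 (else -1)
    for _ in range(steps):
        probs = softmax(prefs)
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < payoff[action] else -1.0
        # REINFORCE-style update: nudge preferences toward actions
        # that were rewarded and away from actions that were penalized.
        for i in range(2):
            grad = (1.0 if i == action else 0.0) - probs[i]
            prefs[i] += lr * reward * grad
    return softmax(prefs)

probs = train()
print(probs)  # the policy comes to strongly prefer the better action
```

Language-model RL fine-tuning replaces the two-armed bandit with text generations and the fixed payoffs with a learned or rule-based reward signal, but the reward-driven update loop is the same basic mechanism.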
"The reinforcement learning [DeepSeek] did is similar to what we did at OpenAI," says another former OpenAI researcher, "but they did it with better data and a cleaner stack."
OpenAI employees say the research that went into o1 was done in a code base, called the "berry" stack, built for speed. "There were trade-offs: experimental rigor for throughput," says a former employee with direct knowledge of the situation.
Those trade-offs made sense for o1, which was essentially an enormous experiment, code-base limitations notwithstanding. They made less sense for chat, a product used by millions of people that was built on a different, more reliable stack. When o1 launched and became a product, cracks began to appear in OpenAI's internal processes. "It was like, 'Why are we doing this in the experimental code base? Shouldn't we do this in the main product code base?'" the employee explains. "There was major pushback on that internally."