It used to be fairly easy to tell human-made and AI-generated images apart. Just two years ago, you couldn’t use an image model to create a Mexican restaurant menu without it inventing new culinary delights like “enchuita,” “churiros,” “burrto,” and “margartas.”
Now, when I ask the new ChatGPT Images 2.0 model for a menu of Mexican dishes, it produces something that could be used in a restaurant right away without customers noticing anything wrong. (Though ceviche priced at $13.50 might make me question the quality of the fish.)
For comparison, here’s the result I got with DALL-E 3 two years ago (ChatGPT didn’t generate images back then):

Historically, AI image generators have struggled with spelling because they typically use diffusion models, which work by reconstructing images from noise.
“Diffusion models […] they reconstruct the input data,” Asmelash Teka Hadgu, founder and CEO of Lesan AI, told TechCrunch in 2024. “We can assume that the captions in the image are a very, very small part, so the image generator learns patterns that cover most of those pixels.”
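To make Hadgu's point concrete, here is a rough back-of-the-envelope sketch, not a description of how any real model is trained. It assumes a pixel-wise reconstruction loss in which every pixel contributes equally, and asks what share of that loss comes from a small caption strip. The image size, the caption size, and the function name `loss_share_of_region` are all made up for illustration.

```python
# Toy illustration: how little of a uniform pixel-wise reconstruction loss
# comes from a small text region. All sizes here are hypothetical.

def loss_share_of_region(height, width, region, per_pixel_error=1.0):
    """Return the fraction of total error that falls inside `region`,
    assuming every pixel is reconstructed with the same error.

    `region` is (top, left, region_height, region_width) in pixels.
    """
    top, left, region_h, region_w = region
    total = 0.0
    inside = 0.0
    for y in range(height):
        for x in range(width):
            total += per_pixel_error
            if top <= y < top + region_h and left <= x < left + region_w:
                inside += per_pixel_error
    return inside / total


# A hypothetical 256x256 image with a 16x64 caption strip.
share = loss_share_of_region(256, 256, region=(120, 96, 16, 64))
print(f"caption share of loss: {share:.2%}")
```

Under these assumptions the caption accounts for 1,024 of 65,536 pixels, or about 1.6% of the loss, so a model can score well while still garbling the letters.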
Since then, researchers have investigated other image generation mechanisms, such as autoregressive models, which predict what the image should look like and function more like an LLM.
Unfortunately, OpenAI declined to answer a press question this week about what kind of model powers ChatGPT Images 2.0.
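As a loose analogy for what "autoregressive" means, the sketch below builds a word one character at a time, always choosing the next character based on the one before it. It is a toy that works on letters rather than image tokens, and it looks only one symbol back, whereas real autoregressive models condition on everything generated so far. The function names and the training word are hypothetical.

```python
# Toy autoregressive generator: each symbol is predicted from the previous one.

def train_bigram_counts(words):
    """Count how often each symbol follows another across the training words."""
    counts = {}
    for word in words:
        symbols = ["<s>"] + list(word) + ["</s>"]  # start and end markers
        for prev, nxt in zip(symbols, symbols[1:]):
            counts.setdefault(prev, {})
            counts[prev][nxt] = counts[prev].get(nxt, 0) + 1
    return counts


def generate(counts, max_len=20):
    """Greedily emit the most frequent next symbol until the end marker."""
    out = []
    prev = "<s>"
    for _ in range(max_len):
        options = counts.get(prev)
        if not options:
            break
        # Sort first so that ties are broken alphabetically and reproducibly.
        nxt = max(sorted(options), key=options.get)
        if nxt == "</s>":
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


counts = train_bigram_counts(["taco"])
print(generate(counts))  # prints "taco"
```

Because each step commits to a specific symbol before moving on, a sequence built this way is assembled piece by piece rather than refined all at once from noise.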
However, the company explained that the new model has “thinking capabilities” that allow it to search the web, create multiple images from a single prompt, and double-check its creations. This allows Images 2.0 to create marketing assets in a variety of sizes, as well as multi-panel comics.
OpenAI also claims that Images 2.0 is better at rendering non-Latin text in languages such as Japanese, Korean, Hindi, and Bengali. The model’s knowledge cutoff is December 2025, which may affect its accuracy on prompts that involve more recent news.
“Images 2.0 delivers an unprecedented level of detail and fidelity in image creation. It can not only conceptualize more sophisticated images, but actually effectively brings that vision to life by being able to follow instructions, preserve the required detail, and render the fine-grained elements that often break image models: small text, iconography, UI elements, dense compositions, and subtle stylistic constraints, all at up to 2K resolution,” OpenAI announced in a press release.
These capabilities mean that generating an image isn’t as quick as typing a question into ChatGPT, but generating something intricate like a multi-panel comic still takes just a few minutes.
All ChatGPT and Codex users will be able to access Images 2.0 starting Tuesday; paid users will be able to generate more advanced outputs. The company will also make the gpt-image-2 API available, with pricing that depends on the quality and resolution of the outputs.
