And even when the AI is right, it cannot replicate the knowledge that doctors build through experience, says fertility doctor Jaime Knopman. When patients at her clinic in Manhattan bring in information from AI chatbots, it is not necessarily incorrect, but what the LLM suggests may not be the best approach for that particular patient's case.
For example, couples undergoing IVF receive viability assessments for their embryos. But asking ChatGPT for recommendations on next steps based on those results alone leaves out other critical factors, Knopman says. "It's not just about the grade: other things come into play," such as whether the embryo was biopsied, the state of the patient's uterine lining, and whether the patient has had success with fertility treatment in the past. Beyond her years of training and medical education, Knopman says she has "taken care of thousands of women." That, she says, gives her real-world insight into which next steps to take, something the LLM is missing.
Knopman says other patients arrive having already decided how they want their embryo transferred, based on an answer they got from an AI. But although the suggested method may be common, other courses of action may be better suited to the patient's specific circumstances, she says. "There's the science we study and learn of how to do it, but there's an art of why one modality or treatment protocol is better for a patient than a different one," she says.
Some of the companies behind these AI chatbots have built tools to address concerns about the medical information they dispense. OpenAI, the company behind ChatGPT, announced on May 12 that it had introduced HealthBench, a system designed to measure AI's capabilities in answering health questions. OpenAI says HealthBench was built with the help of more than 260 physicians across 60 countries and includes 5,000 simulated health conversations between users and AI models, along with a physician-written scoring guide for evaluating responses. The company says that with earlier versions of its AI models, doctors could improve the answers the chatbot generated, but that its latest models, available as of April 2025, such as GPT-4.1, produced answers as good as or better than the doctors' own.
"Our findings show that large language models have improved significantly over time and already outperform experts in writing responses to examples tested in our benchmark," OpenAI says on its website. "Yet even the most advanced systems still have substantial room for improvement, particularly in seeking necessary context for underspecified queries and worst-case reliability."
Other companies are building health-specific tools designed expressly for doctors. Microsoft says it has created a new AI system, called the MAI Diagnostic Orchestrator (MAI-DxO), that diagnosed patients four times as accurately as doctors. The system poses questions to several leading large language models, including OpenAI's GPT, Google's Gemini, Anthropic's Claude, Meta's Llama, and xAI's Grok, in a way that loosely mimics a panel of human experts collaborating on a case.
New doctors will have to learn to use these AI tools and to advise patients who use them, says Bernard S. Chang, dean of medical education at Harvard Medical School. That is why his school was one of the first to offer students classes on how to use the technology in their practice. "It's one of the most exciting things happening right now in medical education," Chang says.
The situation reminds Chang of when people began turning to the internet for medical information 20 years ago. Patients would come to him and say, "I hope you're not one of those doctors who uses Google." But as the search engine became ubiquitous, he wanted to tell those patients: "You wouldn't want to go to a doctor who didn't." He sees the same thing happening now with AI. "What kind of doctor practices medicine and doesn't use this powerful tool?"
