In biomedicine, segmentation involves annotating the pixels of an important structure in a medical image, such as an organ or cell. AI models can assist clinicians by highlighting pixels that may show signs of a specific disease or anomaly.
However, these models typically provide only one answer, while the problem of medical image segmentation is often far from black and white. Five human annotators can provide five different segmentations, perhaps disagreeing on the existence or extent of nodule boundaries on a lung CT image.
“Having options can help with decision-making. Even just seeing uncertainty in a medical image can influence someone’s decision, so it’s important to take that uncertainty into account,” says Marianne Rakic, a doctoral student in computer science at MIT.
Rakic is the lead author of a paper, written with others from MIT, the Broad Institute of MIT and Harvard, and Massachusetts General Hospital, that introduces a new AI tool that can capture uncertainty in medical images.
Known as Tyche (named after the Greek goddess of chance), the system provides multiple plausible segmentations, each highlighting slightly different areas of the medical image. The user can specify how many options Tyche outputs and select the most appropriate one for their purpose.
Importantly, Tyche can take on new segmentation tasks without needing to be retrained. Retraining is a data-intensive process that involves showing the model many examples and requires substantial machine-learning expertise.
Because it doesn’t require retraining, Tyche could be easier for clinicians and biomedical researchers to use than other methods. It can be applied “out of the box” for a variety of tasks, from identifying lesions on a lung X-ray to pinpointing anomalies on a brain MRI.
Ultimately, this system could improve diagnostics or aid in biomedical research by highlighting potentially crucial information that other AI-based tools might miss.
“Ambiguity has been understudied. If your model completely misses a nodule that three experts say is present but two experts say is not, then you probably should pay attention to that,” adds senior author Adrian Dalca, an assistant professor at Harvard Medical School and MGH and a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).
Their co-authors are Hallee Wong, a graduate student in electrical engineering and computer science; Jose Javier Gonzalez Ortiz PhD ’23; Beth Cimini, associate director of bioimage analysis at the Broad Institute; and John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering. Rakic will present Tyche at the IEEE Conference on Computer Vision and Pattern Recognition, where the paper has been selected as a highlight.
Resolving ambiguity
AI systems for medical image segmentation typically use neural networks. Loosely based on the human brain, neural networks are machine learning models consisting of many connected layers of nodes, or neurons, that process data.
After talking to colleagues at the Broad Institute and MGH who use these systems, the researchers realized that two major issues were limiting their effectiveness: the models couldn’t capture uncertainty, and they had to be retrained for even a slightly different segmentation task.
Some existing methods attempt to address one of these pitfalls, Rakic says, but solving both problems with a single solution has proven extremely difficult.
“If you want to take ambiguity into account, you often have to use an extremely complicated model. With the method we propose, our goal is to make it easy to use with a relatively small model so that it can make predictions quickly,” she says.
The researchers built Tyche by modifying a simple neural network architecture.
The user first gives Tyche a few examples that demonstrate the segmentation task. For instance, these could include several images of lesions from a heart MRI that have been segmented by different human experts, so the model can learn the task and see that ambiguity exists.
The researchers found that just 16 example images, called the “context set,” are enough for the model to make good predictions, though there is no limit to how many examples it can use. The context set enables Tyche to solve new tasks without retraining.
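For the technically curious, a minimal sketch of what an in-context segmentation call could look like appears below, written in Python with PyTorch. The class name, tensor shapes, and interface are hypothetical stand-ins for illustration, not Tyche’s actual API:

```python
import torch
import torch.nn as nn

class InContextSegmenter(nn.Module):
    """Stand-in for an in-context segmentation network. The real model
    would condition on the context set; this stub only sketches the
    interface and returns placeholder predictions."""

    def forward(self, target, ctx_images, ctx_masks, k=5):
        b, _, h, w = target.shape
        # k candidate segmentations per input image (placeholder output)
        return torch.rand(b, k, 1, h, w)

# A context set of 16 expert-annotated examples defines the task;
# swapping in a different context set switches tasks with no retraining.
ctx_images = torch.rand(16, 1, 128, 128)
ctx_masks = (torch.rand(16, 1, 128, 128) > 0.5).float()
target = torch.rand(1, 1, 128, 128)  # the new image to segment

model = InContextSegmenter()
candidates = model(target, ctx_images, ctx_masks, k=5)
print(candidates.shape)  # torch.Size([1, 5, 1, 128, 128])
```

The key design point is that the task is specified by data at inference time rather than baked in by training, which is what lets a single trained network move from, say, lung nodules to heart lesions by swapping the context set.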
To help Tyche capture uncertainty, the researchers modified the neural network to make multiple predictions based on a single medical image and a context set. They adjusted the network’s layers so that, as the data moved from layer to layer, the candidate segmentations produced at each stage could “talk” to each other and to the examples in the context set.
This way, the model can ensure that the candidate segmentations are all slightly different while still accomplishing the task.
“It’s like rolling dice. If your model can roll a two, a three, or a four, but doesn’t know you already have a two and a four, then either one might appear again,” she says.
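One simple way to let parallel candidates exchange information is to mix each candidate’s features with statistics computed across all candidates, so every candidate “knows” what the others are predicting. The sketch below illustrates that general idea; the layer, names, and shapes are assumptions for illustration, not Tyche’s exact architecture:

```python
import torch
import torch.nn as nn

class CandidateInteraction(nn.Module):
    """Illustrative layer letting k candidate feature maps 'talk' to
    each other: each candidate sees the mean over all candidates and
    can therefore learn to differ from the rest. A sketch of the idea,
    not Tyche's actual layer."""

    def __init__(self, channels):
        super().__init__()
        # Fuse each candidate's features with the cross-candidate mean.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feats):
        # feats: (batch, k, channels, h, w)
        b, k, c, h, w = feats.shape
        mean = feats.mean(dim=1, keepdim=True).expand(-1, k, -1, -1, -1)
        mixed = torch.cat([feats, mean], dim=2)        # (b, k, 2c, h, w)
        out = self.fuse(mixed.reshape(b * k, 2 * c, h, w))
        return out.reshape(b, k, c, h, w)

feats = torch.rand(2, 5, 8, 32, 32)   # 5 candidates, 8 feature channels
layer = CandidateInteraction(8)
print(layer(feats).shape)  # torch.Size([2, 5, 8, 32, 32])
```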
They also adjusted the training process so that the model is rewarded for maximizing the quality of its best prediction.
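In code, such a “best-of-k” objective might look like the following sketch, which scores each candidate with a soft Dice loss and keeps only the best one, leaving the other candidates free to differ. The function name and shapes are illustrative assumptions, not the paper’s exact objective:

```python
import torch

def best_candidate_loss(candidates, target, eps=1e-6):
    """Soft-Dice loss computed for each of k candidates, keeping only
    the best one, so the model is rewarded for having at least one
    high-quality prediction. A sketch of the idea, not the paper's
    exact training objective."""
    # candidates: (k, h, w) probabilities; target: (h, w) binary mask
    inter = (candidates * target).sum(dim=(1, 2))
    union = candidates.sum(dim=(1, 2)) + target.sum()
    dice = (2 * inter + eps) / (union + eps)   # one score per candidate
    return (1 - dice).min()                    # only the best matters

candidates = torch.rand(5, 64, 64)             # 5 candidate masks
target = (torch.rand(64, 64) > 0.5).float()
print(float(best_candidate_loss(candidates, target)))
```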
If a user requests five predictions, they can see all five medical image segmentations Tyche produced, even though one may be better than the others.
The researchers also developed a version of Tyche that can be used with an existing, pre-trained model for medical image segmentation. In this case, Tyche enables the model to produce multiple candidates by making slight transformations to the input images.
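A minimal sketch of that general idea, using simple flips as the transformations (an assumption; the transformations Tyche actually applies may differ), could look like this:

```python
import torch

def candidates_from_pretrained(model, image):
    """Obtain multiple candidate segmentations from a frozen,
    pre-trained model by applying small invertible transformations
    (here, flips) and undoing them afterward. Illustrative only."""
    # image: (1, 1, h, w); flip over no axis, width, height, or both
    flip_dims = [(), (3,), (2,), (2, 3)]
    outputs = []
    with torch.no_grad():
        for dims in flip_dims:
            x = torch.flip(image, dims) if dims else image
            y = model(x)
            y = torch.flip(y, dims) if dims else y  # undo the transform
            outputs.append(y)
    return torch.cat(outputs, dim=0)  # (4, 1, h, w) candidate masks

# Toy stand-in for a pre-trained network; any image-to-mask model works.
model = torch.nn.Sequential(
    torch.nn.Conv2d(1, 1, kernel_size=3, padding=1),
    torch.nn.Sigmoid(),
)
image = torch.rand(1, 1, 64, 64)
print(candidates_from_pretrained(model, image).shape)  # (4, 1, 64, 64)
```

Because the pre-trained model itself is left untouched, each transformed input yields a slightly different segmentation, giving the user a set of candidates rather than a single answer.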
Better, faster predictions
When the researchers tested Tyche with datasets of annotated medical images, they found that its predictions took into account the diversity of human annotators, and its best predictions were better than any of the baseline models. Tyche also ran faster than most models.
“Putting multiple candidates out there and making sure they’re different from each other really gives you an advantage,” Rakic says.
The researchers also noted that Tyche could outperform more complex models that were trained using a large, specialized dataset.
In future work, they plan to try a more versatile context set, perhaps including text or multiple types of images. They also want to explore methods that could improve Tyche’s worst predictions and enhance the system so it can recommend the best segmentation candidates.
This research is funded in part by the National Institutes of Health, the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard, and Quanta Computer.