Ductal carcinoma in situ (DCIS) is a type of preinvasive tumor that can sometimes progress to a highly aggressive form of breast cancer. It accounts for about 25 percent of all breast cancer diagnoses.
Because it is difficult for clinicians to determine the type and stage of DCIS, patients are often overtreated. To address this, an interdisciplinary team of researchers from MIT and ETH Zurich has developed an AI model that can identify the different stages of DCIS from a cheap and easy-to-obtain image of breast tissue. Their model shows that both the state and the arrangement of cells in a tissue sample are important for determining the stage of DCIS.
Because such tissue images are so easy to obtain, the researchers were able to build one of the largest datasets of its kind, which they used to train and test their model. When they compared its predictions with a pathologist's conclusions, they found clear agreement in many cases.
In the future, this model could serve as a tool to help clinicians diagnose simpler cases without labor-intensive testing, giving them more time to evaluate cases in which it is less certain whether the DCIS will become invasive.
“We’ve taken the first step in understanding that we should look at the spatial organization of cells when diagnosing DCIS, and now we’ve developed a technique that’s scalable. So we really need a prospective study. Working with a hospital and bringing this to the clinic will be an important step forward,” says Caroline Uhler, a professor in the Department of Electrical Engineering and Computer Science (EECS) and the Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard and a researcher in MIT’s Laboratory for Information and Decision Systems (LIDS).
Uhler, a co-corresponding author of the study, is joined by lead author Xinyi Zhang, a graduate student in EECS and the Eric and Wendy Schmidt Center; co-corresponding author G. V. Shivashankar, a professor of mechano-genomics at ETH Zurich and the Paul Scherrer Institute; and others from MIT, ETH Zurich, and the University of Palermo in Italy. The open-access study was published on July 20 in .
Combining imaging with artificial intelligence
Between 30 and 50 percent of DCIS patients go on to develop highly invasive cancer, but scientists have no biomarkers that can tell doctors which cases will progress.
Scientists can use techniques such as multiplexed staining or single-cell RNA sequencing to determine the stage of DCIS in tissue samples. But these tests are too expensive to be performed widely, Shivashankar explains.
In previous work, the researchers showed that a low-cost imaging technique known as chromatin staining can be as informative as far more expensive single-cell RNA sequencing.
In this study, the researchers hypothesized that combining this single stain with a carefully designed machine-learning model could provide the same information about cancer stage as those costlier techniques.
First, they created a dataset of 560 images of tissue samples from 122 patients at three different stages of disease. They used this dataset to train an AI model that learns to represent the state of each cell in a tissue sample image, which it uses to infer the patient’s cancer stage.
But an individual cell is not necessarily indicative of cancer, so the researchers needed to aggregate cell-level information in a meaningful way.
They designed the model to create clusters of cells in similar states, identifying eight states that are vital markers of DCIS. Some cell states are more characteristic of invasive cancer than others. The model determines the proportion of cells in each state in a tissue sample.
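The overall recipe described here — embed each cell, cluster the embeddings into a small number of states, and summarize a tissue sample by the fraction of its cells in each state — can be sketched in a few lines. The sketch below is purely illustrative: the function names, the use of a toy k-means clustering, and the synthetic data are assumptions for demonstration, not the authors' actual pipeline.

```python
import numpy as np

def cluster_cell_states(embeddings, n_states=8, n_iter=50, seed=0):
    """Toy k-means: assign each cell embedding to one of `n_states` clusters.

    Illustrative stand-in for whatever clustering the authors actually used.
    """
    rng = np.random.default_rng(seed)
    centers = embeddings[rng.choice(len(embeddings), n_states, replace=False)]
    for _ in range(n_iter):
        # Assign each cell to its nearest cluster center.
        dists = np.linalg.norm(embeddings[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster went empty.
        for k in range(n_states):
            if (labels == k).any():
                centers[k] = embeddings[labels == k].mean(axis=0)
    return labels

def state_proportions(labels, n_states=8):
    """Fraction of cells in each state: the per-sample feature vector."""
    counts = np.bincount(labels, minlength=n_states)
    return counts / counts.sum()

# Synthetic example: 200 "cell embeddings" in a 16-dimensional latent space.
rng = np.random.default_rng(1)
emb = rng.normal(size=(200, 16))
labels = cluster_cell_states(emb, n_states=8)
props = state_proportions(labels, n_states=8)
print(props.round(3))  # eight proportions, summing to 1
```

The resulting eight-element proportion vector is the kind of per-sample summary a downstream classifier could use to infer cancer stage.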
Organization matters
“But in cancer, the organization of cells also changes. We found that just having the proportions of cells in each state is not enough. You also need to understand how the cells are organized,” Shivashankar says.
With this knowledge, they designed a model that took into account the proportions and distribution of cell states, which significantly increased its accuracy.
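One simple way to encode "which cells are close to which other cells," in the spirit of the spatial features described above, is a neighbor co-occurrence matrix: for each ordered pair of states (i, j), count how often a cell in state i has a cell in state j nearby. This sketch is an assumption for illustration — the function, the radius threshold, and the toy data are not taken from the study.

```python
import numpy as np

def neighbor_state_matrix(coords, labels, n_states=8, radius=30.0):
    """Count, for each ordered state pair (i, j), how often a cell in state i
    has a cell in state j within `radius` (e.g., in microns).

    A simple, illustrative spatial feature — not the authors' actual model.
    """
    co = np.zeros((n_states, n_states))
    # Pairwise distances between all cells (fine for small samples).
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a cell is not its own neighbor
    near = dists < radius
    for i in range(len(coords)):
        for j in np.flatnonzero(near[i]):
            co[labels[i], labels[j]] += 1
    # Normalize so samples with different cell counts are comparable.
    total = co.sum()
    return co / total if total else co

# Toy sample: 50 cells with 2-D positions and random state labels.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(50, 2))
labels = rng.integers(0, 8, size=50)
feat = neighbor_state_matrix(coords, labels)
print(feat.shape)  # an 8x8 spatial feature to use alongside state proportions
```

Feeding both the proportion vector and a spatial summary like this into a classifier captures not just which states are present, but how they are arranged.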
“It was interesting for us to see how important spatial organization is. Previous studies have shown that cells that are close to the breast duct are important. But it’s also important to consider which cells are close to which other cells,” Zhang says.
When they compared their model’s predictions with samples reviewed by a pathologist, the two showed clear agreement in many cases. In cases that were less clear-cut, the model could provide information about features of the tissue sample, such as the organization of cells, that a pathologist could use in making a decision.
This versatile model could also be adapted to study other types of cancer, and even neurodegenerative diseases, which the researchers are currently exploring.
“We have shown that with the right AI techniques, this simple stain can be very effective. There is still a lot of work to be done, but we need to take cell organization into account in more of our studies,” Uhler says.
This research was funded in part by the Eric and Wendy Schmidt Center at the Broad Institute, ETH Zurich, the Paul Scherrer Institute, the Swiss National Science Foundation, the US National Institutes of Health, the US Office of Naval Research, the MIT Jameel Clinic for Machine Learning and Health, the MIT-IBM Watson AI Lab, and by a Simons Investigator Award.