Friday, February 21, 2025

AI model deciphers the code in proteins that tells them where to go

Proteins are the workhorses that keep our cells running, and our cells contain many thousands of types of proteins, each performing a specialized function. Scientists have long known that a protein's structure determines what it can do. More recently, researchers have come to appreciate that a protein's location is also critical for its function. Cells are full of compartments that help organize their many residents. Along with the well-known organelles that adorn the pages of biology textbooks, these spaces also include a variety of dynamic, membrane-less compartments that concentrate certain molecules together to perform shared functions. Knowing where a given protein localizes, and which other molecules it localizes with, can therefore be useful for better understanding that protein and its role in a healthy or diseased cell, but scientists have lacked a systematic way to predict this information.

Meanwhile, protein structure has been studied for over half a century, culminating in the artificial intelligence tool AlphaFold, which can predict a protein's structure from its amino acid code, the linear string of building blocks that folds to create that structure. AlphaFold and models like it have become widely used tools in research.

Proteins also contain regions of amino acids that do not fold into a fixed structure, but are instead important for helping proteins join dynamic compartments in the cell. MIT Professor Richard Young and colleagues wondered whether the code in those regions could be used to predict protein localization in the same way that other regions are used to predict structure. Other researchers have discovered some protein sequences that code for protein localization, and some have begun developing predictive models for localization. However, scientists did not know whether a protein's localization to any dynamic compartment could be predicted from its sequence, nor did they have a tool comparable to AlphaFold for predicting localization.

Now Young, also a member of the Whitehead Institute for Biological Research; Young lab postdoc Henry Kilgore; Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL); and colleagues have built such a model, which they call ProtGPS. In a paper published on February 6 in the journal Science, with first authors Kilgore and Barzilay lab graduate students Itamar Chinn, Peter Mikhael, and Ilan Mitnikov, the interdisciplinary team debuts its model. The researchers show that ProtGPS can predict which of 12 known types of compartments a protein will localize to, as well as whether a disease-associated mutation will change that localization. In addition, the research team developed a generative algorithm that can design novel proteins to localize to specific compartments.

“My hope is that this is a first step toward a powerful platform that enables people studying proteins to do their research,” says Young, “and that it helps us understand how humans develop into the complex organisms that they are, how mutations disrupt those natural processes, and how to generate therapeutic hypotheses and design drugs to treat dysfunction in a cell.”

The researchers also validated many of the model's predictions with experimental tests in cells.

“It really excited me to be able to go from computational design all the way to trying these things in the lab,” says Barzilay. “There are a lot of exciting papers in this area of AI, but 99.9 percent of them never get tested in real systems. Thanks to our collaboration with the Young lab, we were able to test, and really learn how well our algorithm is doing.”

Development of the model

The researchers trained and tested ProtGPS on two batches of proteins with known localizations. They found that it could predict where proteins end up with high accuracy. The researchers also tested how well ProtGPS could predict changes in protein localization caused by disease-associated mutations within a protein. Association studies have found that many mutations (changes to the sequence of a gene and its corresponding protein) contribute to or cause disease, but the ways in which those mutations lead to disease symptoms remain unknown.

Figuring out the mechanism by which a mutation contributes to disease is important because researchers can then develop therapies that correct that mechanism, preventing or treating the disease. Young and colleagues suspected that many disease-associated mutations might contribute to disease by changing protein localization. For example, a mutation could make a protein unable to join a compartment containing essential partners.

They tested this hypothesis by feeding ProtGPS more than 200,000 proteins with disease-associated mutations, then asking it to predict where those mutated proteins would localize and to measure how much its prediction changed for a given protein from the normal to the mutated version. A large shift in the prediction indicates a likely change in localization.
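One way to picture "how much the prediction changed" is as a distance between two sets of per-compartment probabilities. The sketch below is only an illustration under stated assumptions: the article does not say which metric ProtGPS uses, so total variation distance is swapped in here as a simple stand-in, and the function name, compartment labels, and probability values are all hypothetical.

```python
from typing import Dict


def localization_shift(wild_type: Dict[str, float], mutant: Dict[str, float]) -> float:
    """Total variation distance between two compartment probability distributions.

    Assumption: each dict maps a compartment name to a predicted probability,
    and the values in each dict sum to 1. This metric is an illustrative
    choice, not the one reported for ProtGPS.
    """
    compartments = set(wild_type) | set(mutant)
    return 0.5 * sum(
        abs(wild_type.get(c, 0.0) - mutant.get(c, 0.0)) for c in compartments
    )


# Hypothetical predictions for a normal protein and its mutated version.
wt = {"compartment_a": 0.70, "compartment_b": 0.20, "compartment_c": 0.10}
mut = {"compartment_a": 0.20, "compartment_b": 0.20, "compartment_c": 0.60}

shift = localization_shift(wt, mut)
print(f"prediction shift: {shift:.2f}")
```

A score near 0 means the predicted localization barely moved, while a score near 1 means the mutated protein is predicted to land somewhere almost entirely different. Any cutoff for calling a shift "large" would be a further assumption.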

The researchers found many cases in which a disease-associated mutation appeared to change a protein's localization. They tested 20 examples in cells, using fluorescence to compare where in the cell a normal protein and its mutated version ended up. The experiments confirmed ProtGPS's predictions. Altogether, the findings support the researchers' suspicion that mislocalization may be an underappreciated mechanism of disease, and demonstrate the value of ProtGPS as a tool for understanding disease and identifying new therapeutic avenues.

“The cell is such a complicated system, with so many components and complex networks of interactions,” says Mitnikov. “It's very interesting to think that with this approach we can perturb the system, see the outcome, and so drive the discovery of mechanisms in the cell, or even develop therapeutics based on that.”

The researchers hope that others will begin using ProtGPS in the same way that they use predictive structural models like AlphaFold, advancing various projects on protein function, dysfunction, and disease.

Going beyond prediction to novel generation

The researchers were excited about the possible uses of their prediction model, but they also wanted it to go beyond predicting the localization of existing proteins and allow them to design completely new proteins. The goal was for the model to make up entirely new amino acid sequences that, when formed in a cell, would localize to a desired location. Generating a novel protein that can actually accomplish a function (in this case, localizing to a specific cellular compartment) is extremely difficult. To improve their model's chances of success, the researchers constrained their algorithm to design only proteins like those found in nature. This is a common approach in drug design, for logical reasons: nature has had billions of years to figure out which protein sequences work well and which do not.

Because of the collaboration with the Young lab, the machine learning team was able to test whether their protein generator worked. The model had good results. In one round, it generated 10 proteins intended to localize to the nucleus. When the researchers tested these proteins in the cell, they found that four of them strongly localized to the nucleus, and others may have had slight biases toward that location as well.

“The collaboration between our labs has been so generative for all of us,” says Mikhael. “We've learned how to speak each other's languages, in our case learned a lot about how cells work, and by having the chance to experimentally test our model, we've been able to figure out what we need to do to actually make the model work, and then make it work better.”

Being able to generate functional proteins in this way could improve researchers' ability to develop therapies. For example, if a drug must interact with a target that localizes within a certain compartment, researchers could use this model to design a drug that also localizes there. This should make the drug more effective and reduce side effects, since the drug will spend more time engaging with its target and less time interacting with other molecules and causing off-target effects.

The machine learning team members are excited about the prospect of using what they have learned from this collaboration to design novel proteins with functions beyond localization, which would expand the possibilities for therapeutic design and other applications.

“A lot of papers show that they can design a protein that can be expressed in a cell, but not that the protein has a particular function,” says Chinn. “We actually had functional protein design, and a relatively huge success rate compared to other generative models. That's really exciting to us, and something we would like to build on.”

All of the researchers involved see ProtGPS as an exciting beginning. They anticipate that their tool will be used to learn more about the roles of localization in protein function and of mislocalization in disease. In addition, they are interested in expanding the model's localization predictions to include more types of compartments, testing more therapeutic hypotheses, and designing increasingly functional proteins for therapies or other applications.

“Now that we know that this protein code for localization exists, and that machine learning models can make sense of that code and even create functional proteins using its logic, that opens up the door for so many potential studies and applications,” says Kilgore.
