AlphaFold helps scientists discover how protein mutations cause diseases and how to prevent them
Luigi Vitagliano is Research Director at the Institute of Biostructures and Bioimaging in Naples, Italy. He shares his AlphaFold story.
Being a structural biologist in the AlphaFold era is like the early days of gold mining. Before this technology, everyone was laboriously finding single gold nuggets, cleaning them up and looking at them one by one. And then suddenly a gold mine appeared. We couldn’t believe our luck.
I have been studying the proteins encoded in our DNA for 30 years. There are between 20,000 and 100,000 different proteins in most human cells. In some cases, the way the chain of amino acids in a protein takes its shape, also known as “protein folding,” can be riddled with abnormalities that are linked to a range of diseases.
I’ve been looking recently at a family of human proteins, known as potassium channel tetramerization domain (KCTD) proteins, which are particularly poorly understood. What’s fascinating about defects in these proteins, which are caused by genetic mutations, is the range of diseases they’re associated with, from schizophrenia to autism, from leukemia to colon cancer, as well as brain and movement disorders.
Because new proteins are constantly being made inside cells, old or defective ones must be removed. There are 25 types of KCTD proteins in humans, and four-fifths of them seek out other proteins and mark them for degradation and destruction. This process is called ubiquitination, and it is crucial for keeping cells healthy and preventing disease.
When KCTD proteins don’t work properly, the consequences can be devastating to our health. But there’s a lot we don’t know about them. About one-fifth of the KCTD proteins inside cells were a mystery to scientists like me: we had no idea what they did, and therefore no idea how to prevent them from mutating and causing disease. Until now, we had very little structural information about them, which was a major barrier to KCTD research.
The structures predicted by AlphaFold revealed that, throughout evolution, the shapes of these proteins had remained very similar despite very different genetic sequences. This was a significant breakthrough. Previously, we had relied on genetics to assess the similarities or differences between proteins. Based solely on genes, we had assumed that these proteins would be very different.
Using AlphaFold, we were able to build a new evolutionary family tree based on the shape of these proteins, rather than their genetic sequence. Evolutionary trees are typically built using genetic information, but they don’t take structural similarities into account. Structure relates to function, so this approach is exciting: it could reveal all sorts of secrets about which KCTD proteins have similar functions and how those functions evolved over time.
I used AlphaFold to look at and compare the structures of all 25 KCTD proteins for similarities and differences, to identify which parts of these proteins are important. To our delight, AlphaFold’s predicted structures turned out to be very accurate.
For example, we already knew that one section of the KCTD proteins, the BTB domain, was similar across all family members, so we assumed that was the most important part. AlphaFold revealed many additional structural similarities between these proteins and opened up a whole new area of exploration.
For 60 years—including the 30 years I’ve been working in this field—we’ve tried and failed to find the connection between sequences and structures. Generations of brilliant scientists have been unable to solve this problem. Then, almost miraculously, this solution came along. All of our data, the structural information for all of the KCTD family members, comes from AlphaFold. Without it, this study could not have been done at all.
I had a feeling that AlphaFold was a dream. If someone had told me that in two years we would have over 200 million protein structures, I would not have believed it. Now the coming decades are about discovering what exactly these proteins do. There is still much excitement and discovery ahead of us.