Following the news earlier this year that Google’s AI arm, DeepMind, had developed an AI that had cracked the protein-folding problem, DeepMind has made the 3D shapes of more than 350,000 proteins available to researchers around the world for free.
The protein-folding problem — the question of how a protein’s amino acid sequence dictates its three-dimensional atomic structure — has bedeviled researchers for decades, requiring painstaking work to solve, one protein at a time.
The knowledge bank shared by DeepMind is therefore no small gift to the scientific community. It includes the predicted shapes of almost all of the 20,000 proteins expressed by the human body, a huge tranche of information that doubles humanity’s understanding of the human proteome (a proteome is the complete set of proteins expressed by an organism).
Proteins, made up of long chains of amino acids, are one of the body’s building blocks and are essential to all life as we know it. There are millions of known protein sequences. The sequence is only a part of the picture, though; just as important is the shape. Attractions and repulsions between the amino acids cause the chains to fold into incredibly complex three-dimensional shapes that effectively determine the protein’s function. For example, is the protein a hormone like insulin, with its shape determining how sugar levels are controlled in the blood? Or is it an antibody, whose shape determines how it fights a virus?
Researchers previously tackled the protein-folding problem by applying experimental methods such as nuclear magnetic resonance, cryo-electron microscopy, and X-ray crystallography. It was time-consuming, laborious work.
DeepMind’s AI model, called AlphaFold 2, predicts the 3D shape of a protein based only on its amino acid sequence, with accuracy comparable to experimental methods. It’s an attention-based neural network trained on the 170,000 protein shapes that have been revealed by experimental methods so far. Training used 16 Google TPU devices and took a few weeks. While that might seem like a lot, it’s in fact rather modest compared with some state-of-the-art models used today. And it’s not bad, considering the number of possible shapes any given amino acid chain can twist itself into is about a googol cubed (1 followed by 300 zeroes).
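To put the "googol cubed" figure in concrete terms, here is a minimal Python sketch of the arithmetic. It only checks the number quoted above; the names `GOOGOL` and `googol_cubed` are illustrative and have nothing to do with AlphaFold 2 itself.

```python
# A googol is 10**100. Python integers have arbitrary precision,
# so the result below is exact rather than a floating-point estimate.
GOOGOL = 10 ** 100


def googol_cubed() -> int:
    """Return a googol raised to the third power."""
    return GOOGOL ** 3


value = googol_cubed()
digits = str(value)

# The decimal form is a single '1' followed only by zeros,
# so the number of zeros is the total length minus one.
zero_count = len(digits) - 1
print(zero_count)
```

Cubing 10^100 multiplies the exponent by three, which is where the 300 zeroes come from.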
The now-public database of AlphaFold 2 predictions will help researchers analyze life at the atomic scale, increasing our understanding of how proteins function. The implications for medicine in particular are enormous: Imagine being able to discover how diseases function at a molecular level and design personalized medicines for patients. We are also now better placed to work on solutions for antibiotic resistance, create more nutritious crops, develop enzymes that can break down plastics, and tackle climate change.
DeepMind has said it will initially collaborate with researchers working on malaria, sleeping sickness, and leishmaniasis (a parasitic disease), but it plans to expand the database to almost every protein known to science.
Not since the Human Genome Project have we learned so much about ourselves in such a short time, or unlocked such a wealth of knowledge about life as we know it. AlphaFold 2 has already made a significant contribution toward our understanding of biology at a fundamental level and will accelerate research and enable large-scale bioinformatics for years to come.
To those who saw AI being taught to play games like Go and even video games like StarCraft II and thought it was pointless: Here’s the proof that it wasn’t. AI has a lot to teach us about ourselves and a lot to show us about the most fundamental aspects of life itself.
This article was originally published on EE Times Europe.
Sally Ward-Foxton covers AI technology and related issues for EETimes.com and all aspects of the European industry for EE Times Europe magazine. Sally has spent more than 15 years writing about the electronics industry from London, UK. She has written for Electronic Design, ECN, Electronic Specifier: Design, Components in Electronics, and many more. She holds a master’s degree in Electrical and Electronic Engineering from the University of Cambridge.