DeepMind's artificial intelligence made a database with 3D structures of almost every protein known to science

By: Michael Korgs | 29.07.2022, 16:43

Last year, Alphabet's DeepMind released an open-source protein database containing 3D representations of hundreds of thousands of proteins, including nearly all of the roughly 20,000 known proteins in the human body. Now, the AlphaFold Protein Structure Database has been expanded to more than 200 million entries, covering almost every protein known to science.

Proteins are the workhorses of living cells, performing a wide range of functions that are critical for survival. They're formed from chains of amino acids that fold into complicated three-dimensional shapes, which govern their function. Understanding these structures is essential for studying how proteins work and how things may go wrong, which in turn supports research in areas such as new medicines and treatments, as well as crop and animal conservation.

However, calculating the structure of a protein from its amino acid sequence is difficult, a challenge known as the "protein folding problem." Working out a single structure has generally required a significant amount of computing power and many hours of human labor. As a result, progress had long been slow.

That changed when Alphabet's DeepMind AI was put to work on the problem. The system was originally trained on 100,000 known protein structures and was subsequently able to predict the structures of millions of other proteins, with each one taking seconds or minutes rather than months or years to determine.

In July 2021, the AlphaFold Protein Structure Database became available to researchers. It originally included over 350,000 protein structures, including around 98.5 percent of human proteins as well as those found in fruit flies, mice, yeast and E. coli. It was subsequently expanded to include over a million protein structures from 10,000 species of animals, plants, bacteria, fungi and other organisms. Since then, over 500,000 scientists around the world have used the database in their work.

DeepMind has now released an enormous update to the database, which includes roughly 214 million structures from a million species. That covers almost every protein known to science and should give a significant boost to research on disease treatment, vaccine development, environmental sustainability and antibiotic resistance.

The whole database of protein structures can be downloaded from Google Cloud Public Datasets.

Source: newatlas.com