
Enformer: Advancing Genetic Research with Transformer Architecture for Predicting Gene Expression


Enformer: Advancing Genetic Research with AI

Transformers, a neural network architecture best known from natural language processing, are improving our ability to predict how DNA sequences affect gene expression. The architecture that applies them to this problem is called Enformer.

Understanding the Human Genome

When the Human Genome Project succeeded in mapping the DNA sequence of the human genome, the scientific community became optimistic about unraveling the genetic instructions that govern human health and development. DNA carries the genetic information that determines characteristics ranging from eye color to susceptibility to certain diseases. Genes, which make up only about 2% of the genome, contain instructions for the amino acid sequences of proteins. The remaining 98% of the genome, known as non-coding DNA, contains less well understood instructions about when and where genes should be expressed in the human body. At DeepMind, we believe that AI can accelerate scientific progress in understanding these complex domains and their impact on human health.

Introducing Enformer

In collaboration with our Alphabet colleagues at Calico, we have introduced Enformer, a neural network architecture that greatly improves the accuracy of predicting gene expression from DNA sequence. Our research, published in Nature Methods, offers a new approach to studying gene regulation and the causal factors in disease. To facilitate further investigation, we have made the Enformer model and its initial predictions for common genetic variants openly available on GitHub.

Revolutionizing Gene Expression Prediction

Previous work on predicting gene expression typically used convolutional neural networks as building blocks. However, these models struggled to capture the influence of distal enhancers, regulatory elements that lie far from the gene they control. Motivated by these limitations, we developed a new architecture based on Transformers, which are commonly used in natural language processing. This adaptation allows us to process much longer DNA sequences and to model the impact of regulatory elements on gene expression at distances more than 5 times greater than previous methods (up to about 100,000 base pairs from the gene, compared with about 20,000).
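To make the distance limitation concrete, the sketch below compares how many input positions a stack of convolutions can see with how many a single self-attention layer can see. It uses the standard receptive-field formula for stride-1 convolutions. The layer configuration is a hypothetical toy stack chosen for illustration, not the configuration of Enformer or of any earlier model, and the function names are our own.

```python
def conv_receptive_field(kernel_sizes, dilations):
    """Input positions visible to one output of stacked stride-1 convolutions."""
    receptive_field = 1
    for kernel, dilation in zip(kernel_sizes, dilations):
        # Each layer widens the window by (kernel - 1) * dilation positions.
        receptive_field += (kernel - 1) * dilation
    return receptive_field


def attention_receptive_field(sequence_length):
    """A self-attention layer lets every position attend to every other one."""
    return sequence_length


# Hypothetical toy stack: six layers, kernel width 3, dilation doubling per layer.
toy_kernels = [3] * 6
toy_dilations = [1, 2, 4, 8, 16, 32]

conv_rf = conv_receptive_field(toy_kernels, toy_dilations)
attn_rf = attention_receptive_field(sequence_length=1000)

print(f"convolution stack sees {conv_rf} positions")
print(f"one attention layer sees {attn_rf} positions")
```

In this toy setting the convolution stack only reaches a window that grows with depth and dilation, while attention spans the whole input in one layer. This is the intuition behind using Transformers to capture distal regulatory elements.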

Understanding Enformer’s Predictions

To understand how Enformer arrives at its predictions, we can examine which parts of the input DNA sequence influence them most. Enhancers located more than 50,000 base pairs away from the gene contribute significantly to the predictions, which matches biological intuition. Enformer also identifies insulator elements, which separate independently regulated regions of DNA.
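One generic way to find influential parts of a sequence is in-silico mutagenesis: change each base in turn and record how much the prediction moves. The sketch below shows the idea. It is not necessarily the method used for Enformer, and `toy_expression` is a hypothetical stand-in for a trained model that simply counts two made-up motifs.

```python
BASES = "ACGT"


def toy_expression(seq):
    """Hypothetical stand-in for a trained model: scores two made-up motifs."""
    return 2.0 * seq.count("GATA") + 1.0 * seq.count("TATA")


def mutagenesis_scores(seq, predict):
    """Largest absolute prediction change from a single substitution at each position."""
    baseline = predict(seq)
    scores = []
    for i, ref in enumerate(seq):
        deltas = [
            abs(predict(seq[:i] + alt + seq[i + 1:]) - baseline)
            for alt in BASES
            if alt != ref
        ]
        scores.append(max(deltas))
    return scores


# Toy sequence: a "GATA" motif at positions 2-5 and a "TATA" motif at 14-17.
sequence = "CCGATACCCCCCCCTATACC"
scores = mutagenesis_scores(sequence, toy_expression)

# Positions where some substitution changes the prediction at all.
influential = [i for i, score in enumerate(scores) if score > 0]
print(influential)
```

Positions inside the two motifs receive non-zero scores and the filler positions receive zero. With a real model, `predict` would be replaced by a call to that model, and the same loop would highlight enhancers and other regulatory elements.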

Applications in Genetic Research

Enformer’s primary application is predicting how changes to individual DNA letters, known as genetic variants, alter gene expression. It predicts the effects of these variants more accurately than previous models, for both natural and synthetic variants. This capability is particularly valuable for interpreting disease-associated variants identified through genome-wide association studies. By helping to distinguish true associations from false positives, Enformer can aid in identifying the genetic factors that contribute to complex diseases.
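Variant effect prediction with a sequence model is commonly framed as a difference: predict expression for the reference sequence, predict again with the variant applied, and subtract. The sketch below is a minimal illustration under that assumption. `toy_expression` is again a hypothetical stand-in for a trained model, and the sequence, variants, and helper names are invented for the example.

```python
def toy_expression(seq):
    """Hypothetical stand-in for a trained model: scores two made-up motifs."""
    return 2.0 * seq.count("GATA") + 1.0 * seq.count("TATA")


def apply_variant(seq, pos, ref, alt):
    """Return seq with the base at pos changed from ref to alt."""
    if seq[pos] != ref:
        raise ValueError(f"expected {ref!r} at position {pos}, found {seq[pos]!r}")
    return seq[:pos] + alt + seq[pos + 1:]


def variant_effect(seq, variant, predict):
    """Predicted expression with the variant minus predicted expression without it."""
    pos, ref, alt = variant
    return predict(apply_variant(seq, pos, ref, alt)) - predict(seq)


reference = "CCGATACCCCCCCCTATACC"
# Each variant is (zero-based position, reference base, alternate base).
variants = [(0, "C", "T"), (3, "A", "C"), (15, "A", "G")]

effects = {v: variant_effect(reference, v, toy_expression) for v in variants}

# Rank variants by the size of their predicted effect, largest first.
ranked = sorted(variants, key=lambda v: abs(effects[v]), reverse=True)
for v in ranked:
    print(v, effects[v])
```

Ranking candidate variants by predicted effect is one way such scores could help prioritize which associations from a genome-wide association study to follow up.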

A Step Forward in Understanding Genomic Sequences

While there are still many mysteries within the human genome, Enformer represents a significant advancement in understanding the complexity of genomic sequences. If you’re interested in leveraging AI to unravel the secrets of fundamental cell processes and contribute to genomics research, consider joining our team at DeepMind. We also welcome collaborations with other researchers and organizations striving to solve the unanswered questions in genomics.
