DeepMind is using AI to pinpoint the causes of genetic disease
With the rise of gene sequencing, doctors can now decode people’s genomes and then scour the DNA data for possible culprits. Sometimes, the cause is clear, like the mutation that leads to cystic fibrosis. But in about 25% of cases where extensive gene sequencing is done, scientists will find a suspicious DNA change whose effects aren’t fully understood, says Heidi Rehm, director of the clinical laboratory at the Broad Institute, in Cambridge, Massachusetts.
Scientists call these mystery mutations “variants of uncertain significance,” and they can appear even in exhaustively studied genes like BRCA1, a notorious hot spot of inherited cancer risk. “There is not a single gene out there that does not have them,” says Rehm.
DeepMind says AlphaMissense can help in the search for answers by using AI to predict which DNA changes are benign and which are “likely pathogenic.” The model joins previously released programs, such as one called PrimateAI, that make similar predictions.
“There has been a lot of work in this space already, and overall, the quality of these in silico predictors has gotten much better,” says Rehm. However, she says computer predictions are only “one piece of evidence,” and on their own they can’t convince her that a DNA change is really making someone sick.
Typically, experts don’t declare a mutation pathogenic until they have real-world data from patients, evidence of inheritance patterns in families, and lab tests—information that’s shared through public websites of variants such as ClinVar.
“The models are improving, but none are perfect, and they still don’t get you to pathogenic or not,” says Rehm, who adds that she was “disappointed” that DeepMind seemed to exaggerate the medical certainty of its predictions by describing variants as benign or pathogenic.
Fine tuning
DeepMind says the new model is based on AlphaFold, its earlier model for predicting protein shapes. Even though AlphaMissense does something very different, says Pushmeet Kohli, a vice president of research at DeepMind, the software is somehow “leveraging the intuitions it gained” about biology from its previous task. Because it was based on AlphaFold, the new model requires less computer time to run, and therefore less energy, than if it had been built from scratch.
In technical terms, the model is pre-trained and then adapted to a new task in an additional step called fine-tuning. For this reason, Patrick Malone, a doctor and biologist at KdT Ventures, believes that AlphaMissense is “an example of one of the most important recent methodological developments in AI.”
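For readers who want to see the pre-train-then-fine-tune idea in miniature, here is a toy sketch in Python. It is not DeepMind's code and has nothing to do with proteins: the two-parameter "model," the tasks, and every name in it (`backbone`, `head`, `train`, `task_a`, `task_b`) are assumptions made up for illustration. The only point is the pattern: learn one part of a model on a first task, then freeze it and adjust a small remaining part on a second task with very little data.

```python
# Toy illustration only: a "model" with two numbers, a backbone and a head.
# prediction = head * (backbone * x)

def predict(w, x):
    return w["head"] * (w["backbone"] * x)

def loss(w, data):
    # Mean squared error over (x, y) pairs.
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def train(w, data, trainable, lr=0.05, steps=500):
    """Gradient descent that only updates the parameters named in `trainable`."""
    w = dict(w)  # copy so the caller's weights are left untouched
    for _ in range(steps):
        grads = {"backbone": 0.0, "head": 0.0}
        for x, y in data:
            err = predict(w, x) - y
            grads["head"] += 2 * err * w["backbone"] * x / len(data)
            grads["backbone"] += 2 * err * w["head"] * x / len(data)
        for name in trainable:
            w[name] -= lr * grads[name]
    return w

# Hypothetical first task with plenty of data: y = 3x.
task_a = [(i / 10, 3.0 * i / 10) for i in range(-10, 11)]
# Hypothetical second task with only three examples: y = -6x.
task_b = [(-1.0, 6.0), (0.5, -3.0), (1.0, -6.0)]

initial = {"backbone": 0.5, "head": 1.0}

# Pre-training: learn the backbone on the first task.
pretrained = train(initial, task_a, trainable=["backbone"])

# Fine-tuning: keep the backbone frozen, adapt only the head to the second task.
finetuned = train(pretrained, task_b, trainable=["head"])

print("loss on second task before fine-tuning:", loss(pretrained, task_b))
print("loss on second task after fine-tuning:", loss(finetuned, task_b))
```

In this sketch the fine-tuning step touches a single number and sees only three examples, which loosely mirrors why reusing an existing model can take less computation than starting over.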