More than 160,000 viruses previously unknown to scientists have been identified using a specialized artificial intelligence (AI) program.
The largest study of its kind has highlighted the massive scale of the virosphere, the viruses that inhabit the many environments on Earth.
The researchers used an AI program called LucaProt, which identified previously unrecognized RNA viruses stored in databases of genetic material sourced from ecosystems around the world.
RNA viruses, including coronaviruses, have genetic material made of ribonucleic acid (RNA), usually a single strand, as opposed to the DNA genomes of DNA viruses such as herpesviruses.
The study shows how “transformational” AI has become for scientists “looking to identify protein structures and find divergent viruses,” said virologist Eddie Holmes of the University of Sydney in Australia, who co-led the research.
The LucaProt AI algorithm used in this study works in a similar way to the AlphaFold system that was recognized in this year’s Nobel Prize in Chemistry. Work in AI was also recognized in the Nobel Prize in Physics.
Holmes and his collaborators described the LucaProt system as cutting through the ‘dark matter’ of genetic material.
They began with ‘metagenomic’ samples of genetic material — a mess of information from plants, animals, fungi, bacteria, and ‘non-living’ material like viruses.
Within the known chunks of DNA were lines of unknown code: “Stuff that doesn’t match anything that we know of in our databases,” said Holmes.
The researchers called this ‘dark matter.’
The researchers trained an algorithm called LucaProt to predict which genetic information in the dark matter came from viral RNA species.
In a single 50-gram metagenomic sample taken from an agricultural station south of Sydney, Holmes said more than 1,600 new viruses were found.
In total, the team analyzed more than 10,000 similar metagenomic samples, which led to the discovery of 161,979 potential RNA virus species and 180 RNA virus supergroups.
But 160,000 viruses is a drop in the ocean of viruses yet to be found — probably less than 0.1%. The authors say this hints at the true scale of the world’s virosphere.
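As a rough back-of-the-envelope check on that figure, the following Python sketch works out what "less than 0.1%" would imply if taken literally. The function name and the treatment of 0.1% as an exact upper bound are illustrative assumptions, not part of the study.

```python
def implied_minimum_total(found: int, max_fraction: float) -> float:
    """Smallest total consistent with `found` being at most `max_fraction` of it."""
    if not 0 < max_fraction <= 1:
        raise ValueError("max_fraction must be in (0, 1]")
    return found / max_fraction


# Figures quoted in the article: 161,979 potential species, "less than 0.1%".
found_species = 161_979
upper_fraction = 0.001  # 0.1% as a fraction (assumed here to be an exact bound)

minimum_total = implied_minimum_total(found_species, upper_fraction)
print(f"Implied total: more than {minimum_total:,.0f} RNA virus species")
```

Taken at face value, the estimate would put the total at roughly 160 million RNA virus species or more, although the article offers the percentage only as a probable figure.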
Ben Longdon, an evolutionary biologist at the University of Exeter, UK, said LucaProt is an incredibly useful tool for identifying viruses, and that he is already using it in his research on emerging viral diseases.
LucaProt, Longdon said, shows how AI is helping to discover “tons of things” about viruses, even “outpacing our ability to catalog them and name them.”
Does this study expose any new viral threats to humans? Probably not, says Holmes, as the viruses found in their study likely can’t infect humans.
“Of the 160,000 new ones, none of them are close to mammalian viruses, I don’t think any of these would ever infect a human,” said Holmes.
And even if they could infect humans, there’s no indication they’d be dangerous or disease-causing viruses. As with bacteria, ‘good’ or ‘friendly’ viruses can also be beneficial for health.
Even though the viruses are probably harmless to humans, knowing they exist is vital, said Longdon.
“If we want to understand emerging infectious diseases, we need to understand what viruses are out there, how they’re shared and what factors determine their ability to jump between host species,” Longdon said.
Longdon added that the findings are one step towards understanding the diversity of viruses and how they can evolve to become more or less infectious.
Edited by: Fred Schwaller
Primary Source:
Xin Hou, Yong He, Pan Fang et al. Using artificial intelligence to document the hidden virosphere (2024). Cell. http://dx.doi.org/10.1016/j.cell.2024.09.027