Scientists everywhere can now access Evo 2, a powerful new foundation model that understands the genetic code of all domains of life. Launched today as the largest publicly available AI model for genomic data, it was built on the NVIDIA DGX Cloud platform in a collaboration led by Arc Institute, a nonprofit biomedical research organization, and Stanford University.
Evo 2 is available to developers worldwide on the NVIDIA BioNeMo platform, including as an NVIDIA NIM microservice for easy, secure AI deployment.
Trained on a vast dataset of approximately 9 trillion nucleotides, the building blocks of DNA and RNA, Evo 2 can be applied to biomolecular research applications such as predicting the form and function of proteins from their genetic sequences, identifying novel molecules for healthcare and industrial applications, and evaluating how gene mutations affect their function.
“Evo 2 represents a major milestone in generative genomics,” said Patrick Hsu, co-founder and core investigator of Arc Institute and professor of bioengineering at the University of California, Berkeley. “By advancing our understanding of these fundamental building blocks of life, we can pursue solutions in healthcare and environmental science that are unimaginable today.”
The NVIDIA NIM microservice for Evo 2 lets users generate a variety of biological sequences, with settings to adjust model parameters. Developers interested in fine-tuning Evo 2 on their own datasets can download the model through the open-source NVIDIA BioNeMo Framework, a collection of accelerated computing tools for biomolecular research.
“Designing new biology has traditionally been a laborious, unpredictable and artisanal process,” said Brian Hie, assistant professor of chemical engineering at Stanford University, Dieter Schwarz Foundation Stanford Data Science Faculty Fellow and Arc Institute innovation investigator. “With Evo 2, we make the biological design of complex systems more accessible to researchers, enabling new and beneficial advances in a fraction of the time it would previously have taken.”
Enabling complex scientific research
Founded in 2021 with $650 million from donors, Arc Institute helps researchers tackle long-term scientific challenges by providing scientists with multi-year funding.
Its core investigators receive cutting-edge lab space and funding for renewable eight-year terms, which can be held concurrently with a faculty appointment at one of the institute's university partners: Stanford University, the University of California, Berkeley, and the University of California, San Francisco.
By combining this unique research environment with NVIDIA’s accelerated computing expertise and resources, researchers at Arc Institute can pursue more complex projects, analyze larger datasets and achieve results more quickly. Its scientists focus on disease areas such as cancer, immune dysfunction and neurodegeneration.
NVIDIA accelerated the Evo 2 project by giving scientists access to 2,000 NVIDIA H100 GPUs via NVIDIA DGX Cloud on AWS. DGX Cloud provides short-term access to large computing clusters, giving researchers the flexibility to innovate. The fully managed AI platform includes NVIDIA BioNeMo, with optimized software in the form of NVIDIA NIM microservices and NVIDIA BioNeMo Blueprints.
NVIDIA researchers and engineers also collaborated closely on scaling and optimizing the AI.
Applications across the biomolecular sciences
Evo 2 can provide insights into DNA, RNA and proteins. Trained on a wide variety of species across the domains of life, including plants, animals and bacteria, the model can be applied to scientific fields such as healthcare, agricultural biotechnology and materials science.
Evo 2 uses a new model architecture that can process long sequences of genetic information, up to 1 million tokens. This widened view into the genome could unlock scientists’ understanding of the connections between distant parts of an organism’s genetic code and the mechanisms of cell function, gene expression and disease.
“A single human gene contains thousands of nucleotides, so for an AI model to analyze how such complex biological systems work, it needs to process the largest possible portion of a genetic sequence at once,” Hsu said.
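To make the 1-million-token figure concrete, here is a minimal sketch of checking whether a sequence fits in a single context window. It assumes one token per nucleotide, which the article does not state, and the constant and function names are hypothetical illustrations, not part of any NVIDIA or Arc Institute API.

```python
# Assumption: one token per nucleotide. The article only says "up to 1 million tokens".
CONTEXT_TOKENS = 1_000_000


def fits_in_context(sequence, limit=CONTEXT_TOKENS):
    """Return True if the whole sequence can be processed in one window."""
    return len(sequence) <= limit


def split_into_windows(sequence, window=CONTEXT_TOKENS):
    """Split a nucleotide string into consecutive windows of at most `window` tokens.

    Anything split this way loses the long-range context between windows,
    which is why a larger window matters for distant interactions.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [sequence[i:i + window] for i in range(0, len(sequence), window)]
```

Under that assumption, a gene of a few thousand nucleotides fits in one window with plenty of room for its surrounding regions, while a sequence longer than the limit would have to be split.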
In healthcare and drug discovery, Evo 2 could help researchers understand which gene variants are linked to a particular disease and design new molecules that precisely target those regions. For example, in tests with BRCA1, a gene associated with breast cancer, researchers at Stanford University and Arc Institute found that the model could predict with 90% accuracy whether previously unrecognized mutations affect gene function.
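The article does not describe how Evo 2 scores mutations. One common way to use a sequence model for this kind of task is to compare how likely the model finds the mutated sequence relative to the reference. The sketch below illustrates that idea only: the k-mer model is a toy stand-in, not Evo 2, and every function name is hypothetical.

```python
import math
from collections import Counter


def train_toy_model(sequences, k=2, alphabet="ACGT"):
    """Fit add-one-smoothed next-nucleotide probabilities given the previous k bases.

    Toy stand-in for a genomic language model; NOT Evo 2.
    Returns a function mapping a sequence to its log-likelihood.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k):
            context, nxt = seq[i:i + k], seq[i + k]
            context_counts[context] += 1
            transition_counts[(context, nxt)] += 1

    def log_prob(seq):
        total = 0.0
        for i in range(len(seq) - k):
            context, nxt = seq[i:i + k], seq[i + k]
            numerator = transition_counts[(context, nxt)] + 1
            denominator = context_counts[context] + len(alphabet)
            total += math.log(numerator / denominator)
        return total

    return log_prob


def apply_substitution(reference, position, alt):
    """Return the reference with a single base replaced at a 0-based position."""
    return reference[:position] + alt + reference[position + 1:]


def variant_score(log_prob, reference, position, alt):
    """Log-likelihood of the mutant minus that of the reference.

    More negative means the model finds the mutation more surprising.
    """
    mutant = apply_substitution(reference, position, alt)
    return log_prob(mutant) - log_prob(reference)


if __name__ == "__main__":
    model = train_toy_model(["ACGTACGTACGT"])
    print(variant_score(model, "ACGTACGTACGT", 5, "A"))
```

In a real study, scores like these would be compared against experimentally labeled variants to measure accuracy. The 90% figure above comes from the researchers' own evaluation, not from anything resembling this toy model.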
In agriculture, the model could help address global food shortages by providing insights into plant biology and helping scientists develop crops that are more climate-resilient or more nutritious. In other scientific fields, Evo 2 could be applied to design biofuels or engineer proteins that break down oil or plastic.
“Deploying a model like Evo 2 is like sending a powerful new telescope out to the farthest reaches of the universe,” said Dave Burke, Arc’s chief technology officer. “We know there is immense opportunity for exploration, but we don’t yet know what we will discover.”
Learn more about Evo 2 on the NVIDIA Technical Blog and in Arc Institute’s technical report.
Please refer to the Software Product Information Notice.