UArizona-led CyVerse Part of $20M NSF Award to Catalyze Biosciences Discoveries
CyVerse, an open science platform led by the University of Arizona, will provide the computing infrastructure for a new center focused on molecular and cellular biosciences that has received a $20 million award from the U.S. National Science Foundation.
NSF’s funding will support the creation of the National Synthesis Center for Emergence in the Molecular and Cellular Sciences, or NCEMS, led by Penn State. The center will enable research that uses existing, publicly available data to glean new insights about how complex biological systems, such as cells, emerge from simpler molecules. Findings from the research could eventually inform the development of disease treatments and other applications such as minimizing the negative effects of aging.
As part of the $20 million, UArizona will receive $1.9 million as the lead institution for CyVerse. CyVerse is the world’s largest publicly funded open-source cyberinfrastructure for life sciences. The platform supports researchers’ data and computing needs and connects collaborators to an even larger framework that is supported by NSF, enabling modern science and data-driven discovery, said research associate professor Tyson Swetnam, CyVerse co-lead and CyVerse lead on NCEMS.
“CyVerse is the plumbing for these projects that involve enormous data sets. It provides a trans-institutional, open-source framework that helps researchers collaborate on national-scale projects like the Synthesis Centers to solve the planet’s greatest challenges,” said Swetnam, who also directs the Open Science Initiative at Arizona’s Institute for Computation and Data-Enabled Insight. “Team science and collaboration would be beyond difficult without something like CyVerse. I’m excited to be a part of NCEMS.”
Advancing biology, biomedicine, renewable energy and more
Within NCEMS, CyVerse will offer computation, data storage, and training in advanced data science research techniques to help remove barriers to large-scale information synthesis. The center will also establish practices that prioritize data integration and reuse and promote a culture that values existing data as much as new data in primary research.
NCEMS will focus initially on emergent properties at the mesoscale, the scale that spans from molecules — such as enzymes, DNA and proteins — to organelles, such as mitochondria. The goal is to understand how changes at the mesoscale influence higher subcellular and cellular outcomes, such as the traits that make individuals unique and distinguishable from one another.
“Many of the grand challenges in biology, such as how living organisms function well in a dynamic environment or suffer dysfunction and disease, are rooted in the mesoscale,” said Ed O’Brien, NCEMS director and Penn State professor of chemistry. “We have a unique opportunity to harness big data and gain a more complete, detailed view of the mosaic of molecular and cellular processes by bringing together diverse datasets and multidisciplinary teams of scientists from across the world.”
NCEMS will work to broaden participation in science, technology, engineering and mathematics, both to democratize access to research and to provide training for careers rich in computational and data sciences. Initial partner institutions include Claflin University, Alcorn State University and Fayetteville State University, all minority-serving institutions. An innovative, remote research experience program will support synthesis research by students nationwide.
"The amount of publicly available data at the molecular and cellular scale is extensive, with each individual resource being valuable. Bringing those data together with the tools to synthesize them ― as this center is planned to do ― will create a whole greater than the parts and will drive advances in biology, biomedicine, renewable energy and more," said NSF Deputy Assistant Director for Biological Sciences Simon Malcomber. "This is the first time we will bring this approach to the molecular and cellular sciences and bring NSF's long history of support for Synthesis Centers to bear on the field."
Supporting research worldwide
In addition to NCEMS, CyVerse provides cyberinfrastructure support for the NSF Directorate for Biological Sciences’ other recent $20 million Synthesis Center, the Environmental Data Science Innovation & Inclusion Lab.
Synthesis centers enable collaborative teams to explore large and complex data sets, perform innovative analysis and disseminate the outcomes of their findings, said Nirav Merchant, CyVerse lead investigator and director of UArizona’s Data Science Institute. In the current era of machine learning and artificial intelligence, NCEMS participants will require a data platform and cyberinfrastructure that can facilitate these data-driven explorations.
“The NCEMS collaboration will provide CyVerse with the opportunity to broaden its cyberinfrastructure capabilities,” Merchant said. “Being part of the two recent NSF Synthesis Centers affirms the collaborative role CyVerse and the University of Arizona play in providing key infrastructure that allows communities to effectively address complex questions.”
"The University of Arizona has a rich history in managing enormous amounts of data thanks to its leadership in astronomy, life sciences and other fields," said University of Arizona President Robert C. Robbins. "CyVerse has built on this record of excellence, and really advanced it for us to take on today's challenges. I am so pleased to see it as part of this important national collaboration."
With $117 million in NSF funding since its creation in 2008, CyVerse is the largest and longest-running NSF investment in cyberinfrastructure for the life sciences.
Led by UArizona in partnership with the Texas Advanced Computing Center and Cold Spring Harbor Laboratory in New York, CyVerse supports more than 137,000 researchers in 169 countries. The platform has appeared in more than 1,700 peer-reviewed publications, has trained more than 45,000 researchers and instructors, and supports $255 million in additional research funding by NSF.