As in real estate, a key factor in gene expression and interaction is location, location, location. The same genes in different regions of a body may perform totally different functions. Determining the fate of these genes involves systematically mapping and detecting their spatial expression patterns, an unwieldy challenge because of the reams of data researchers must process.
It is a problem that researchers at the U.S. Department of Energy’s Lawrence Berkeley National Laboratory and UC Berkeley are tackling with a new statistical method for extracting meaningful information from spatial gene expression data.
“We’re investigating the process of development, of how the undifferentiated fertilized egg grows into a fully formed organism such as a fly or human being,” said Erwin Frise, a Berkeley Lab scientist in the Division of Environmental Genomics and Systems Biology.
The challenge is born out of the success of genome sequencing – and the large volumes of RNA data subsequently generated – in recent decades. Frise and Berkeley Lab senior scientist Susan Celniker, in particular, have been working with the fruit fly genome as core members of the Berkeley Drosophila Genome Project, which Celniker co-directs.
Since the successful sequencing of the Drosophila melanogaster genome in 2000, the researchers have amassed a large-scale collection of gene expression data covering the stages of a fruit fly’s life, from egg to fully developed larva.
“Our datasets are large and complex, requiring new methods to extract useful biological information about the regulatory networks that direct genes to differentiate into particular cells and tissue,” said Celniker. “We determined spatial gene expression patterns, gene by gene, and now the goal is to identify the organizing principles.”
The Berkeley Lab researchers teamed up with colleagues at UC Berkeley’s Department of Statistics to tackle this challenge. Bin Yu, a professor of statistics, and her graduate student, Siqi Wu, worked with Frise to develop a learning algorithm that partitions biological spatial data or images into building blocks the researchers called principal patterns.
“These principal patterns correspond to pre-organ regions, areas that will become the different parts of the body,” said Yu. “Genes work together to develop these pre-ordered regions.”
The team applied this algorithm to data obtained from fruit fly embryos only 1 to 3 hours old. Notably, in the development of organisms, cell fates are determined before structural features are visible.
The algorithm detected signs of related gene functions in the data to create a topological map of genetic regions destined to become the brain, midgut, mesoderm and more. The map contains 21 different principal patterns. The researchers confirmed the results by matching them with a traditional fate map.
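To make the idea of partitioning spatial expression data into building blocks concrete, here is a minimal sketch in Python. The article does not name the factorization the team used, so the sketch swaps in plain nonnegative matrix factorization with multiplicative updates as an illustrative stand-in; it should not be read as the researchers' method. The six-position "embryo," the four genes, the two hidden regions and every function name below are hypothetical and exist only for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frob_error(v, w, h):
    """Squared reconstruction error between v and the product w * h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor nonnegative v (positions x genes) into w (positions x k) and
    h (k x genes) using multiplicative updates. Assumed stand-in technique."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Update the gene loadings while holding the spatial patterns fixed.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # Update the spatial patterns while holding the gene loadings fixed.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

# Hypothetical toy data: two hidden regions along a six-position axis.
anterior = [1, 1, 1, 0, 0, 0]
posterior = [0, 0, 0, 1, 1, 1]
# How strongly each of four made-up genes is expressed in each region.
mix = [[2.0, 0.0, 1.0, 0.5],   # anterior contribution per gene
       [0.0, 1.0, 3.0, 0.0]]   # posterior contribution per gene
v = [[anterior[p] * mix[0][g] + posterior[p] * mix[1][g] for g in range(4)]
     for p in range(6)]

w, h = nmf(v, k=2)
# For each position, report which learned pattern contributes most.
dominant = [max(range(2), key=lambda r: row[r]) for row in w]
print("dominant pattern per position:", dominant)
print("reconstruction error:", frob_error(v, w, h))
```

Each column of `w` plays the role of one spatial building block, and each row of `h` records how much each gene draws on that block. Real embryo images would have thousands of positions and genes, and the number of patterns would need to be chosen by some principled criterion rather than fixed at two as it is here.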
The method, described in a recent study, could go far in interrogating human tissue organization and helping elucidate key aspects of development, human health and disease.
The National Institutes of Health, National Science Foundation, Army Research Office, and the Air Force Office of Scientific Research helped support this work.
Lawrence Berkeley National Laboratory addresses the world’s most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab’s scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the U.S. Department of Energy’s Office of Science. For more, visit www.lbl.gov.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit the Office of Science website at science.energy.gov/.