Artistic interpretation of how microbial genome sequences from the GEM catalog can help fill in gaps of knowledge about the microbes that play key roles in the Earth’s microbiomes. (Credit: Zosia Rostomian/Berkeley Lab)

Despite advances in sequencing technologies and computational methods in the past decade, researchers have uncovered genomes for just a small fraction of Earth’s microbial diversity. Because most microbes cannot be cultivated under laboratory conditions, their genomes can’t be sequenced using traditional approaches. Identifying and characterizing the planet’s microbial diversity is key to understanding the roles of microorganisms in regulating nutrient cycles, as well as gaining insights into potential applications they may have in a wide range of research fields.

Now, thanks to a massive project involving more than 200 scientists from the DOE Joint Genome Institute and DOE Systems Biology Knowledgebase (KBase), a repository of 52,515 microbial draft genomes generated from environmental samples around the world has been made public. This new resource, known as the Genomes from Earth’s Microbiomes (GEM) catalog, expands the known diversity of bacteria and archaea by 44%.

“Using a technique called metagenome binning, we were able to reconstruct thousands of metagenome-assembled genomes (MAGs) directly from sequenced environmental samples without needing to cultivate the microbes in the lab,” said Stephen Nayfach, a JGI scientist and first author of the study describing the GEM catalog published today in Nature Biotechnology. “What makes this study really stand out from previous efforts is the remarkable environmental diversity of the samples we analyzed,” he added.

Metagenomics is the study of microbial communities (microbiomes) directly in environmental samples, without the need to isolate individual organisms, using various methods for processing, sequencing and analysis.

Much of the data in the catalog had been generated from environmental samples sequenced by the JGI through the Community Science Program and was already available on the JGI’s Integrated Microbial Genomes & Microbiomes (IMG/M) platform. But the team behind GEM wanted to make this data better organized and more accessible to the international microbial community.

The large team worked to sort and label the vast pool of metagenomic data so that users could search the resulting catalog for features of interest – such as the presence of genes needed to produce interesting compounds – and predict how these unculturable microbes interact with the environment and other organisms.

“With this dataset I can see where every microbe is found, and how abundant it is. This is a great resource for the community that is going to facilitate many more studies,” said co-author Kostas Konstantinidis, who is already using the catalog in his own research on how microbes respond to climate change.