In recent years, researchers have begun to explore the vast assemblage of microbes on and within the human body. These include protists, archaea, fungi, viruses and vast numbers of bacteria living in symbiotic ecosystems.
Known collectively as the human microbiome, these tiny entities influence an astonishing range of activities, from metabolism to behavior, and play a central role in health and disease. Some 39 trillion non-human microbes flourish on and within us, in a ceaseless, interdependent bustle. Together, they make up over half of the human body’s cells and may possess 500 times as many genes as are found in human cells. Identifying and making sense of this microbial mixture has been a central challenge for researchers.
In a new study, Qiyun Zhu and his colleagues describe a method for probing the microbiome in unprecedented detail. The technique is simpler and easier to use than existing approaches. Using it, the researchers demonstrate an improved ability to pinpoint biologically relevant characteristics, including a subject’s age and sex, from microbiome samples.
The innovative research holds the promise of rapidly advancing investigations into the mysteries of the microbiome. With such knowledge, researchers hope to better understand how these microbes collectively act to safeguard human health and how their dysfunction can lead to a broad range of diseases. In time, drugs and other therapies may even be tailor-made based on a patient’s microbiomic profile.
Professor Zhu is a researcher in the Biodesign Center for Fundamental and Applied Microbiology and ASU’s School of Life Sciences. The research team includes collaborators from the University of California, San Diego, including co-corresponding author Rob Knight, Zhu’s former mentor.
The group’s research results appear in the current issue of the journal mSystems.
Tools of the trade
Two powerful technologies have been used to help researchers unlock the diversity and complexity of the microbiome, by sequencing the microbial DNA present in a sample. These are known as 16S and metagenomic sequencing. The technique described in the current study draws on the strengths of both methods to create a new way of processing data from the microbiome.
“We borrow some of the wisdom that developed from 16S RNA sequencing and apply it to metagenomics,” Zhu says. Unlike other sequencing methods, including 16S, metagenomics allows researchers to sequence all the DNA information present in a microbiome sample. But the new study shows that the metagenomic approach has room for improvement. “The way people currently analyze metagenomic data is limited, because whole genome data has to first be translated into taxonomy.”
The new technique, known as Operational Genomic Units (OGU), does away with the laborious and sometimes misleading practice of assigning taxonomic categories like genus and species to the multitude of microbes present in a sample. Instead, the method uses individual genomes as the basic units for statistical analysis and simply attempts to align sequences present in a sample to sequences found in existing genomic databases.
By doing this, researchers can get much more fine-grained resolution, which is particularly useful when a sample contains microbes that are closely related in DNA sequence. Most taxonomic classifications are based on sequence similarity: if two sequences differ by less than a certain threshold, they fall into the same taxonomic category. The OGU approach, however, can help researchers tell them apart.
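The difference between counting by genome and counting by taxonomic category can be illustrated with a minimal Python sketch. This is not the study's software. The genome IDs, the genus lookup and the read assignments are invented for illustration, and the sketch assumes each read aligns to exactly one reference genome.

```python
from collections import Counter

# Hypothetical alignment results: each read is assumed to map to exactly
# one reference genome ID (all IDs here are made up for illustration).
read_to_genome = {
    "read1": "G000001",
    "read2": "G000001",
    "read3": "G000002",
    "read4": "G000003",
    "read5": "G000002",
}

# Hypothetical taxonomy lookup, needed only for the conventional approach.
genome_to_genus = {
    "G000001": "Bacteroides",
    "G000002": "Bacteroides",
    "G000003": "Prevotella",
}

def ogu_table(read_to_genome):
    """Count reads per reference genome; each genome is one unit."""
    return Counter(read_to_genome.values())

def genus_table(read_to_genome, genome_to_genus):
    """Count reads per genus, collapsing genomes that share a genus."""
    return Counter(genome_to_genus[g] for g in read_to_genome.values())

ogu_counts = ogu_table(read_to_genome)
genus_counts = genus_table(read_to_genome, genome_to_genus)

print(dict(ogu_counts))    # three units: G000001, G000002, G000003
print(dict(genus_counts))  # two units: Bacteroides, Prevotella
```

In this toy example the two Bacteroides genomes stay separate in the genome-level table but merge into a single row once taxonomy is applied, which is the loss of resolution described above.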
Further, the method overcomes errors in taxonomy that persist as relics from the pre-sequencing epoch, when different species were defined by their morphology rather than by DNA sequence data.
In addition to improvements in resolution and simplicity, OGU can help researchers analyze data using what are known as phylogenetic trees. As the name implies, these are branching structures that can describe the degree of relatedness between organisms, based on their sequence similarity. Just as two distantly related species like worms and antelope will appear on more distant branches of a phylogenetic tree, so will more distantly related bacteria and other constituents of the microbiome.
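One common way to put a number on relatedness in such a tree is to sum the branch lengths along the path connecting two tips. The Python sketch below shows the idea on a tiny tree. The tree shape, the tip names and the branch lengths are all hypothetical, and the study's own phylogeny-aware analyses are not reproduced here.

```python
# Hypothetical rooted tree stored as child -> (parent, branch_length).
# Tips A and B are close relatives; tip C sits on a distant branch.
tree = {
    "A": ("n1", 0.1),
    "B": ("n1", 0.2),
    "n1": ("root", 0.3),
    "C": ("root", 0.9),
}

def path_to_root(node, tree):
    """Map the node and each of its ancestors to its distance from the node."""
    distance = 0.0
    path = {node: 0.0}
    while node in tree:
        parent, length = tree[node]
        distance += length
        path[parent] = distance
        node = parent
    return path

def patristic_distance(a, b, tree):
    """Sum of branch lengths on the path between two nodes."""
    path_a = path_to_root(a, tree)
    path_b = path_to_root(b, tree)
    # Among shared ancestors, the most recent one gives the shortest path.
    return min(path_a[n] + path_b[n] for n in path_a if n in path_b)

print(patristic_distance("A", "B", tree))  # short path: close relatives
print(patristic_distance("A", "C", tree))  # long path: distant relatives
```

Because each genome in an OGU table corresponds to a tip of a reference tree, distances of this kind are available directly, without passing through taxonomic labels.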
Innovations in sequencing
The most widely used technique for probing the microbiome, known as 16S ribosomal RNA sequencing or just 16S, relies on a simple idea. All bacteria have a 16S gene, which is essential to the machinery bacteria need to initiate protein synthesis. The bacterial 16S gene, roughly 1,500 base pairs in length, consists of distinct regions. Some of these regions change very little between different bacteria and over evolutionary timeframes, while others are highly variable.
Researchers realized that the conserved and variable regions of the 16S gene allow it to act as a molecular clock, keeping track of bacteria that are more closely or more distantly related, based on their sequence similarity. Thus, the 8 conserved and 9 variable regions of 16S can be used to fingerprint bacteria.
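The distinction between conserved and variable positions can be sketched in a few lines of Python. The aligned sequences below are short invented strings, not real 16S data, and the scoring rule (the share of sequences carrying the most common base at each position) is just one simple choice among many.

```python
# Toy aligned sequences (hypothetical, far shorter than a real 16S gene).
aligned = [
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGTGCGAAC",
]

def column_conservation(seqs):
    """Fraction of sequences sharing the most common base at each column."""
    n = len(seqs)
    scores = []
    for column in zip(*seqs):
        most_common = max(column.count(base) for base in set(column))
        scores.append(most_common / n)
    return scores

scores = column_conservation(aligned)

# Columns where at least one sequence differs count as variable here.
variable_positions = [i for i, s in enumerate(scores) if s < 1.0]
print(variable_positions)  # [4, 7]
```

Positions that score 1.0 are identical in every sequence and say nothing about which bacterium is which, while the variable positions carry the signal used to tell them apart.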
To do this, a microbiome sample is first collected. This could be a fecal sample, to evaluate the gut microbiome, or a sample from the skin or from the mouth. Each body site is home to a different bacterial menagerie.
Next, PCR technology is used to amplify portions of the 16S gene. Sequencing highly conserved regions allows a broad swath of bacteria to be identified, while sequencing variable regions helps narrow down the identity of particular bacteria.
Although 16S is an inexpensive and well-developed method, it has limitations. The technique can only give a general idea of the kinds of bacteria present, with limited resolution. In general, 16S is only accurate to the genus level of identification.
Enter metagenomic sequencing. This technique sequences the full genomes of all microbes present in a microbiome sample (not just bacteria, as with 16S). Metagenomics allows researchers to sequence thousands of organisms in parallel, providing accurate, species-level resolution. The greater resolution, however, does come with costs. Metagenomic data is far richer and more computationally challenging to analyze than 16S data, and it is more expensive in time and money to process.
A new path for metagenomics
The OGU technique streamlines the analysis of metagenomic sequencing data, while providing even greater resolution. The approach classifies microbes in a sample strictly according to their alignment with a reference database, with no taxonomic assignment required. The approach enables researchers to evaluate the degree of species diversity present in a sample.
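As an illustration of how diversity can be read off a table of per-genome counts, the Python sketch below computes the Shannon index, a widely used general-purpose diversity measure. It is chosen here only as an example and is not necessarily the metric used in the study. The counts are invented.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over units with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical per-genome read counts for two samples of four units each.
even = shannon_index([25, 25, 25, 25])  # reads spread evenly across units
skewed = shannon_index([97, 1, 1, 1])   # one unit dominates the sample

print(even, skewed)
```

The evenly spread sample scores higher than the sample dominated by a single unit, even though both contain the same number of units.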
Compared with 16S and standard metagenomic sequencing, the new approach is superior in ferreting out biologically relevant information. Using the classic Human Microbiome Project dataset of 210 metagenomes sampled from seven body sites of male and female human subjects, the study demonstrates that microbiome profiles correlate more closely with body site and host sex.
Next, 6,430 stool samples from FINRISK, a large, randomly sampled cohort of the Finnish population, were analyzed using both 16S and metagenomic sequencing. The aim was to predict the age of sampled individuals, based on gut microbial composition. Again, the OGU method outperformed 16S and conventional metagenomic analysis, providing more accurate predictions.
New research drawing on still larger datasets will further enhance the resolution of the new technique and expand the descriptive power of taxonomy-independent analysis.
- Qiyun Zhu, Shi Huang, Antonio Gonzalez, Imran McGrath, Daniel McDonald, Niina Haiminen, George Armstrong, Yoshiki Vázquez-Baeza, Julian Yu, Justin Kuczynski, Gregory D. Sepich-Poore, Austin D. Swafford, Promi Das, Justin P. Shaffer, Franck Lejzerowicz, Pedro Belda-Ferre, Aki S. Havulinna, Guillaume Méric, Teemu Niiranen, Leo Lahti, Veikko Salomaa, Ho-Cheol Kim, Mohit Jain, Michael Inouye, Jack A. Gilbert, Rob Knight. Phylogeny-Aware Analysis of Metagenome Community Ecology Based on Matched Reference Genomes while Bypassing Taxonomy. mSystems, 2022; DOI: 10.1128/msystems.00167-22