The Genomictraits pipeline was developed to map 16S rRNA data to already-sequenced genomes and to create so-called pseudo-metagenomes by using their nucleotide sequences. In addition, genomic traits (here, PFAM domains) can be compared between different samples provided by the user.
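As a rough illustration of the idea described above, the following minimal Python sketch builds a PFAM profile for a sample by summing the domain counts of the reference genomes its 16S reads map to, weighted by read counts. It is not the pipeline's actual code: the function names, the input dictionaries, the OTU and genome labels, and the read-count weighting are all assumptions made for this example, and the PFAM accessions serve only as placeholder identifiers.

```python
from collections import Counter

# Hypothetical inputs (not produced by the real pipeline):
# which reference genome each 16S OTU was mapped to ...
otu_to_genome = {"otu1": "genomeA", "otu2": "genomeB", "otu3": "genomeA"}
# ... and how many copies of each PFAM domain each genome carries.
genome_pfams = {
    "genomeA": Counter({"PF00001": 2, "PF00005": 1}),
    "genomeB": Counter({"PF00005": 3}),
}


def pseudo_metagenome(otu_counts, otu_to_genome, genome_pfams):
    """Sum PFAM counts of mapped genomes, weighted by OTU read counts."""
    profile = Counter()
    for otu, reads in otu_counts.items():
        genome = otu_to_genome.get(otu)
        if genome is None:
            continue  # assumption: OTUs without a reference genome are skipped
        for pfam, copies in genome_pfams[genome].items():
            profile[pfam] += reads * copies
    return profile


def relative_profile(profile):
    """Convert absolute domain counts into fractions of the sample total."""
    total = sum(profile.values())
    return {pfam: count / total for pfam, count in profile.items()} if total else {}


# Two made-up samples given as OTU read counts.
sample_a = {"otu1": 10, "otu2": 5}
sample_b = {"otu2": 20, "otu3": 1, "otu4": 7}  # otu4 has no mapped genome

profile_a = pseudo_metagenome(sample_a, otu_to_genome, genome_pfams)
profile_b = pseudo_metagenome(sample_b, otu_to_genome, genome_pfams)

# Side-by-side relative abundance of each domain in the two samples.
rel_a, rel_b = relative_profile(profile_a), relative_profile(profile_b)
for pfam in sorted(set(rel_a) | set(rel_b)):
    print(pfam, round(rel_a.get(pfam, 0.0), 3), round(rel_b.get(pfam, 0.0), 3))
```

Deciding whether a difference between two such profiles is significant needs a proper statistical test, which this sketch deliberately leaves out.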
Molecular sequencing techniques help to understand microbial biodiversity with regard to species richness, assembly structure and function. In this context, available methods are barcoding, metabarcoding, genomics and metagenomics. The first two are restricted to taxonomic assignments, whilst genomics only addresses the functional capabilities of a single organism. Metagenomics, by contrast, yields information about the organismal and functional diversity of a community. However, it is currently very demanding in terms of labour and cost, and thus not feasible for most laboratories. Here, we show in a proof of concept that computational approaches are able to retain functional information about microbial communities assessed through 16S rDNA (meta)barcoding by referring to reference genomes. We developed an automatic pipeline to show that such integration may infer preliminary or supplementary genomic content of a community. We applied it to two biological datasets and delineated significantly overrepresented protein families between communities.
Keller A, Horn H, Förster F, Schultz J (2014). Computational integration of genomic traits into 16S rDNA microbiota sequencing studies. Gene 549(1): 186–191.