Principles and Workflow of 16S/18S/ITS Amplicon Sequencing
Summary
16S/18S/ITS amplicon sequencing uses next-generation or third-generation sequencing platforms to perform high-throughput sequencing of PCR products from specific regions such as 16S rDNA, 18S rDNA, ITS, or functional genes. It overcomes the limitation that some microorganisms are difficult or impossible to culture, and it yields information on microbial community structure, evolutionary relationships, and the correlation between microbes and their environment in environmental samples.
Author Name: Dianna Gellar
What is 16S rDNA/18S rDNA/ITS?
- 16S rDNA: 16S rDNA is the DNA sequence encoding the small-subunit rRNA of prokaryotes, with a length of about 1542 bp. Because of its moderate size and low mutation rate, it is the most commonly used marker in bacterial systematics. The sequence consists of 9 variable regions and 10 conserved regions: the conserved regions reflect the genetic relationships between species, while the variable regions reflect the differences between species. 16S rDNA sequencing is mainly used to analyze the diversity of bacteria and archaea.
- 18S rDNA: 18S rDNA is the DNA sequence encoding the small-subunit rRNA of eukaryotic ribosomes. Like 16S rDNA, it consists of conserved regions and variable regions (V1-V9, with V6 absent). Among the variable regions, V4 has the most complete database coverage and the best classification performance, which makes it the most widely used choice for 18S rRNA gene analysis and annotation. 18S rDNA sequencing reflects the species differences among the eukaryotic organisms in a given sample.
- ITS: The ITS (Internal Transcribed Spacer) is a non-coding part of the fungal rRNA gene cluster: it is transcribed but removed during rRNA maturation. The ITS sequences used for fungal identification usually include ITS1 and ITS2. In fungi, the 5.8S, 18S, and 28S rRNA genes are highly conserved, whereas the ITS is under less selective pressure, tolerates more mutations over evolutionary time, and shows extremely wide sequence polymorphism in most eukaryotes. At the same time, ITS sequences are relatively consistent within a species, while the differences between species (or even strains) are obvious. The ITS1 and ITS2 fragments are small (about 350 bp and 400 bp in length, respectively) and easy to analyze, and they have been widely used in phylogenetic analysis of fungi.
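The contrast between conserved and variable regions described above can be illustrated with a minimal sketch that scores each column of a multiple sequence alignment by Shannon entropy. The toy alignment, the function name `column_entropy`, and the 0.5-bit cutoff are illustrative assumptions, not real 16S data or an established threshold. Low entropy marks a conserved position and high entropy a variable one.

```python
import math
from collections import Counter

def column_entropy(alignment):
    """Return the Shannon entropy (in bits) of each column of an alignment.

    `alignment` is a list of equal-length, already aligned sequences.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        # Entropy is 0 when every sequence carries the same base at this column.
        h = sum(-(c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies

# Hypothetical toy alignment: columns 0-2 and 5 are identical in every
# sequence (conserved), columns 3-4 differ between sequences (variable).
toy_alignment = [
    "ACGTAC",
    "ACGTTC",
    "ACGAGC",
    "ACGCCC",
]

entropies = column_entropy(toy_alignment)
for position, h in enumerate(entropies):
    label = "conserved" if h < 0.5 else "variable"  # arbitrary cutoff for the demo
    print(position, round(h, 2), label)
```

In practice, universal primers are placed in low-entropy (conserved) stretches so that they bind across many taxa, while the high-entropy stretch between them carries the signal used to tell taxa apart.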
What is 16S/18S/ITS amplicon sequencing?
16S/18S/ITS amplicon sequencing uses Illumina or PacBio platforms to sequence PCR products amplified with suitable universal primers that target one or several regions of 16S/18S/ITS. By detecting sequence variation and abundance in the target region, it provides information on species classification and abundance, population structure, phylogeny, and community comparison for environmental samples.
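As a rough illustration of how read counts become abundance and community-comparison information, the sketch below converts per-taxon read counts into relative abundances and a Shannon diversity index. The sample names and counts are invented for the example, and the helper names are hypothetical rather than part of any particular pipeline.

```python
import math

def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}

def shannon_index(counts):
    """Shannon diversity (natural log) computed from raw read counts."""
    fractions = relative_abundance(counts).values()
    # Taxa with zero reads contribute nothing and are skipped to avoid log(0).
    return sum(-p * math.log(p) for p in fractions if p > 0)

# Invented read counts for two hypothetical samples.
samples = {
    "soil_A": {"Proteobacteria": 500, "Firmicutes": 300, "Actinobacteria": 200},
    "soil_B": {"Proteobacteria": 950, "Firmicutes": 50, "Actinobacteria": 0},
}

for name, counts in samples.items():
    print(name, relative_abundance(counts), round(shannon_index(counts), 3))
```

Here the more even sample (`soil_A`) gets a higher Shannon index than the sample dominated by a single taxon (`soil_B`), which is the kind of between-sample comparison the paragraph refers to.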
How is 16S/18S/ITS amplicon sequencing conducted?
In short, the main steps of 16S/18S/ITS amplicon sequencing include library construction, sequencing and bioinformatics analysis.
- Library Construction: We recommend the fusion-primer method for library construction. Primers that combine the target-specific primer with the adapter, index, and other required sequences are synthesized in advance, and the targets are then amplified directly from genomic DNA by PCR. The amplicon libraries are purified and pooled in equimolar amounts. The dilution required for template preparation is then determined, followed by sequencing.
- Sequencing: Current sequencing platforms mainly include Illumina MiSeq/HiSeq and third-generation platforms.
  - Illumina NGS (MiSeq/HiSeq 2500/HiSeq 4000): Because of read-length limits, NGS platforms can target only one, two, or three variable regions. Only completely sequenced reads (tags) can be used for further analysis, so each amplified region must be paired with the corresponding sequencing strategy. For example, V4 requires PE250 sequencing, whereas V1-V3 requires PE300. Only in this way can the completeness of the sequences be ensured. The raw data are filtered to remove low-quality reads, leaving high-quality clean data for later analysis.
  - PacBio SMRT Sequencing: Unlike NGS, third-generation platforms can sequence the full length of 16S/18S/ITS, and their sequence alignment rate and identification accuracy are higher than those of NGS.
- Bioinformatics Analysis: Paired reads are merged into tags according to the overlap between them, tags are clustered into OTUs (operational taxonomic units) at a given similarity threshold, and the OTUs are then annotated by comparing them against reference databases.
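The equimolar pooling mentioned under Library Construction comes down to simple arithmetic. The sketch below assumes the common approximation of 660 g/mol per base pair of double-stranded DNA; the library names, concentrations, and lengths are invented for illustration, and the function names are hypothetical.

```python
def library_molarity_nm(conc_ng_per_ul, mean_length_bp):
    """Approximate molarity (nM) of a dsDNA library.

    Assumes an average mass of 660 g/mol per base pair.
    """
    if conc_ng_per_ul <= 0 or mean_length_bp <= 0:
        raise ValueError("concentration and length must be positive")
    return conc_ng_per_ul * 1e6 / (660.0 * mean_length_bp)

def equimolar_volumes(libraries, fmol_per_library):
    """Volume (uL) of each library that contributes the same number of moles.

    `libraries` maps a name to (concentration in ng/uL, mean length in bp).
    Since 1 nM equals 1 fmol/uL, volume = target fmol / molarity in nM.
    """
    return {
        name: fmol_per_library / library_molarity_nm(conc, length)
        for name, (conc, length) in libraries.items()
    }

# Invented measurements for two hypothetical amplicon libraries.
libraries = {
    "lib_1": (6.6, 500),  # ng/uL, bp
    "lib_2": (3.3, 500),
}

volumes = equimolar_volumes(libraries, fmol_per_library=40)
for name, volume in volumes.items():
    print(name, round(volume, 2), "uL")
```

The more dilute library needs proportionally more volume, so every library ends up equally represented in the pool.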
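The read-merging and OTU-clustering steps above can be sketched in a few lines. This is a deliberately simplified illustration, not how production tools work: it assumes error-free reads, requires an exact match in the overlap, and uses a crude position-wise identity in place of a real alignment. All sequences and function names are made up for the example.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def merge_pair(forward, reverse, min_overlap=10):
    """Merge a read pair into one tag, or return None if they do not overlap.

    Simplification: the overlap must match exactly (no mismatches, no quality scores).
    """
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    # Try the longest candidate overlap first, then shrink.
    for overlap in range(longest, min_overlap - 1, -1):
        if forward[-overlap:] == rc[:overlap]:
            return forward + rc[overlap:]
    return None

def identity(a, b):
    """Crude identity: matching positions divided by the longer length."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))

def cluster_otus(tags, threshold=0.97):
    """Greedy clustering: the most abundant tags become OTU centroids first."""
    otus = {}  # centroid sequence -> number of tags assigned to it
    for seq, count in Counter(tags).most_common():
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid] += count
                break
        else:
            otus[seq] = count
    return otus

# Made-up 40 bp "amplicon" read from both ends with 25 bp reads (10 bp overlap).
amplicon = "ACGTTGCATGCCGATAGCTAGGATCCGTTAACGGCTATCG"
merged = merge_pair(amplicon[:25], reverse_complement(amplicon[15:]))

# With 15 bp reads the two ends never meet, so no complete tag can be built.
unmerged = merge_pair(amplicon[:15], reverse_complement(amplicon[25:]))

variant = "T" + amplicon[1:]  # one mismatch in 40 bp, identity 0.975
otus = cluster_otus([amplicon] * 3 + [variant, "T" * 40])
print(merged == amplicon, unmerged, otus)
```

The second pair shows why the sequencing strategy has to match the amplified region: when the reads are too short to overlap, no complete tag can be reconstructed. Each OTU centroid would then be compared against a reference database to assign its taxonomy.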