5 Read Quality Control
5.1 Alignment of representative sequences
- The MAFFT (Multiple Alignment using Fast Fourier Transform) software is used to align the representative sequences.
- We then run the alignment mask function to remove poorly aligned positions (for example, highly variable or gap-rich columns) from the alignment.
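To illustrate what masking an alignment means, here is a minimal Python sketch that drops columns dominated by gaps. It is a simplified stand-in, not the QIIME 2 implementation: the function name `mask_alignment`, the `max_gap_fraction` parameter, and the toy sequences are all assumptions made for this example, and only gap content is considered.

```python
def mask_alignment(aligned, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    `aligned` maps sequence IDs to equal-length aligned strings.
    Hypothetical helper for illustration only.
    """
    if not aligned:
        return {}
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(aligned)
    n_cols = lengths.pop()

    # Collect the indices of the columns that pass the gap threshold.
    keep = []
    for col in range(n_cols):
        gaps = sum(1 for seq in aligned.values() if seq[col] in "-.")
        if gaps / n_seqs <= max_gap_fraction:
            keep.append(col)

    # Rebuild every sequence from the retained columns only.
    return {name: "".join(seq[i] for i in keep) for name, seq in aligned.items()}


# Toy alignment: the fourth column is a gap in two of three sequences.
toy_alignment = {
    "seq1": "ACG-TA",
    "seq2": "ACG-TA",
    "seq3": "A-GCTA",
}
masked = mask_alignment(toy_alignment, max_gap_fraction=0.5)
```

With this toy input the fourth column is removed, while the second column (one gap out of three sequences) is kept.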
5.2 Quality control and feature table with DADA2
QIIME 2 uses the DADA2 [2] tool for:
- Detecting poor reads in Illumina amplicon sequence data.
- Denoising.
- Filtering chimeric sequences.
- Filtering any phiX reads present in the marker gene sequence data.
- Constructing the feature table.
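As a rough illustration of two of the steps above (quality filtering and feature table construction), the following Python sketch filters reads by expected errors and then counts each unique sequence per sample. It does not reproduce DADA2's error model, denoising, chimera removal, or phiX filtering. The names `expected_errors`, `build_feature_table`, `max_ee`, and the toy reads are assumptions for this example.

```python
from collections import Counter


def expected_errors(phred_scores):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def build_feature_table(reads_by_sample, max_ee=2.0):
    """Count unique sequences per sample after a simple quality filter.

    `reads_by_sample` maps a sample ID to a list of (sequence, phred_scores)
    pairs. Reads whose expected errors exceed `max_ee` are discarded.
    Hypothetical helper for illustration only.
    """
    table = {}
    for sample, reads in reads_by_sample.items():
        counts = Counter(
            seq for seq, quals in reads if expected_errors(quals) <= max_ee
        )
        table[sample] = dict(counts)
    return table


# Toy input: the third read in sampleA has very low quality scores.
toy_reads = {
    "sampleA": [
        ("ACGT", [40, 40, 40, 40]),
        ("ACGT", [38, 39, 40, 40]),
        ("ACGA", [2, 2, 2, 2]),
    ],
    "sampleB": [
        ("ACGT", [40, 40, 40, 40]),
        ("TTGA", [35, 35, 35, 35]),
    ],
}
table = build_feature_table(toy_reads, max_ee=2.0)
```

The resulting table has samples as keys and, for each sample, a mapping from sequence (the feature) to its count; the low-quality `ACGA` read is excluded.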
5.3 Using custom SILVA classifier
- SILVA resources
- Taxonomy files
- Below is a simple outline of the steps for constructing a QIIME 2 compatible reference from SILVA.
- Download the relevant taxonomy and sequence files from SILVA.
- Import these files into QIIME 2.
- Prepare a fixed-rank taxonomy file.
- Remove sequences with excessive degenerate bases and homopolymers.
- Remove sequences that are too short and/or too long, with the option to condition the length filtering on taxonomy.
- Dereplicate the sequences and taxonomy.
- Build the classifier.
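The filtering and dereplication steps in the outline above can be sketched in plain Python as follows. This is only a conceptual illustration under stated assumptions, not the QIIME 2 workflow itself: the function names, all threshold values, the sequence IDs, and the taxonomy strings are hypothetical, and the length filter here is global rather than conditioned on taxonomy.

```python
import re

# IUPAC ambiguity codes treated as degenerate bases.
DEGENERATE = set("RYSWKMBDHVN")


def max_homopolymer(seq):
    """Length of the longest run of a single repeated character."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def curate(seqs, taxonomy, max_degenerates=4, max_homopolymer_len=7,
           min_len=5, max_len=50):
    """Filter and dereplicate reference sequences (illustrative thresholds).

    A record is dropped if it has too many degenerate bases, too long a
    homopolymer, or a length outside [min_len, max_len]. Records sharing both
    sequence and taxonomy with an earlier record are treated as duplicates.
    """
    kept = {}
    seen = set()
    for seq_id, seq in seqs.items():
        seq = seq.upper()
        if sum(base in DEGENERATE for base in seq) > max_degenerates:
            continue
        if max_homopolymer(seq) > max_homopolymer_len:
            continue
        if not (min_len <= len(seq) <= max_len):
            continue
        key = (seq, taxonomy[seq_id])
        if key in seen:
            continue  # duplicate sequence with identical taxonomy
        seen.add(key)
        kept[seq_id] = seq
    return kept


# Toy records with made-up IDs and taxonomy labels.
toy_seqs = {
    "r1": "ACGTACGTAC",
    "r2": "ACGTACGTAC",               # duplicate of r1, same taxonomy
    "r3": "ACG" + "N" * 5 + "AC",     # five degenerate bases
    "r4": "ACG" + "A" * 8 + "T",      # homopolymer of length 8
    "r5": "ACG",                      # too short
    "r6": "ACGTACGTAC",               # same sequence as r1, different taxonomy
}
toy_taxonomy = {
    "r1": "d__Bacteria; p__Firmicutes",
    "r2": "d__Bacteria; p__Firmicutes",
    "r3": "d__Bacteria; p__Firmicutes",
    "r4": "d__Bacteria; p__Firmicutes",
    "r5": "d__Bacteria; p__Firmicutes",
    "r6": "d__Bacteria; p__Proteobacteria",
}
curated = curate(toy_seqs, toy_taxonomy)
```

Only `r1` and `r6` survive here. Keeping `r6` reflects a design choice in this sketch: identical sequences are collapsed only when their taxonomy also matches, so conflicting labels are not silently discarded.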