2 Check Raw Reads Statistics

  • Assuming the QC tools are installed, it is time to use them to do the following:
    • Check the quality of the reads using fastqc.
    • Create a summary report of quality metrics using multiqc.
    • Trim poor-quality reads at a user-specified quality cutoff using bbduk.sh.
    • Remove contaminants using bbduk.sh.
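
Before running the steps above, it may help to confirm that each tool is actually on the PATH. The following is a minimal sketch, not part of the original workflow: the function name check_tools is hypothetical, and the tool list simply mirrors the tools named in this chapter (seqkit is included because section 2.1 uses it).

```shell
#!/bin/bash

# Hypothetical helper: report any tool that is not found on the PATH.
# Returns 0 when every tool is available, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "${tool}" >/dev/null 2>&1; then
            echo "MISSING: ${tool}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Tool names taken from this chapter; adjust to your installation.
if check_tools seqkit fastqc multiqc bbduk.sh; then
    echo PROGRESS: All QC tools found.
else
    echo PROGRESS: Install the missing tools before continuing.
fi
```

The check only confirms that a command with that name can be found; it says nothing about the installed version.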

2.1 Statistics of raw reads

#!/bin/bash

echo PROGRESS: Getting stats of the raw reads.

INPUTDIR="resources/reads"
SEQKIT="results/qc/seqkit1"

mkdir -p "${SEQKIT}"
seqkit stat "${INPUTDIR}"/*.fastq.gz >"${SEQKIT}"/seqkit_stats.txt
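
If the input directory contains no files matching *.fastq.gz, the shell leaves the pattern unexpanded and passes it to the tool as a literal file name. A small guard can catch this before any QC step runs. This is a sketch under that assumption; count_fastq is a hypothetical helper and not part of seqkit.

```shell
#!/bin/bash

# Hypothetical helper: print how many *.fastq.gz files a directory contains.
count_fastq() {
    n=0
    for f in "$1"/*.fastq.gz; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        [ -e "${f}" ] || continue
        n=$((n + 1))
    done
    echo "${n}"
}

INPUTDIR="resources/reads"

if [ "$(count_fastq "${INPUTDIR}")" -eq 0 ]; then
    echo "ERROR: no .fastq.gz files found in ${INPUTDIR}" >&2
else
    echo PROGRESS: Found "$(count_fastq "${INPUTDIR}")" read files.
fi
```

In a real pipeline script the error branch would typically also stop the run, for example with exit 1.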

2.2 FastQC - MultiQC on raw reads


#!/bin/bash

echo PROGRESS: FastQC - Getting quality scores of raw reads.

INPUTDIR="resources/reads"
FASTQC="results/qc/fastqc1"
mkdir -p "${FASTQC}"
fastqc "${INPUTDIR}"/*.fastq.gz -o "${FASTQC}"

#!/bin/bash

echo PROGRESS: MultiQC - Getting summary of raw read quality scores.

FASTQC="results/qc/fastqc1"
MULTIQC="results/qc/multiqc1"
mkdir -p "${MULTIQC}"
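# --data-dir takes no argument (it forces creation of the parsed-data directory);
# "${FASTQC}" is the directory MultiQC scans, and --export also saves static plots.
multiqc --force --data-dir --export -o "${MULTIQC}" "${FASTQC}"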