Microbiome analysis with MOTHUR

…BRIEF INTRO IN PROGRESS…


This is a tentative Snakemake workflow that defines the mothur bioinformatics rules as a DAG (directed acyclic graph). A detailed interactive Snakemake HTML report is available here; it displays best on a wide screen.


Getting started with the mothur pipeline

Create a mothur YAML file (mothur48.yml)

name: mothur48
channels:
    - conda-forge
    - bioconda
    - defaults
dependencies:
    - mothur =1.48.0
    - vsearch =2.22.1

Create the mothur environment from the YAML file

conda activate base
conda env create -n mothur48 --file mothur48.yml
conda activate mothur48 
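
After activating the environment, it can be useful to confirm that both tools are visible on the PATH before running the workflow. The following is a minimal sketch, not part of the workflow itself; the helper name `missing_tools` is hypothetical, and it assumes the executables are installed under the names `mothur` and `vsearch`.

```python
import shutil


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust if your installation differs.
    missing = missing_tools(["mothur", "vsearch"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("mothur and vsearch are available.")
```

If either tool is reported as missing, check that the mothur48 environment is the active one.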

Download reference databases

  • Download the Silva alignment reference database.
  • Create the Silva classifier from the Silva alignments.
  • Download the RDP classifier.

bash workflow/scripts/mothurReferences.sh


Overview of Mothur classification methods

Four methods can be used to profile the microbial communities present in a sample. Each is briefly described below:

1. Classify OTUs

  • Operational Taxonomic Units (OTUs) are clusters of similar sequences and are commonly accepted as analytical units in microbial profiling when using 16S rRNA gene markers.
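
To make the idea of "clusters of similar sequences" concrete, here is a toy sketch of greedy seed-based clustering. It is an illustration only and is not how mothur clusters sequences; its own algorithms are more sophisticated. The sketch assumes the sequences are already aligned to equal length, and the function names and the 0.97 identity threshold are illustrative assumptions.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: dict[str, str], threshold: float = 0.97) -> list[list[str]]:
    """Put each sequence in the first OTU whose seed it matches at >= threshold."""
    seeds: list[str] = []        # one representative sequence per OTU
    otus: list[list[str]] = []   # sequence names grouped by OTU
    for name, seq in seqs.items():
        for i, seed in enumerate(seeds):
            if identity(seq, seed) >= threshold:
                otus[i].append(name)
                break
        else:
            # No existing seed is similar enough, so start a new OTU.
            seeds.append(seq)
            otus.append([name])
    return otus
```

With this approach the result depends on the input order, which is one reason production tools use more robust clustering methods.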

2. Classify Phylotypes

  • A phylotype in microbiome research is a DNA sequence, or a group of sequences, sharing more than an arbitrarily chosen level of similarity in a 16S rRNA gene marker.
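
In practice, phylotype-based analysis usually means binning sequences by their taxonomic assignment at a chosen rank. The sketch below illustrates that idea under stated assumptions: taxonomy strings are semicolon-separated with optional confidence scores in parentheses, and the function names and sequence IDs are hypothetical.

```python
import re
from collections import defaultdict


def lineage(taxonomy: str) -> list[str]:
    """Split 'Bacteria(100);Firmicutes(98);' into ['Bacteria', 'Firmicutes']."""
    ranks = [rank for rank in taxonomy.strip().split(";") if rank]
    # Drop a trailing confidence score such as '(98)' from each rank.
    return [re.sub(r"\(\d+\)$", "", rank) for rank in ranks]


def phylotypes(assignments: dict[str, str], depth: int) -> dict[tuple[str, ...], list[str]]:
    """Group sequence IDs by the first `depth` ranks of their lineage."""
    bins: dict[tuple[str, ...], list[str]] = defaultdict(list)
    for seq_id, taxonomy in assignments.items():
        bins[tuple(lineage(taxonomy)[:depth])].append(seq_id)
    return dict(bins)
```

A smaller `depth` gives coarser bins (for example, phylum level), and a larger one gives finer bins.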

3. Classify ASVs

  • Amplicon Sequence Variants (ASVs) in microbiome research are single DNA sequences inferred from a bioinformatics analysis of 16S rRNA marker genes.
  • In practice, an ASV is typically a cluster of sequences that differ from each other by one or two bases.

4. Classify Phylogenies

  • Microbial phylogenies are inferred from gene sequence homologies. Models of mutation are used to determine the most likely evolutionary histories.


Preliminary OTU analysis using Mothur

The preliminary analysis (alpha_beta_diversity rule) is part of the bioinformatics analysis. It includes:

  • Counting reads for each group.
  • Subsampling for downstream analysis.
  • Rarefaction.
  • Computing Alpha diversity metrics.
  • Computing Beta diversity metrics.
  • Getting sample distances.
  • Constructing a sample phylip tree.
  • Generating ordination matrices including PCoA and NMDS.
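
Several of the steps above reduce to simple arithmetic on per-sample OTU counts. The following is a minimal sketch of subsampling, one alpha diversity metric (the natural-log Shannon index) and one beta diversity metric (Bray-Curtis dissimilarity). It is meant to show the calculations, not to reproduce the output of the alpha_beta_diversity rule; the function names and the dictionary-based count format are assumptions.

```python
import math
import random
from collections import Counter


def subsample(counts: dict[str, int], depth: int, seed: int = 0) -> dict[str, int]:
    """Randomly draw `depth` reads without replacement from an OTU count vector."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return dict(Counter(rng.sample(pool, depth)))


def shannon(counts: dict[str, int]) -> float:
    """Shannon index H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def bray_curtis(a: dict[str, int], b: dict[str, int]) -> float:
    """Bray-Curtis dissimilarity: 0 for identical samples, 1 for no shared OTUs."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    diff = sum(abs(a.get(otu, 0) - b.get(otu, 0)) for otu in set(a) | set(b))
    return diff / total
```

Subsampling every sample to the same depth before computing these metrics keeps differences in sequencing effort from driving the comparison.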



Citation

Please consider citing the iMAP article[1] if you find any part of the iMAP practical user guides helpful in your microbiome data analysis.


References

[1]
Buza, T. M., Tonui, T., Stomeo, F., Tiambo, C., Katani, R., Schilling, M., … Kapur, V. (2019). iMAP: An integrated bioinformatics and visualization pipeline for microbiome data analysis. BMC Bioinformatics, 20. https://doi.org/10.1186/S12859-019-2965-4



Appendix

Project main tree

.
├── LICENSE.md
├── README.md
├── config
│   ├── config.yml
│   ├── pbs
│   ├── samples.tsv
│   ├── slurm
│   └── units.tsv
├── current_files.summary
├── dags
│   ├── rulegraph.png
│   └── rulegraph.svg
├── data
│   ├── assembled
│   ├── logs
│   ├── metadata
│   ├── reads
│   └── references
├── files.tsv
├── images
│   ├── 16srrna.png
│   ├── bioinformatics.png
│   ├── bkgd.png
│   ├── imap_part02.svg
│   ├── imap_part03.svg
│   ├── imap_part04.svg
│   ├── imap_part05.svg
│   ├── silvaalign.png
│   ├── smkreport
│   └── sra_config_cache.png
├── imap-bioinformatics-mothur.Rproj
├── index.Rmd
├── library
│   ├── apa.csl
│   ├── export.bib
│   ├── imap.bib
│   └── references.bib
├── mothur.1682117235.logfile
├── mothur.1682120066.logfile
├── mothur.1682120067.logfile
├── mothur.1682124402.logfile
├── mothur.1682124403.logfile
├── mothur.1682128146.logfile
├── mothur.1682128147.logfile
├── mothur.1682128148.logfile
├── mothur.1682136420.logfile
├── mothur.1682136421.logfile
├── mothur.1682139538.logfile
├── mothur_process
│   ├── asv_analysis
│   ├── current_files.summary
│   ├── error_analysis
│   ├── final.count_table
│   ├── final.fasta
│   ├── final.taxonomy
│   ├── final_files.logfile
│   ├── intermediate
│   ├── logs
│   ├── otu_analysis
│   ├── phylogeny_analysis
│   ├── phylotype_analysis
│   ├── test.files
│   ├── test.filter
│   └── test.scrap.contigs.fasta
├── report.html
├── resources
│   ├── metadata
│   ├── references
│   └── test
├── results
│   └── project_tree.txt
├── samples.tsv
├── smk.css
├── styles.css
├── units.tsv
└── workflow
    ├── Snakefile
    ├── envs
    ├── report
    ├── rules
    └── scripts

31 directories, 50 files



Troubleshooting and FAQs

  1. Question
    • Answer
  2. Question
    • Answer