Virome: a primer
“Virome refers to the assemblage of viruses that is often investigated and described by metagenomic sequencing of viral nucleic acids that are found associated with a particular ecosystem, organism or holobiont. The word is frequently used to describe environmental viral shotgun metagenomes.” (Wikipedia)
Reads profiling
Enough viral genomes are now deposited in public repositories that we can profile the reads of a metagenomic sample against them. This approach is fast and, when applied to several datasets, makes the results straightforward to compare.
The main disadvantage is that novel viruses, which might play an important role in our community, will not be detected at all.
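To make the trade-off concrete, here is a minimal toy sketch of reference-based profiling in Python. It is not how Phanta or any production profiler works internally: the sequences, the names `phage_A` and `phage_B`, and the very short k-mer size are all made up for illustration. Each read is assigned to the reference with which it shares the most k-mers, and a read from a virus missing from the database simply ends up unclassified.

```python
from collections import Counter

K = 8  # toy k-mer size, chosen for the short example sequences below


def kmers(seq, k=K):
    """Return the set of all k-mers found in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k=K):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def profile_reads(reads, index, k=K):
    """Count reads per best-matching reference; reads with no hit are 'unclassified'."""
    counts = Counter()
    for read in reads:
        hits = Counter()
        for kmer in kmers(read, k):
            for name in index.get(kmer, ()):
                hits[name] += 1
        if hits:
            counts[hits.most_common(1)[0][0]] += 1
        else:
            counts["unclassified"] += 1
    return counts


# Hypothetical reference "genomes" (made-up sequences, far shorter than real ones)
references = {
    "phage_A": "ATGCGTACGTTAGCCGATAGGCTTAACGGATCCGTA",
    "phage_B": "TTGACCGGTATCCAGGTTCAAGGCTCTAGAAGTCCA",
}

reads = [
    references["phage_A"][2:22],   # read from a known virus
    references["phage_A"][10:30],  # another read from the same virus
    references["phage_B"][5:25],   # read from a second known virus
    "GGGGGGGGCCCCCCCCAAAAAAAA",    # read from a "novel" virus, absent from the references
]

profile = profile_reads(reads, build_index(references))
print(dict(profile))
# {'phage_A': 2, 'phage_B': 1, 'unclassified': 1}
```

Because every sample is scored against the same reference set, the resulting counts can be compared directly across datasets, but the novel read contributes nothing beyond the unclassified total.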
De novo virus mining
A different approach is to assemble our reads and then try to identify which contigs could be complete or partial viral genomes. This approach can detect novel viruses, but it is prone to false positives, so the predictions require some refinement.
In addition, viruses lack universal marker genes, which complicates taxonomic assignment. We will see how to use vConTACT2 to investigate how the detected viral sequences relate to known genomes.
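Without a shared marker gene, relatedness is usually inferred from the genes that genomes have in common. The sketch below illustrates that general idea only and is not vConTACT2's algorithm, which relies on its own significance scoring and clustering. The genome names, the protein cluster labels (`PC1`, `PC2`, ...) and the `min_similarity` threshold are hypothetical, and plain Jaccard similarity stands in for the real scoring.

```python
from itertools import combinations


def jaccard(a, b):
    """Fraction of protein clusters shared between two genomes (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def gene_sharing_edges(genomes, min_similarity=0.3):
    """Return (genome1, genome2, similarity) for every pair above the threshold."""
    edges = []
    for (name1, pcs1), (name2, pcs2) in combinations(sorted(genomes.items()), 2):
        sim = jaccard(pcs1, pcs2)
        if sim >= min_similarity:
            edges.append((name1, name2, round(sim, 2)))
    return edges


# Hypothetical data: each genome is described by the protein clusters (PCs) it encodes
genomes = {
    "reference_phage_1": {"PC1", "PC2", "PC3", "PC4"},
    "reference_phage_2": {"PC7", "PC8", "PC9"},
    "contig_42": {"PC1", "PC2", "PC3", "PC5"},
    "contig_77": {"PC10", "PC11"},
}

for edge in gene_sharing_edges(genomes):
    print(edge)
# ('contig_42', 'reference_phage_1', 0.6)
```

In this toy network `contig_42` is linked to a known reference, which hints at its taxonomy, while `contig_77` shares nothing with the references and stays unconnected, as a truly novel virus might.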
The programme
- EBAME-22 notes: EBAME-specific notes
- Gathering the reads: downloading and subsampling reads from public repositories (optional)
- Gathering the tools: we will use Miniconda to manage our dependencies
- Read-based profiling: using Phanta to quickly profile the bacterial and viral components of a microbial community
- De novo mining: assembly-based approach, using VirSorter as an example miner
- Viral taxonomy: ab initio taxonomy profiling using vConTACT2
- MetaPhage overview: an introduction to MetaPhage, a reads-to-report pipeline for viral metagenomics