Andrea Telatin, Senior bioinformatician at the Quadram Institute Bioscience, Norwich.

Gathering the (virome) reads

The goal of this section is to obtain a set of reads for testing our programs.

An example dataset can be gathered from the paper by Liang et al. “The stepwise assembly of the neonatal virome is modulated by breastfeeding” (2020).

The reads are available from the NCBI SRA under the accession number PRJNA524703.

From the study, we selected 10 samples (5 from C-section deliveries and 5 from vaginal deliveries, listed as “Spontaneous delivery” below), with the following IDs and partial metadata:

Sample	Feeding_type	Formula_type	Delivery_type	Gender
SRR8653245	Formula	cow-milk	C-Section	Female
SRR8653218	Formula	cow-milk	C-Section	Male
SRR8653221	Formula	cow-milk	C-Section	Male
SRR8653248	Formula	cow-milk	C-Section	Male
SRR8653247	Formula	soy-protein	C-Section	Female
SRR8653084	Formula	cow-milk	Spontaneous delivery	Female
SRR8652914	Formula	cow-milk	Spontaneous delivery	Female
SRR8652969	Formula	cow-milk	Spontaneous delivery	Male
SRR8652861	Formula	cow-milk	Spontaneous delivery	Female
SRR8653090	Formula	cow-milk	Spontaneous delivery	Female
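
As a quick check of the selection, the table can be saved as a tab-separated file and summarised from the command line. This is only a sketch: the file names metadata.tsv and csection.txt are hypothetical, and the heredoc uses `|` as a visible separator that `tr` converts to tabs.

```shell
# Save the table above as a tab-separated file (metadata.tsv is a hypothetical name).
# '|' is used as a visible separator in the heredoc and converted to tabs by tr.
tr '|' '\t' > metadata.tsv <<'EOF'
Sample|Feeding_type|Formula_type|Delivery_type|Gender
SRR8653245|Formula|cow-milk|C-Section|Female
SRR8653218|Formula|cow-milk|C-Section|Male
SRR8653221|Formula|cow-milk|C-Section|Male
SRR8653248|Formula|cow-milk|C-Section|Male
SRR8653247|Formula|soy-protein|C-Section|Female
SRR8653084|Formula|cow-milk|Spontaneous delivery|Female
SRR8652914|Formula|cow-milk|Spontaneous delivery|Female
SRR8652969|Formula|cow-milk|Spontaneous delivery|Male
SRR8652861|Formula|cow-milk|Spontaneous delivery|Female
SRR8653090|Formula|cow-milk|Spontaneous delivery|Female
EOF

# Count the samples per delivery type (column 4), skipping the header line.
awk -F'\t' 'NR > 1 { n[$4]++ } END { for (d in n) print d "\t" n[d] }' metadata.tsv | sort

# Extract the run accessions (column 1) of the C-section samples only.
awk -F'\t' 'NR > 1 && $4 == "C-Section" { print $1 }' metadata.tsv > csection.txt
```

Splitting on tabs rather than on whitespace matters here, because the value “Spontaneous delivery” contains a space.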

All are stool samples from 4-month-old infants.

These 10 samples were re-analysed in our MetaPhage pipeline paper, and we will call them the “full” dataset.

Downloading the reads using Docker

Create a file called list.txt listing the desired SRA run accessions, one per line. For example:

SRR8653245
SRR8653218
SRR8653221
SRR8653084
SRR8652914
SRR8652969
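
One way to create this file from the terminal is a heredoc, sketched here. The format check at the end is an added assumption (run accessions starting with SRR, ERR or DRR followed by digits), not a requirement stated by the pipeline.

```shell
# Write the six example accessions to list.txt, one per line.
cat > list.txt <<'EOF'
SRR8653245
SRR8653218
SRR8653221
SRR8653084
SRR8652914
SRR8652969
EOF

# Sanity check (assumed format, see above): count the lines that do not
# look like a run accession. grep -c exits non-zero when the count is 0,
# hence the "|| true".
bad=$(grep -c -v -E '^[SED]RR[0-9]+$' list.txt || true)
echo "entries: $(wc -l < list.txt | tr -d ' '), malformed: ${bad}"
```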

Note: for the EBAME workshop, the reads are pre-downloaded.

We can then use a Nextflow pipeline to download the reads (and the tools it needs) automatically. If we use Miniconda as the dependency manager, we can run the following command:

nextflow run telatin/getreads -r main \
   --list list.txt -profile conda

If Docker is available, we can replace -profile conda with -profile docker.
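
The choice between the two profiles can be scripted. The sketch below assumes that finding a docker executable on the PATH is a good enough test of availability (it does not check that the Docker daemon is running), and it only prints the resulting command instead of launching it.

```shell
# Pick the Nextflow profile: docker if a docker executable is on the PATH,
# conda otherwise. This is a heuristic, not a full check of the Docker setup.
if command -v docker >/dev/null 2>&1; then
    profile=docker
else
    profile=conda
fi

# Print the command built from the flags used in this section.
# Copy and run it to start the download (needs nextflow and network access).
echo "nextflow run telatin/getreads -r main --list list.txt -profile ${profile}"
```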

The programme

EBAME-22 notes: EBAME-7 specific notes
Gathering the reads: downloading and subsampling reads from public repositories (optional)
Gathering the tools: we will use Miniconda to manage our dependencies
Reads by reads profiling: using Phanta to quickly profile the bacterial and viral components of a microbial community
De novo mining: assembly based approach, using VirSorter as an example miner
Viral taxonomy: ab initio taxonomy profiling using vConTACT2
MetaPhage overview: what is MetaPhage, a reads to report pipeline for viral metagenomics

10 Feb 2022

Microbiome binfies