Andrea Telatin, Senior bioinformatician at the Quadram Institute Bioscience, Norwich.

Tools: using Miniconda and containers

Gathering the tools

We will use Nextflow to orchestrate our pipeline, which relies on the following tools:

  • Fastp, for read filtering
  • Shovill, to assemble the reads
  • Abricate, to identify AMR genes
  • Prokka, to perform a full annotation
  • MultiQC, to generate an HTML report

We can create a Miniconda (see tutorial) environment with the tools we need. If we have specific requirements for one or more tools, we can pin their versions.

As an example, we will request Shovill version 1.1.0 exactly and MultiQC version 1.10 or later.

# Create a new environment called "DenovoPipeline" with the requested tools
mamba create -n DenovoPipeline fastp shovill=1.1.0 abricate prokka "multiqc>=1.10"

To experiment with our tools we will need to activate the environment, with:

conda activate DenovoPipeline
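
Once the environment is active, a quick sanity check is to confirm that every tool is on the `PATH`. The sketch below assumes that each of the five tools accepts a `--version` flag; the helper name `check_tool` is made up for this example.

```shell
# Report the version of a tool, or flag it as missing without aborting the script.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    # Some tools print their version on stderr, so merge both streams
    echo "$1: $("$1" --version 2>&1 | head -n 1)"
  else
    echo "$1: NOT FOUND"
    return 1
  fi
}

missing=0
for tool in fastp shovill abricate prokka multiqc; do
  check_tool "$tool" || missing=$((missing + 1))
done
echo "Missing tools: $missing"
```

If any tool is reported as missing, the environment is probably not active or the installation did not complete.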

Sharing our environment

Note that the `mamba create` command above may produce a different environment if we run it again in two months, because some tools might have been updated in the meantime. This can happen even if we pin all the tool versions: some of their dependencies might be updated!

We can export our current environment as a YAML file that will allow us to recreate it later or on another machine.

# Save the current environment as "denovo.yaml"
conda env export --file denovo.yaml
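
The reverse operation, rebuilding an environment from the exported file, could look like the sketch below. It assumes `denovo.yaml` is in the current directory; the guards are only there so the snippet fails gently when conda or the file is missing.

```shell
# Recreate the environment described in the exported YAML file
ENV_FILE="denovo.yaml"

if ! command -v conda >/dev/null 2>&1; then
  echo "conda not found: install Miniconda first"
elif [ ! -f "$ENV_FILE" ]; then
  echo "$ENV_FILE not found: export the environment first"
else
  # The environment name is taken from the "name:" field of the YAML file
  conda env create --file "$ENV_FILE" || echo "Environment creation failed"
fi
```

If the file has to be used on a different operating system, exporting with `conda env export --no-builds` drops the platform-specific build strings, which can make the environment easier to resolve elsewhere.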

Creating a container from our environment

:movie_camera: Need a refresher video on Docker or on Singularity?

Basing our workflow on conda packages allows us to generate containers with the same dependencies.

We can build either a Docker or Singularity image.

We can either install Miniconda and then create an environment from the YAML file, or start from a Miniconda image that has conda already installed and proceed. Note that if you push your Docker image to a public hub, like Docker Hub, you can create a Singularity image from that.
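
As an illustration of that last point, a Singularity image can be built directly from an image hosted on Docker Hub. The image name `exampleuser/denovo:latest` below is a placeholder, not a real repository; replace it with the one you pushed.

```shell
# Hypothetical Docker Hub image and output file name
DOCKER_IMAGE="exampleuser/denovo:latest"
SIF_FILE="denovo.sif"

if command -v singularity >/dev/null 2>&1; then
  # The docker:// prefix tells Singularity to fetch the layers from a Docker registry
  singularity build "$SIF_FILE" "docker://${DOCKER_IMAGE}" || echo "Build failed: check the image name"
else
  echo "singularity not found; would run: singularity build $SIF_FILE docker://${DOCKER_IMAGE}"
fi
```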

In our repository you will find a directory with a Dockerfile (to build a Docker image; in our example we start from a Miniconda image) and a Singularity definition file (to build a Singularity image, in this case starting from a bare CentOS image).
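
To give an idea of the Docker route, here is a minimal sketch that writes a Dockerfile and builds it. This is not the file from the repository: it assumes the public `continuumio/miniconda3` base image, installs the packages into the `base` environment, and uses a made-up tag (`denovo:latest`).

```shell
# Write a minimal Dockerfile next to denovo.yaml (an illustrative sketch)
cat > Dockerfile <<'EOF'
# Start from an image that already ships conda
FROM continuumio/miniconda3

# Copy the exported environment into the image
COPY denovo.yaml /tmp/denovo.yaml

# Install the packages into the base environment, then clean the caches
RUN conda env update --name base --file /tmp/denovo.yaml && \
    conda clean --all --yes
EOF

# Hypothetical image tag; build only if Docker and the YAML file are available
IMAGE_TAG="denovo:latest"
if command -v docker >/dev/null 2>&1 && [ -f denovo.yaml ]; then
  docker build --tag "$IMAGE_TAG" . || echo "Build failed"
else
  echo "Skipping build; would run: docker build --tag $IMAGE_TAG ."
fi
```

Installing into `base` means the tools are available without activating an environment inside the container, which keeps later pipeline commands simple.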


The programme

  • :one: A de novo assembly pipeline: we will design a simple workflow to assemble and annotate microbial genomes
  • :two: Gathering the tools: we will use Miniconda to gather our required tools, and generate Docker and Singularity containers manually (Nextflow can automate this step, but it’s good to practice manually first)
  • :three: First steps with Nextflow: we will install Nextflow and run a couple of test scripts
  • :four: The de novo pipeline in Nextflow: we will implement our pipeline in Nextflow