Kaiju
Kaiju can be installed, as usual, from Conda. As with Kraken2, pre-built databases are available; for this tutorial we used the nr database, release 2021-02-24 (52 GB).
kaiju -t $DB/kaiju/nodes.dmp -f $DB/kaiju/kaiju_db_nr.fmi -o kaiju.tsv -z 32 -v \
  -i subsampled_R1.fq.gz -j subsampled_R2.fq.gz
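Because the index is large, it can be worth checking that the database files are in place before starting a run. The following is a minimal sketch: `check_kaiju_db` is a hypothetical helper (not part of Kaiju), and it assumes the three file names used in this tutorial (`nodes.dmp`, `names.dmp`, `kaiju_db_nr.fmi`) all sit in the same directory.

```shell
# Hypothetical helper, not shipped with Kaiju: verify that the database
# directory contains the files this tutorial relies on.
check_kaiju_db() {
    db_dir=$1
    missing=0
    for f in nodes.dmp names.dmp kaiju_db_nr.fmi; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$db_dir/$f" ]; then
            echo "missing or empty: $db_dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could be used as a guard, for example `check_kaiju_db "$DB/kaiju" && kaiju ...`, so that the classification only starts when all files are found.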
A typical line of Kaiju’s output looks like:
C RL|S1|R549 55507 259 55507, WP_072934244.1, NTMTAGLVASYIGRITAAWNAENIGTPPIELITRTWFNPNQTTRWAFLPG,
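The columns are tab-separated (the last three are only printed with the -v flag):
- Classified (C) or unclassified (U)
- Read name
- NCBI TaxID assigned to the read
- Length or score of the best match
- Comma-separated list of the TaxIDs of all database sequences with the best match
- Comma-separated list of the accession numbers of those sequences
- Comma-separated list of the matching amino acid fragments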
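Since the first column flags each read as classified or not, a quick tally can be obtained with awk. This is a sketch only: `kaiju_class_counts` is a hypothetical helper name, and it assumes the output file is tab-separated with `C` or `U` in the first column.

```shell
# Hypothetical helper, not part of Kaiju: count classified (C) and
# unclassified (U) reads in a Kaiju output file given as $1.
kaiju_class_counts() {
    awk -F'\t' '
        $1 == "C" { c++ }
        $1 == "U" { u++ }
        END {
            total = c + u
            printf "classified\t%d\nunclassified\t%d\n", c, u
            # Avoid dividing by zero on an empty file
            if (total > 0) printf "percent_classified\t%.2f\n", 100 * c / total
        }
    ' "$1"
}
```

Running `kaiju_class_counts kaiju.tsv` prints three tab-separated lines: the two counts and the percentage of classified reads.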
Generate a report
Kaiju does not generate a report on the fly, but it ships a program, kaiju2table, that produces one (the resulting table can be imported automatically by MultiQC).
# Phylum level
kaiju2table -t /data/db/kaiju/nodes.dmp -n /data/db/kaiju/names.dmp \
-r phylum -o kaiju-phylum.tsv kaiju.tsv
# Species level
kaiju2table -t /data/db/kaiju/nodes.dmp -n /data/db/kaiju/names.dmp \
-r species -o kaiju-species.tsv kaiju.tsv
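When reports are needed at several ranks, the two commands above generalise to a loop. The sketch below wraps it in a function; `kaiju_rank_reports` is a hypothetical name, the output naming pattern `kaiju-<rank>.tsv` simply follows the examples above, and the database directory is assumed to contain both `nodes.dmp` and `names.dmp`.

```shell
# Hypothetical wrapper: run kaiju2table once per taxonomic rank.
# $1 = directory with nodes.dmp and names.dmp, $2 = Kaiju output file
kaiju_rank_reports() {
    db_dir=$1
    input=$2
    for rank in phylum class order family genus species; do
        kaiju2table -t "$db_dir/nodes.dmp" -n "$db_dir/names.dmp" \
            -r "$rank" -o "kaiju-$rank.tsv" "$input" || return 1
    done
}
```

For example, `kaiju_rank_reports /data/db/kaiju kaiju.tsv` would write one table per rank in the current directory and stop at the first failure.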
The output is in TSV format:
file       percent    reads   taxon_id  taxon_name
kaiju.tsv  15.465896  308688  55507     Schwartzia succinivorans
kaiju.tsv  12.757531  254631  1004304   Hydrotalea sandarakina
kaiju.tsv  8.060514   160882  1736532   Massilia sp. Root418
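To pull out the most abundant taxa from such a table, standard Unix tools are enough. The following sketch assumes the five-column, tab-separated layout with a header line shown above; `top_taxa` is a hypothetical helper name.

```shell
# Hypothetical helper: print the N most abundant taxa (percent and name)
# from a kaiju2table TSV. $1 = table file, $2 = number of rows (default 10)
top_taxa() {
    tab=$(printf '\t')
    # Skip the header, sort numerically by the percent column (descending),
    # then keep only the percent and taxon_name columns.
    tail -n +2 "$1" \
        | LC_ALL=C sort -t "$tab" -k2,2nr \
        | head -n "${2:-10}" \
        | cut -f2,5
}
```

For instance, `top_taxa kaiju-species.tsv 5` would list the five taxa with the highest percentage of reads.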
Exporting to Krona
Kaiju also ships a small utility, kaiju2krona, that prepares a tabular file for import into Krona.
To include the unclassified reads in the output, add the -u flag.
# Prepare the Krona input
kaiju2krona -t /data/db/kaiju/nodes.dmp -n /data/db/kaiju/names.dmp \
-i kaiju.tsv -o kaiju.krona -u
# Plot with Krona
ktImportText -o kaiju.out.html kaiju.krona
A complete script
In keeping with the rest of the workshop: