Usage
Tools
BamToCov is inspired by the UNIX philosophy: each tool is designed to perform one very specific task efficiently. Integrating multiple samples or combining tasks can be achieved with scripts, and we provide a set of them to demonstrate the process.
bamtocov will produce a coverage BED from a single BAM file, or a count matrix from a set of alignments and a target (in BED, GTF or GFF format). Used without a target, it is a drop-in replacement for covtobed, except that it discards invalid alignments by default. When a target is provided, it can produce coverage statistics for each region in the target, including across multiple BAM files.
bamtocounts will count the number of reads covering each target region, rather than the nucleotide coverage.
bamcountrefs is a shortcut to count the number of reads per chromosome, with filters on the read flags, length and quality.
covtotarget (legacy) is a utility to create a count table from the output of the original covtobed program.
Quick start
bamtocov alignment.bam > coverage.bed
will produce a coverage BED file from the alignment file.
File formats
BED files
A BED file (.bed) is a tab-delimited text file that defines a feature track. In this context, the fourth column of each line reports the nucleotide coverage of the interval.
The columns are chromosome name, start position (inclusive, zero-based), end position (non-inclusive, zero-based) and coverage. An example is:
seq1 0 9 0
seq1 9 109 5
seq1 109 189 0
seq1 189 200 2
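As an illustration of the half-open, zero-based coordinates described above, the following minimal Python sketch parses the four example lines and derives a few summary values. It is not part of BamToCov: the function names `parse_coverage_bed` and `summarize` are hypothetical, and the parser assumes four whitespace-separated columns with integer coverage.

```python
import io

# The example above, written with tabs as in a real BED file.
EXAMPLE_BED = (
    "seq1\t0\t9\t0\n"
    "seq1\t9\t109\t5\n"
    "seq1\t109\t189\t0\n"
    "seq1\t189\t200\t2\n"
)


def parse_coverage_bed(handle):
    """Yield (chrom, start, end, coverage) tuples from a coverage BED."""
    for line in handle:
        fields = line.split()  # accepts tabs or spaces between columns
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        yield fields[0], int(fields[1]), int(fields[2]), int(fields[3])


def summarize(records):
    """Summarize coverage over all intervals (hypothetical helper)."""
    length = bases = covered = 0
    for _chrom, start, end, cov in records:
        span = end - start  # end is exclusive, so no "+ 1" is needed
        length += span
        bases += span * cov
        if cov > 0:
            covered += span
    mean = bases / length if length else 0.0
    return {"length": length, "bases": bases, "covered": covered, "mean": mean}


summary = summarize(parse_coverage_bed(io.StringIO(EXAMPLE_BED)))
print(summary)
```

Because the end position is exclusive, each interval spans `end - start` positions, and consecutive intervals share a boundary value (9, 109, 189) without overlapping.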
Target statistics
Note: this format is not final.
For each sample, 5 columns are printed:
bam_bases
bam_mean
bam_min
bam_max
bam_length
interval | bam_bases | bam_mean | bam_min | bam_max | bam_length |
---|---|---|---|---|---|
target1_8X | 699 | 3.495 | 1 | 6 | 200 |
target2_0X | 0 | 0.0 | 0 | 0 | 50 |
target3_1X | . | . | . | . | . |
for_rev_10Xa | 100 | 10.0 | 10 | 10 | 10 |
for_rev_10Xb | 100 | 10.0 | 10 | 10 | 10 |
for_rev_10Xc | . | . | . | . | . |
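The rows above can be read programmatically with a small sketch like the one below. It is only an illustration under stated assumptions: it parses the pipe-separated rendering shown here (the real output layout may differ, since the format is not final), it treats a row of `.` values as "no statistics reported", and the function name `parse_stats_row` is hypothetical. In the example rows, `bam_mean` equals `bam_bases / bam_length`; the sketch checks that relation, which is an observation from the example rather than a documented guarantee.

```python
# Rows copied from the example table above.
EXAMPLE_ROWS = [
    "target1_8X | 699 | 3.495 | 1 | 6 | 200 |",
    "target2_0X | 0 | 0.0 | 0 | 0 | 50 |",
    "target3_1X | . | . | . | . | . |",
    "for_rev_10Xa | 100 | 10.0 | 10 | 10 | 10 |",
]


def parse_stats_row(row):
    """Return (interval, stats) where stats is a dict, or None for '.' rows."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    name, values = cells[0], cells[1:6]
    if all(value == "." for value in values):
        return name, None  # assumption: '.' means no value was reported
    bases, mean, minimum, maximum, length = values
    return name, {
        "bases": int(bases),
        "mean": float(mean),
        "min": int(minimum),
        "max": int(maximum),
        "length": int(length),
    }


for row in EXAMPLE_ROWS:
    name, stats = parse_stats_row(row)
    if stats is None:
        print(name, "no statistics")
        continue
    # Relation observed in the example: mean == bases / length.
    consistent = abs(stats["mean"] - stats["bases"] / stats["length"]) < 1e-6
    print(name, stats["mean"], "consistent" if consistent else "inconsistent")
```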