Memory usage
BamToCov is an extremely memory-efficient tool, and the only one among those compared here that does not use a vector as long as the chromosome to store the changes in coverage.
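To illustrate why a chromosome-length vector is not strictly needed, here is a minimal sketch that derives depth runs from a list of intervals by sorting coverage-change events. It is not BamToCov's implementation: the function name `coverage_runs` and the input format ("start end", 0-based, half-open, one interval per line) are assumptions made for this example. Its working set grows with the number of intervals rather than with the chromosome length.

```shell
#!/bin/sh
# Hypothetical sketch, not BamToCov's code: print "start end depth" runs for
# intervals read on stdin, without allocating one cell per base.
coverage_runs() {
  # Turn each interval into two coverage-change events: +1 at start, -1 at end.
  awk '{ print $1, 1; print $2, -1 }' |
    LC_ALL=C sort -n -k1,1 |
    awk '
      NR == 1   { pos = $1 }
      # A new position closes the run that started at the previous one.
      $1 != pos { if (depth > 0) print pos, $1, depth; pos = $1 }
                { depth += $2 }
    '
}
```

For two overlapping intervals, `printf '0 10\n5 15\n' | coverage_runs` prints three runs: `0 5 1`, `5 10 2` and `10 15 1`. Adjacent runs with equal depth are not merged in this sketch.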
Results
Peak memory usage (in kilobytes, the unit reported by ps for the rss column that memusg samples) was measured for alignments against:
- Candida parapsilosis, Illumina dataset
- Candida parapsilosis, Nanopore dataset
- Human exome HG00258, Illumina
- Human targeted sequencing, 16 genes, Illumina
Tool | Fungus, Illumina | Fungus, ONT | HG00258 Exome | Human panel |
---|---|---|---|---|
bamtocov | 2,740 | 4,376 | 5,700 | 2,172 |
covtobed | 4,080 | 5,008 | 6,588 | 4,052 |
mosdepth | 13,952 | 19,140 | 1,983,928 | 6,425,744 |
megadepth | 11,644 | 11,636 | 995,232 | 980,040 |
bedtools | 12,940 | 14,876 | n/a | 1,951,288 |
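The gap between tools is easier to read as a ratio. The sketch below is a small helper, with the hypothetical name `fold_change`, that divides one reading from the table by another after stripping the thousands separators. It is an illustration only and is not part of the BamToCov repository.

```shell
#!/bin/sh
# Ratio of two peak-memory readings, accepting thousands separators.
# $1: reading for the tool of interest; $2: baseline reading.
fold_change() {
  a=$(printf '%s' "$1" | tr -d ',')
  b=$(printf '%s' "$2" | tr -d ',')
  # LC_ALL=C keeps "." as the decimal separator regardless of locale.
  LC_ALL=C awk -v a="$a" -v b="$b" 'BEGIN { printf "%.1f\n", a / b }'
}
```

For example, `fold_change 6,425,744 2,172` compares mosdepth with bamtocov on the human panel and prints `2958.4`.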
For reference, these are the running times measured on panel_01.bam:
Command | Mean (ms) | Min (ms) | Max (ms) | Relative |
---|---|---|---|---|
bamtocov "panel_01.bam" | 358.9 ± 18.3 | 341.6 | 387.9 | 1.00 |
covtobed "panel_01.bam" | 533.2 ± 8.8 | 527.1 | 548.4 | 1.49 ± 0.08 |
megadepth --coverage "panel_01.bam" | 9246.8 ± 1509.7 | 8026.7 | 11072.8 | 25.77 ± 4.41 |
mosdepth prefix "panel_01.bam" | 53499.6 ± 875.1 | 52284.1 | 54548.5 | 149.08 ± 7.97 |
Scripts
memusg
Memory usage was evaluated with memusg by Jaeho Shin, reproduced below:
#!/usr/bin/env bash
# memusg -- Measure memory usage of processes
# Usage: memusg COMMAND [ARGS]...
#
# Author: Jaeho Shin <netj@sparcs.org>
# Created: 2010-08-16
############################################################################
# Copyright 2010 Jaeho Shin. #
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #
# you may not use this file except in compliance with the License. #
# You may obtain a copy of the License at #
# #
# http://www.apache.org/licenses/LICENSE-2.0 #
# #
# Unless required by applicable law or agreed to in writing, software #
# distributed under the License is distributed on an "AS IS" BASIS, #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #
# See the License for the specific language governing permissions and #
# limitations under the License. #
############################################################################
set -um
# check input
[[ $# -gt 0 ]] || { sed -n '2,/^#$/ s/^# //p' <"$0"; exit 1; }
# TODO support more options: peak, footprint, sampling rate, etc.
pgid=$(ps -o pgid= $$)
# make sure we're in a separate process group
if [[ "$pgid" == "$(ps -o pgid= $(ps -o ppid= $$))" ]]; then
    cmd=
    set -- "$0" "$@"
    for a; do cmd+="'${a//"'"/"'\\''"}' "; done
    exec bash -i -c "$cmd"
fi
# detect operating system and prepare measurement
case $(uname) in
    Darwin|*BSD) sizes() { /bin/ps -o rss= -g $1; } ;;
    Linux) sizes() { /bin/ps -o rss= -$1; } ;;
    *) echo "$(uname): unsupported operating system" >&2; exit 2 ;;
esac
# monitor the memory usage in the background.
(
peak=0
while sizes=$(sizes $pgid)
do
    set -- $sizes
    sample=$((${@/#/+}))
    let peak="sample > peak ? sample : peak"
    sleep 0.1
done
echo "memusg: peak=$peak" >&2
) &
monpid=$!
# run the given command
exec "$@"
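The core of the monitor loop is a sum followed by a running maximum: each sample adds up the resident set sizes of every process in the group, and the peak is the largest sum seen. The sketch below reproduces that logic in portable sh on canned input so it can be tried without running a real command. The function names `sum_rss` and `peak_of_samples` are hypothetical; memusg itself performs the sum with the bash expansion `$((${@/#/+}))`.

```shell
#!/bin/sh
# Sum the RSS values of one sample, one argument per process.
sum_rss() {
  total=0
  for kb in "$@"; do
    total=$((total + kb))
  done
  echo "$total"
}

# Read one sample per line (whitespace-separated RSS values) and print the
# largest per-sample sum, mirroring the peak tracking in the monitor loop.
peak_of_samples() {
  peak=0
  while read -r line; do
    # Intentionally unquoted so the line splits into one argument per process.
    sample=$(sum_rss $line)
    if [ "$sample" -gt "$peak" ]; then
      peak=$sample
    fi
  done
  echo "$peak"
}
```

With three samples, `printf '100 200\n400 50\n10\n' | peak_of_samples` prints `450`, the sum of the second sample. Because memusg polls every 0.1 seconds, a spike shorter than the polling interval can be missed.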
Memory.sh
To compare the memory usage of multiple tools on the same BAM file, passed as the first argument:
# Usage: Memory.sh INPUT.bam
echo MegaDepth:
memusg /local/miniconda3/envs/mega/bin/megadepth --coverage "$1" > /dev/null
sleep 3; echo ""
echo BamToCov:
memusg bin/bamtocov "$1" > /dev/null
sleep 3; echo ""
echo CovToBed:
memusg covtobed "$1" > /dev/null
sleep 3; echo ""
echo Mosdepth:
memusg mosdepth /tmp/prefix "$1"
sleep 3; echo ""
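memusg writes its result to standard error as a line of the form `memusg: peak=N`, so the figure can be pulled out of mixed log output and turned into a table row. The helpers below are a sketch with hypothetical names (`parse_peak`, `table_row`) and are not part of the original scripts.

```shell
#!/bin/sh
# Print the number from a "memusg: peak=<n>" line found on stdin;
# any other line, such as a tool's own log messages, is ignored.
parse_peak() {
  sed -n 's/^memusg: peak=\([0-9][0-9]*\)$/\1/p'
}

# Print one pipe-separated row: tool name, then peak memory.
table_row() {
  printf '%s | %s |\n' "$1" "$2"
}
```

One possible use, assuming memusg is on the PATH and behaves the same inside a pipeline as it does interactively (not verified here), is `peak=$(memusg covtobed "$1" 2>&1 >/dev/null | parse_peak)` followed by `table_row covtobed "$peak"`.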