fu-sw
Simple implementation of the Smith-Waterman alignment:
Usage: fu-sw [options] -q QUERY -t TARGET
Options:
-q --query <FILE> File with the sequence(s) to align against target
-t --target <FILE> File with the target sequence(s)
-i --id ID Align only against the sequence named `ID` in the target file
-s --showaln Show graphical alignment
Smith-Waterman options:
--score-match INT Score for a match [default: 10]
--score-mismatch INT Score for a mismatch [default: -5]
--score-gap INT Score for a gap [default: -10]
--min-score INT Minimum alignment score [default: 80]
--pct-id FLOAT Minimum percentage of identity [default: 85]
Other options:
--pool-size INT Number of sequences/pairs to process per thread [default: 20]
-v --verbose Verbose output
-h --help Show this help
Input files
Input files can be in FASTA or FASTQ format, and both query and target can hold multiple sequences even if the common application is to have a single sequence in the target file.
If the target file contains multiple sequences but only one is the intended target, the target can be specified with --id
parameter.
Example output
The output will print the alignment score and coordinates in a single line after QUERY
and TARGET
. If --showaln
is specified, a graphical summary of the local alignment is provided.
# QUERY: not_in_target
## TARGET: ecoli
# QUERY: 16S_1_for_ins
## TARGET: ecoli
Score: 406 (97.18%) Length: 69 Strand: + Query: 0-71 Target: 21-90
GCTCAGATTGAACGCTccGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACAGGAAGCAGCTTGCTGC
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
GCTCAGATTGAACGCT--GGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACAGGAAGCAGCTTGCTGC
# QUERY: 16S_2_rev
## TARGET: ecoli
Score: 312 (100.00%) Length: 52 Strand: - Query: 0-52 Target: 175-227
CGCATAATGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCATCGG
||||||||||||||||||||||||||||||||||||||||||||||||||||
CGCATAATGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCATCGG
Release note
From 1.19.0 the algorithm has been rewritten using only standard libraries, while the initial implementation used the neo library for storing matrices. This resulted in a 2X speedup.