seqfu list

Extract sequences from sequence files using a list of requested items. Introduced in SeqFu 1.8.

Usage: list [options] <LIST> <FASTQ>...

Print sequences that are present in a list file.
Duplicated entries in the list will be ignored

Other options:
  -c, --with-comments    Include comments in the list file
  -p, --partial-match    Allow partial matches (UNSUPPORTED)
  -m, --min-len INT      Skip entries smaller than INT [default: 1]

  -v, --verbose          Verbose output
  -r, --report           Print report of found sequences
  --help                 Show this help
  

Input

The list file is a plain text file of sequence names. Each name may be followed by a comment and may start with a leading > or @ character, which is discarded.

By default, any comments in the list are ignored and matching is done on the sequence name only. Use the --with-comments option to take the comments into account as well.
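The normalization described above can be sketched in Python. This is an illustration only, not SeqFu's actual implementation: the function names normalize_entry and load_list are hypothetical, and the sketch assumes one entry per line with the name ending at the first whitespace character.

```python
def normalize_entry(line, with_comments=False):
    """Return the lookup key for one list line, or None for an empty line."""
    entry = line.strip()
    # A single leading '>' or '@' (FASTA/FASTQ header marker) is discarded.
    if entry[:1] in (">", "@"):
        entry = entry[1:].strip()
    if not entry:
        return None
    if with_comments:
        # Keep the name together with its comment.
        return entry
    # Assumption: the name is everything before the first whitespace.
    return entry.split()[0]


def load_list(lines, with_comments=False):
    """Collect unique keys, keeping first-seen order."""
    seen = {}
    for line in lines:
        key = normalize_entry(line, with_comments)
        if key is not None:
            # A dict preserves insertion order and drops duplicated entries.
            seen.setdefault(key, None)
    return list(seen)
```

For example, load_list([">a first", "a second", "@b"]) yields the two names a and b, because the duplicated name is kept only once when comments are ignored.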

Output

The standard output is in the same format as the input files, either FASTA or FASTQ.

With --report, every entry of the input list is printed together with the number of times it was found, followed by the number of entries found out of the total.

Example report:

# SEQUENCES REPORT
# Sequence 'protein.1c;size=5372' found 1 times
# Sequence 'protein.1d;size=5372' found 1 times
# Sequence 'protein.missing' found 0 times
# Sequence 'protein.1a;size=5372' found 1 times
# Sequence 'protein.1f;size=5372' found 1 times
# Sequence 'protein.notfound' found 0 times
# Sequence 'protein.1b;size=5372' found 1 times
Total sequences found: 5/7
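
A report like the one above can be post-processed to list the entries that were not found. The following Python sketch is hypothetical: the line patterns are inferred only from this example, the names parse_report and missing_entries are made up for illustration, and the report text is assumed to have been captured into a string beforehand.

```python
import re

# Patterns inferred from the example report; the real format may differ.
ENTRY_RE = re.compile(r"^# Sequence '(.*)' found (\d+) times$")
TOTAL_RE = re.compile(r"^Total sequences found: (\d+)/(\d+)$")


def parse_report(text):
    """Return ({name: times_found}, (found, requested)) from report text."""
    counts = {}
    total = None
    for line in text.splitlines():
        line = line.strip()
        match = ENTRY_RE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
            continue
        match = TOTAL_RE.match(line)
        if match:
            total = (int(match.group(1)), int(match.group(2)))
    return counts, total


def missing_entries(counts):
    """Names that were requested but found zero times."""
    return [name for name, times in counts.items() if times == 0]
```

Applied to the example report, missing_entries would return protein.missing and protein.notfound, and the parsed total would be 5 found out of 7 requested.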