Andrea Telatin
Andrea Telatin Senior bioinformatician at the Quadram Institute Bioscience, Norwich.

Bash script safety net

Bash script safety net

We can instruct our scripts to stop when a problem is found with the set -euo pipefail directive.

In the past articles we introduced some small examples of Bash scripts:

  • what they are,
  • how to loop and
  • how to check conditions. The last topic is very important, as it let us make our scripts safer. Unfortunately checking for every possible error is time consuming, and the more “if conditions” we have to write, the more errors we end to add to our scripts.

Here’s why, before introducing further topics for Bash scripting, I’d like to introduce a safety net to prevent some common mistakes.

The following script is an hypothetical pipeline looping through .fastq files, trimming, aligning and finally converting the aligned SAM file to BAM¹.

What can possibly go wrong?

  • First, there is a typo: we pretend to align $IPUT, instead of $INPUT. Since the variable was never declared, Bash will not find any value and so will pass to bwa “trimmed_” as filename. That does not exist.
  • Suppose that we moved the reference. The variable is valid and contains a value, but the bwa program is not able to perform the alignment and will fail. We will not easily spot the error: the script will go on and convert an empty SAM file into BAM…

The two problems are: using uninitiated (thus empty) variables, and not stopping the script execution if a program fails.

##How can we solve these problems?

Adding a single line at the beginning of the script (just after the shebang): set -euo pipefail This instruction² will make the script fail, and exit, when one of the above problems if found. In particular, the switches used are:

  • -e, will cause the script to exit if a single program fails.
  • -o, if you use pipes (we did in the past articles), the -e would only be triggered if the last program of the pipe fails. With -o also the other will be checked.
  • -u, for unbound variables. When this is added, if a variable has not been declared before, the script will fail.
  • -x, this is not put in the example, but can be useful in some cases: it will cause the script to print every command before executing it. Useful when debugging or testing scripts.

Our script has still an unfixed problem: the “*.fastq” will be expanded only if we have at least one .fastq file in our current directory. To solve this we can add this directive:

1
shopt -s failglob

It’s a directive that — similarly to “set -euo pipefail” will be active throughout the whole script, and will make the script fail when a shell expansion fails.

Check if a variable is set

If you use the -u switch, and you didn’t define a variable, checking if it has a value will cause the script to terminate, unless you check it with the following syntax:

Safety net make life harder

You know this from everyday experience: sometimes a safety device (like a lab coat) requires extra time and steps and for small tasks, we believe we can simply skip it. As a beginner, you’ll find most of the safety nets to create more problems than the one they solve and you might be tempted never using them. I perfectly understand this feeling, being myself a beginner in many topics, but it’s important to — at least — being aware of the possible problems that not using a safety net could bring.

Presented at the lecture on Metagenomics Analysis, La Sapienza — Rome, December 2018