set -eu is the lesser of two evils, if you have to write shell at all

2021-07-01

The other day I wrote about shell scripts, and I have since been surprised to learn that some of my recommendations are not considered good advice by experienced shell programmers.

In particular, bash programmers tend to recommend against set -e and set -u.

They also recommend against set -o pipefail, which I agree with because (aside from being a bashism) it would rule out many ordinary pipelines. For example, in a trivial cat file | head -n 1, cat can exit with a nonzero status after being killed by SIGPIPE, which happens when head exits before cat has finished writing.
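To make the pipefail problem concrete, here is a minimal sketch. It assumes bash is installed, and it uses yes in place of cat so that the writer is guaranteed to still be writing when head exits. The variable names plain and strict are only for illustration.

```shell
# Without pipefail, the pipeline reports the status of head, which succeeds.
plain=0
bash -c 'yes | head -n 1 >/dev/null' 2>/dev/null || plain=$?

# With pipefail, the pipeline reports the failure of yes, which is stopped
# as soon as head closes the pipe.
strict=0
bash -c 'set -o pipefail; yes | head -n 1 >/dev/null' 2>/dev/null || strict=$?

echo "without pipefail: $plain"
echo "with pipefail:    $strict"
```

The first status is 0 and the second is nonzero. On most systems I would expect 141 (128 plus the signal number of SIGPIPE), but the exact value depends on how the environment handles SIGPIPE.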

Why do they recommend against set -e and set -u?

The Bash Hackers Wiki says of set -e:

Do not be tempted to think of this as “error handling”; it’s not, it’s just a way to find the place you’ve forgotten to put error handling. … The set -e feature generates more questions and false bug reports on the Bash mailing list than all other features combined!

And of set -u:

Like set -e, it bypasses control flow and exits immediately from the current shell environment. Like non-zero statuses, unset variables are a normal part of most non-trivial shell scripts.

The BashFAQ goes into a lot more detail about what triggers set -e to exit and what doesn’t. The simplest description of set -e is that it ends script execution on a nonzero exit code, but it’s actually more complicated than that, and therein lies the trouble. A nonzero exit code from any command but the last in a pipeline, or from a command tested by an if conditional, does not cause the script to exit – and of course it cannot, as these are intended. There are subtler rules as well.
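A few of the basic rules can be seen in a small sketch. It runs the set -e script in a child sh so that the surrounding shell survives and can report what happened. The names rc and out are only for illustration, and this covers the simple cases, not the subtler ones the BashFAQ describes.

```shell
rc=0
out=$(sh -c '
set -e
if false; then echo "unreachable"; fi  # status tested by if: ignored
false | true                           # only the last command of a pipeline counts
false || echo "handled"                # left side of || is ignored
echo "still running"
false                                  # untested failure: the shell exits here
echo "never printed"
') || rc=$?

printf 'exit status: %s\noutput:\n%s\n' "$rc" "$out"
```

The child shell prints "handled" and "still running", then exits with status 1 at the bare false.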

I agree with their reasoning, but not their conclusions

The crux of our disagreement is highlighted by this sentence from Bash Hackers Wiki:

Like non-zero statuses, unset variables are a normal part of most non-trivial shell scripts.

That’s true, they are. To me, this is not an indictment of set -eu, but an indictment of shell programming.

Their conclusion is that because set -eu prohibits behavior that is part of most non-trivial shell scripts, set -eu should be avoided. My conclusion is that non-trivial shell scripts are what should be avoided, and enforcing limits with set -eu is a way to ensure that my shell scripts do not cross the boundary into non-triviality. This is why I recommend shell scripts be shorter than a few PgDns. It’s even part of why I think it’s useful to conform strictly to POSIX and eschew bashisms and other extensions altogether. When those restrictions start to cause problems for your script, it’s a good time to reconsider whether your logic is best suited to shell in the first place.
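For a sense of what staying inside those limits looks like, here is a rough sketch of a short POSIX script that runs under set -eu and still deals with an optional variable and an expected nonzero status. GREETING_TARGET is a hypothetical variable name invented for this example, and the script is an illustration, not a template.

```shell
set -eu

# Start from a known state for the demonstration.
unset GREETING_TARGET

# Optional input: give it an explicit default rather than relying on
# an unset variable expanding to nothing.
target=${GREETING_TARGET:-world}

# Expected nonzero status: grep exits 1 when nothing matches, so test it
# in an if instead of letting set -e see an untested failure.
if printf '%s\n' "$target" | grep -q '^w'; then
    kind="starts with w"
else
    kind="starts with something else"
fi

message="hello, $target ($kind)"
echo "$message"
```

When a script needs much more than this kind of explicit default and explicit test to keep set -eu happy, that is the signal I am talking about.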

I suppose it isn’t a surprise that the Bash Hackers Wiki and the authors of the BashFAQ are content writing “non-trivial” shell scripts. In my opinion, you should avoid them anyway.

Also, set -e and set -u are the lesser of two evils

Even if the programming language is constrained to shell, I think using set -eu is better than not using it. While I agree that both options have significant limitations, I use them anyway, even when inconvenient, because I think the alternative is worse.

The classic case is a misspelled variable containing a directory, and rm against a glob of that directory. E.g.:

dir=/tmp/whatever
rm -rf $dri/*

With set -u, it will exit immediately; without set -u, it will attempt to rm -rf your root filesystem. There’s a similar trivial case for set -e:

cd /somewhere/you/think/exists
rm -rf *

The set -e behavior of having the script stop execution immediately if cd fails is clearly preferable to trying to push onward with a completely different context than the script was written to expect.
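Both cases can be tried safely with a sketch like the following. It works inside a scratch directory created by mktemp, uses echo as a stand-in for the first rm, and runs each variant in a child sh so the results can be compared. The directory and file names (lax, strict, precious, missing) are made up for the demonstration.

```shell
set -eu

# Scratch area so nothing real is touched.
scratch=$(mktemp -d)
mkdir "$scratch/lax" "$scratch/strict"
touch "$scratch/lax/precious" "$scratch/strict/precious"

# Case 1: misspelled variable. echo stands in for rm -rf, and the double
# quotes keep the glob from expanding.
lax_cmd=$(sh -c 'dir=$1; echo "rm -rf $dri/*"' sh "$scratch")
strict_rc=0
strict_cmd=$(sh -c 'set -u; dir=$1; echo "rm -rf $dri/*"' sh "$scratch" 2>/dev/null) || strict_rc=$?

# Case 2: cd fails. Without set -e the rm runs in whatever directory the
# shell happens to be in; with set -e the shell stops at the failed cd.
( cd "$scratch/lax" && sh -c 'cd ./missing 2>/dev/null; rm -rf ./*' )
cd_rc=0
( cd "$scratch/strict" && sh -c 'set -e; cd ./missing 2>/dev/null; rm -rf ./*' ) || cd_rc=$?

lax_survived=no
if [ -e "$scratch/lax/precious" ]; then lax_survived=yes; fi
strict_survived=no
if [ -e "$scratch/strict/precious" ]; then strict_survived=yes; fi

rm -rf "$scratch"

echo "without set -u, would run: $lax_cmd"
echo "with set -u, exit status:  $strict_rc"
echo "file survives without set -e: $lax_survived"
echo "file survives with set -e:    $strict_survived"
```

Without set -u the command that would have run is rm -rf /*, and without set -e the file in the wrong directory is deleted. With the options set, the child shell exits with a nonzero status before either happens.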

More fundamentally, my contention is this: if I have not handled an error properly, I almost always want the script to stop executing immediately so that it doesn’t fuck anything else up. I’ve written scripts dozens of lines long without set -e and when an early command fails unexpectedly, the rest of the script just barrels forward, sometimes destructively.

The explicit recommendation from the Bash Hackers Wiki page is “proper control flow and error handling”, but this advice is both dismissive and not actionable. I use set -eu to prevent my script from continuing to execute when it gets into an unexpected state. Their advice amounts to: “If I were writing a shell script, I would simply not let my script get into an unexpected state”. I mean sure, ok. For the rest of us, stopping execution is the only sane thing to do in that circumstance.