dash/ash

Posted Oct 1, 2014 20:58 UTC (Wed) by chutzpah (subscriber, #39595)
Parent article: Bash gets shellshocked

The thing about moving to dash or ash for shell scripts is that some of the "bashisms" are truly useful features. Arrays, [[ and some of the extended variable expansions genuinely earn their keep when writing shell scripts. Scripts that stick to pure-POSIX constructs tend to be verbose and full of quoting bugs just waiting to manifest themselves ([[ does wonders in making quoting problems go away). In general, Linux does not limit itself to what is in POSIX; there are many examples of extending it in useful ways. Why does that seem to no longer apply to shells?
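
A small illustration of the quoting point (a sketch, assuming bash; the variable is made up):

var="two words"

# POSIX test: forgetting the quotes breaks, because $var is word-split
[ $var = "two words" ]        # error: [: too many arguments
[ "$var" = "two words" ]      # correct, but the quotes are mandatory

# bash's [[ ]] does not word-split or glob-expand unquoted expansions
[[ $var == "two words" ]]     # works even without quoting $var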

Some minor investigation finds that mksh appears to be an alternative that supports a reasonable number of these sorts of features while still being relatively lightweight.



dash/ash

Posted Oct 1, 2014 21:04 UTC (Wed) by josh (subscriber, #17465) [Link] (2 responses)

Personally, I'd argue that if your shell script has gotten sufficiently complicated to warrant making it bash-specific, you should write it in something other than shell.

Bash can be a big win for many scripts

Posted Oct 2, 2014 13:51 UTC (Thu) by CChittleborough (subscriber, #60775) [Link]

I find bash nearly ideal for relatively simple scripts. OTOH, I've written a few scripts in bash that I later rewrote in Perl when adding functionality. (In my experience, any script that takes more than about 100 lines of Bash code probably should be rewritten into Perl or Python or something, as should any Bash script in which you get bugs or headaches from string substitutions.)

Useful bashisms

Posted Oct 2, 2014 14:52 UTC (Thu) by gwolf (subscriber, #14632) [Link]

Some bashisms are just too useful to dismiss. Off the top of my head, I often thank bash for being able to diff the output of two running programs without having to juggle them through named pipes:

$ diff -u <(cmd1) <(cmd2)

is way easier and clearer than

$ F1=`tempfile`
$ F2=`tempfile`
$ cmd1 > "$F1" &
$ cmd2 > "$F2" &
$ wait
$ diff -u "$F1" "$F2"
$ rm "$F1" "$F2"

Of course, it makes sense to include this bashism in your scripts. And, of course, that'd make your scripts depend on bash.

dash/ash

Posted Oct 1, 2014 22:13 UTC (Wed) by wahern (subscriber, #37304) [Link] (23 responses)

There are various ways to handle lists of things in the POSIX syntax. One lesser-known way is by using the "set -- [...]" construct, which resets your local argument vector. A more common method is using read(1).
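
A minimal sketch of both constructs in plain POSIX sh (the file names and items are hypothetical):

# rebuild the positional parameters as an ad-hoc list
set -- /etc/passwd /etc/group "a file with spaces"
for f in "$@"; do
    printf 'item: %s\n' "$f"
done

# process input line by line with read
while IFS= read -r line; do
    printf 'line: %s\n' "$line"
done < input.txt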

Many times people use Bash-specific features simply because they don't understand or care about POSIX syntax. I'd love to see a show of hands from shell programmers who have ever read the POSIX documentation for the shell. See http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html. The framed version is much easier to read: http://pubs.opengroup.org/onlinepubs/9699919799/

But as josh said, if you're writing software that really requires juggling lots of data, why keep the shell? Why not just switch to Perl, for example? I'd be surprised if Perl wasn't installed by default on at least as many vendor platforms as Bash. Unless your goal is sheer portability or simplicity, using the shell doesn't make any sense.

Bash in particular is hardly more light-weight than Perl. The Bash and Perl interpreters are comparable in size. Compare them to dash.

$ ls -sh /bin/bash
1000K /bin/bash

$ ls -sh /usr/bin/perl /usr/lib/libperl.so.5.18.2 
 12K /usr/bin/perl  1.6M /usr/lib/libperl.so.5.18.2

$ ls -sh /bin/dash 
120K /bin/dash

And, no, bash is not statically linked in the above. Assuming all the developers of those projects are equally good programmers, how many bugs do you think lurk in bash compared to dash? Are arrays really worth all that code surface? Perl might be bigger, but I'd bet Perl has fewer bugs just because the language grammar is simpler and more regular--go figure. And, yes, using the shell is fraught with quoting problems. Historically that was not only because the shell syntax mixes code and data in subtle ways, but because the shells themselves often had parsers which caused unexpected behavior (sometimes a bug, sometimes a "feature"). Using Bash features may address the former to some small extent, but it certainly doesn't address the latter.

dash/ash

Posted Oct 1, 2014 22:36 UTC (Wed) by Jandar (subscriber, #85683) [Link] (5 responses)

> One lesser-known way is by using the "set -- [...]" construct, which resets your local argument vector.

One global array is hardly a replacement for bash arrays. By the way, the set builtin is well known.

> A more common method is using read(1).

Do you mean managing arrays in multiple tempfiles and reading (plus writing, with ugly escapes) on every use? Appalling.

The omission of arrays from the POSIX shell is the major reason for me to ignore it and use bash.

dash/ash

Posted Oct 2, 2014 14:23 UTC (Thu) by nix (subscriber, #2304) [Link] (3 responses)

There are things you can do with arrays stored as files that you can't do in any other way -- e.g. high-speed filtration and set-membership queries with comm(1) and grep. Since the primary priority when making a shell script fast enough to be useful is converting all loops into pipelines, and comm(1) is invaluable in that, I'm not sure why you'd ever want to use arrays in any other form, really.
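
A sketch of the files-as-arrays approach (the file names are hypothetical; comm(1) needs sorted input):

sort -u wanted.txt  > wanted.sorted
sort -u present.txt > present.sorted

comm -23 wanted.sorted present.sorted   # set difference: wanted but not present
comm -12 wanted.sorted present.sorted   # set intersection: in both files

# membership test for a single item, without a shell loop
grep -qxF "some-item" present.sorted && echo "some-item is present"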

dash/ash

Posted Oct 3, 2014 9:48 UTC (Fri) by Jandar (subscriber, #85683) [Link] (2 responses)

With arrays the construction of an argument-vector is easy.

typeset -a Options Files
Options+=("$option1")
Files+=("$file1")
$UseVerbose && Options+=("-v")
$UseSecondFile && Files+=("$file2")
command "${Options[@]}" -- "${Files[@]}"

How do you prepare (with correct quoting) a dynamic argument-vector without arrays? All other methods are ugly and error-prone beyond any acceptable limit.
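
For reference, the closest POSIX-only approach is to rebuild the positional parameters with set --. A sketch, reusing the same hypothetical true/false variables as above; note that it clobbers "$@" and gives you only one list per scope:

set -- "$option1"
$UseVerbose    && set -- "$@" -v
set -- "$@" -- "$file1"
$UseSecondFile && set -- "$@" "$file2"
command "$@"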

dash/ash

Posted Oct 17, 2014 11:12 UTC (Fri) by mgedmin (subscriber, #34497) [Link] (1 responses)

Thank you. I've been reading bash(1) and tearing my hair out, and the only syntax I discovered for appending an item to an array was

paths[${#paths[*]}]="$1"

Rewriting this to use +=() will make my scripts a bit saner.

dash/ash

Posted Oct 17, 2014 12:02 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

Any reason path=( "${path[@]}" "$1" ) wouldn't have worked?

dash/ash

Posted Oct 3, 2014 19:30 UTC (Fri) by wahern (subscriber, #37304) [Link]

Positional parameters are scoped to the function, so the set -- construct only changes the function-local argument vector. In the following example, note that the invocation of foo doesn't reset the positional parameters 1 2 3 4 in the outer scope.

foo() {
    set -- A B C

    printf "$# -> $*\n"
}

printf "$# -> $*\n"    # 0 ->            (assuming the script itself got no arguments)
set -- 1 2 3 4
foo                    # 3 -> A B C
printf "$# -> $*\n"    # 4 -> 1 2 3 4    (unchanged by the set -- inside foo)

Also, per POSIX: "Positional parameters are initially assigned when the shell is invoked (see sh), temporarily replaced when a shell function is invoked (see Function Definition Command), and can be reassigned with the set special built-in command."

Obviously this isn't a complete substitute for arrays, neither alone nor in tandem with other alternatives.

But my point is that people use Bash features without understanding the alternatives, and without appreciating the cost of Bash. Bash is a great interactive shell, but if you're serious about shell programming you should learn POSIX syntax, as well as sed(1), awk(1), and the nuances of other utilities. This isn't hard, because the POSIX documentation is infinitely more clear and concise than implementation documentation. A good way to practice is with Solaris, because Solaris' utilities are often annoyingly anachronistic (e.g. grep doesn't even have a recursive mode), but dash isn't a bad place to start, either. And when you do choose to use extended functionality, do so knowingly, having assessed the costs and benefits.

dash/ash

Posted Oct 1, 2014 23:16 UTC (Wed) by SEJeff (guest, #51588) [Link]

I for one have read exactly those pages. But I'm also a weirdo who spent several nights years ago reading the entire bash man page. I also read the book on HP-UX's POSIX shell, but that's a different story.

dash/ash

Posted Oct 2, 2014 8:31 UTC (Thu) by ibukanov (subscriber, #3942) [Link] (11 responses)

> Why not just switch to Perl, for example?

Perl is not particularly safer; its syntax is similar in complexity to Bash's. Besides, Shellshock comes not from the syntax but from a badly designed Bash feature that could just as well be provided for "convenience" in Perl or Python or JavaScript under node.js.

dash/ash

Posted Oct 2, 2014 9:41 UTC (Thu) by niner (subscriber, #26151) [Link] (7 responses)

Perl has use strict;
Perl does not mix program code with data.
Perl has a taint mode that catches many cases of missing input sanitization.
Perl's system() function only invokes /bin/sh if the given command contains shell metacharacters, and it supports the system PROGRAM LIST form, which never invokes a shell at all (it uses exec()) and so avoids many errors from missing parameter quoting.

So I'd argue that it is much easier to program safely in Perl than in Bash.

dash/ash

Posted Oct 2, 2014 15:54 UTC (Thu) by ibukanov (subscriber, #3942) [Link] (6 responses)

> it is much easier to program safely in Perl than in Bash.

I want to repeat that Shellshock has nothing to do with the programming style of Bash scripts. It comes from a badly designed and implemented feature of the Bash interpreter itself, which is written in C. The Perl runtime could just as easily provide a similar "feature" affecting any Perl script, strict or not. For example, can you assert with certainty that the Perl interpreter has no bugs related to reading environment variables that could trigger execution of arbitrary Perl code?

dash/ash

Posted Oct 2, 2014 17:31 UTC (Thu) by dskoll (subscriber, #1630) [Link] (3 responses)

Perl runtime could just as easily provide a similar "feature" affecting any Perl script, strict or not.

But it doesn't. Well, with one exception: Setting PERL5DB will make perl execute arbitrary Perl code, but only if it has been invoked with the "-d" command-line flag which says to run under a debugger, and no Perl script uses that flag.

Perl makes environment variables available in the %ENV hash, but certainly doesn't try to interpret them as Perl code (modulo the single exception above.)

dash/ash

Posted Oct 5, 2014 20:15 UTC (Sun) by alankila (guest, #47141) [Link] (2 responses)

The environment variable PERL5LIB allows specifying library paths that Perl will look into first. If attackers can control files on the system, they can probably control the Perl interpreter by setting PERL5LIB to a suitable target path and then PERL5OPT to load their code.

Thankfully, none of this is even close to as bad as what bash did.

dash/ash

Posted Oct 7, 2014 14:35 UTC (Tue) by dskoll (subscriber, #1630) [Link] (1 responses)

PERL5LIB and PERL5OPT are ignored when Perl runs in taint mode. Bash really needs a taint mode.

dash/ash

Posted Oct 7, 2014 17:31 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

Bash needs an overlay:
 _____________________________________________ 
/ It looks like your script is over 100 lines \
\ long; did you mean to write this in Perl?   /
 --------------------------------------------- 
    \
     \
                                   .::!!!!!!!:.
  .!!!!!:.                        .:!!!!!!!!!!!!
  ~~~~!!!!!!.                 .:!!!!!!!!!UWWW$$$ 
      :$$NWX!!:           .:!!!!!!XUWW$$$$$$$$$P 
      $$$$$##WX!:      .<!!!!UW$$$$"  $$$$$$$$# 
      $$$$$  $$$UX   :!!UW$$$$$$$$$   4$$$$$* 
      ^$$$B  $$$$\     $$$$$$$$$$$$   d$$R" 
        "*$bd$$$$      '*$$$$$$$$$$$o+#" 
             """"          """"""" 

dash/ash

Posted Oct 2, 2014 21:25 UTC (Thu) by flussence (guest, #85566) [Link] (1 responses)

> Perl runtime could just as easily provide a similar "feature" affecting any Perl script, strict or not.

It's had that feature for decades: 2-arg open() will happily interpret any filename passed to it containing a "|" prefix or suffix to mean a command pipe, and helpfully give the rest of the string to the shell to run. The same function is also used internally to pass filenames in ARGV into the magic <> line-iterator.

dash/ash

Posted Oct 3, 2014 11:23 UTC (Fri) by dskoll (subscriber, #1630) [Link]

2-arg open() will happily interpret any filename passed to it containing a "|" prefix or suffix

That's a little different from the bash bug. It requires the programmer to write a script that doesn't handle user-input safely. It's also stopped in taint mode.

The Bash bug doesn't require any action on the part of the script writer; it happens before your script even has a chance to do anything.

Bash and Perl

Posted Oct 2, 2014 14:59 UTC (Thu) by gwolf (subscriber, #14632) [Link] (1 responses)

Perl's foremost use case is to be a programming language, Bash's is to be a programmable "things" launcher. And of course, while making it easier to be programmed, we have to remember it's still a command-line shell with lots of sugar on top. All of Perl's operations are done within the same process and memory space — Of course, convince perl to eval() or system() or `` your code, and it's game over. But you are very seldom fork()ing inside Perl — And when you do, it's clear to you an external thing is being called.

I do argue (see my comment above) for the utility of some bashisms, particularly those that help with controlling forked processes and with quoting (another favorite of mine is using $() instead of ``, as it's *way* clearer). I won't argue for using Bash once you need arrays and local scopes; that's a clear indication you should switch to a real programming language. One that, among many other things, separates parsing/compiling from executing, because your program will most likely be complex enough to warrant it!
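
A small illustration of the $() point (the path variable is hypothetical):

# $() nests naturally and keeps quoting readable
parent=$(basename "$(dirname "$some_file")")

# the backtick version needs escaped backticks to nest, and the quoting
# rules inside them get murky fast:
#   parent=`basename \`dirname "$some_file"\``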

But yes, people complain about Perl being write-only. Bash constructs and the Bash mindset are way worse than Perl's.

Bash and Perl

Posted Oct 2, 2014 15:38 UTC (Thu) by madscientist (subscriber, #16861) [Link]

Well, $() is POSIX. And "local" has been proposed for inclusion in POSIX sh and found to be generally available, in some form, in most sh implementations already.

As others have pointed out, if you want to write bash scripts that's fine and it's trivial to do: just start your script with #!/bin/bash. If you want to write POSIX shell scripts, start your script with #!/bin/sh. If you use #!/bin/sh and you use bash-isms, your script is simply wrong.

dash/ash

Posted Oct 9, 2014 11:30 UTC (Thu) by ssokolow (guest, #94568) [Link]

That sort of thing is why I encourage friends who are launching child processes to do their scripting in Python using the subprocess module.

They really did a great job on designing its API... especially when paired with various other modules already part of stdlib.

  1. It's guaranteed to execvp() the requested binary directly without shell indirection unless you explicitly use shell=True
  2. Any necessary argument parsing and expansion can be done without arbitrary code execution by using functions from the Python standard library such as shlex.split(), os.path.expanduser(), fnmatch.filter() and glob.glob().
  3. Quoted strings can still be handled safely by using shlex to explicitly perform argument splitting without code execution before using subprocess.
  4. The env argument makes it easy to call a subprocess with a sanitized environment.
  5. The cwd argument avoids the need for cd-ing inside os.system() or doing an os.getcwd()/os.chdir() dance.

Apparently someone's also ported it to Ruby, though unfortunately it's not part of the stdlib there and I don't know whether shlex is also available.

Plus, of course, convenience functions like subprocess.call(), subprocess.check_call(), and subprocess.check_output() integrate nicely with the mix of try/except/finally and os.walk() I already recommend for that sort of scripting.

dash/ash

Posted Oct 2, 2014 13:49 UTC (Thu) by mbunkus (subscriber, #87248) [Link] (3 responses)

> But as josh said, if you're writing software that really requires juggling lots of data, why keep the shell?

As much as I love and use Perl, it doesn't have several things that I find extremely useful for scripting: set -e, set -x and zsh's globbing functionality. I don't know of an equivalent for either of the »set«s, and -e especially is a very effective way of preventing accidental mishaps.
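
For readers who haven't used them, a minimal sketch of what the two options do (the file names are placeholders):

#!/bin/sh
set -e                   # abort the script as soon as a command fails
set -x                   # trace each command to stderr before running it

cp important.conf /etc/  # if this fails, the script stops here
echo "this line is only reached when the cp above succeeded"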

Then again: for me it's a question of when to switch from zsh to Perl, not from bash to Perl. zsh can do a lot of things that bash cannot, so I get further with shell scripts than I would with bash; meaning the gain of switching from bash to Perl is usually higher than that of switching from zsh to Perl.

dash/ash

Posted Oct 3, 2014 9:42 UTC (Fri) by cortana (subscriber, #24596) [Link] (2 responses)

Beware, you can't rely on set -e all the time. Try using it in a command list executed from an if statement some time...
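
Roughly the kind of surprise being referred to here (a sketch, assuming POSIX sh or bash): inside the condition of an if, and in all but the last command of a && / || list, set -e is suppressed, even for the body of a function called from there.

#!/bin/sh
set -e

check() {
    false                             # under plain set -e this would abort the script
    echo "still running after false"  # ...but here it is reached anyway
}

if check; then
    echo "check 'succeeded'"
fi
echo "the script reaches the end, too"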

dash/ash

Posted Oct 3, 2014 16:43 UTC (Fri) by mbunkus (subscriber, #87248) [Link] (1 responses)

Care to give me an example? My short tests don't show any problem, therefore I don't seem to understand what you mean.

dash/ash

Posted Oct 7, 2014 9:05 UTC (Tue) by drothlis (guest, #89727) [Link]

Either I write dash or I use Python

Posted Oct 1, 2014 22:25 UTC (Wed) by debacle (subscriber, #7114) [Link]

I like bash as an interactive login shell, but for programming I either choose the fast and lightweight dash or I use Python (or another "real" programming language). If the features of dash are not sufficient to write a good script, it is very likely that bash would not solve the problem in the long term anyway.

dash/ash

Posted Oct 2, 2014 6:05 UTC (Thu) by Karellen (subscriber, #67644) [Link]

If you want to specifically write a bash shell script to take advantage of its useful features, you still can. Even on Debian and derivatives. You just need to use the shebang line "#! /bin/bash" rather than "#! /bin/sh".

dash/ash

Posted Oct 2, 2014 6:19 UTC (Thu) by ptman (subscriber, #57271) [Link]

Shellcheck is a nice tool that helps with catching problems in shell scripts (both sh and bash)

dash/ash

Posted Oct 2, 2014 11:00 UTC (Thu) by eru (subscriber, #2753) [Link]

Adding a selection of the more scripting-friendly bashisms to dash might not bloat it. For example, the array feature looks simple: one-dimensional only, integer indices, and dash already supports evaluating integer expressions in the context of $((expr)) expansion. Of course, identifying the most useful extensions (without pulling in all bash features) is a problem.

dash/ash

Posted Oct 4, 2014 7:03 UTC (Sat) by tomgj (guest, #50537) [Link] (1 responses)

> The thing about moving to dash or ash for shell scripts is that some of the "bashisms" are truly useful features.

Okay but that doesn't mean that "sh" needs to be bash. The scripts that require bashisms can use bash explicitly by name, leaving sh to be a more minimal implementation of the POSIX shell.

dash/ash

Posted Oct 9, 2014 10:59 UTC (Thu) by Tet (guest, #5433) [Link]

Better still, write scripts in ksh instead. For interactive use, bash still gets my vote. But ksh does a better job of scripting.

