How programs get run

Posted Jan 29, 2015 6:16 UTC (Thu) by wahern (subscriber, #37304)
Parent article: How programs get run

> After checking those first two bytes, this code parses the rest of the script-invocation line, splitting it into an interpreter name (everything after #! up to the first white space) and possible arguments (everything else up to the end of the line, stripping external white space).

Linux passes the remainder of the line as a _single_ argument. You show this in your example where "-a -b -c" are all located in argv[1]. But you say

> ... a third extra argument is also inserted, holding all of the extra options:

Those aren't extra options; the plural is misleading. The distinction matters because neither getopt nor getopt_long will parse "-a -b -c" as three separate options. Rather, if -a takes an argument, it'll be parsed as optc='a' and optarg=" -b -c"; if it doesn't, it will parse as optc='a', optc=' ', optc='-', optc='b', etc. Most likely it'll just fail because your option specification won't match the parse. If a, b, and c are all single options without arguments, then you could put "-abc" on the shebang line. But you can't space them out, and you can't use an option that takes an argument unless the argument is the path of the script, as with the -f option for awk. And you can't mix non-argument with argument options unless the sole argument-taking option comes last, for example "-abcf".
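
To make the getopt point concrete, here is a minimal sketch using Python's getopt module, which is modeled on the C function but is not the same implementation, so take it as an analogy rather than a transcript of what C getopt prints. The helper name parse and the file name script.awk are made up for illustration.

```python
import getopt


def parse(argv_tail, optstring):
    """Return getopt's (option, value) pairs, or an error string if parsing fails."""
    try:
        opts, _rest = getopt.getopt(argv_tail, optstring)
        return opts
    except getopt.GetoptError as err:
        return "error: " + err.msg


# Linux hands the interpreter everything after its name as ONE argument.
single = ["-a -b -c"]

# No option takes an argument: parsing stops at the space, which is not
# a known option character.
print(parse(single, "abc"))

# -a takes an argument: it swallows the rest of the string, spaces included.
print(parse(single, "a:bc"))

# Bundled form: three separate options.
print(parse(["-abc"], "abc"))

# Bundled form ending in an argument-taking option: -f picks up the next
# argv element, which on a shebang line would be the script path.
print(parse(["-abcf", "script.awk"], "abcf:"))
```

The second call is the interesting one: the option value comes back as " -b -c", leading space and all, which is rarely what the script author intended.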

OS X, by contrast, will field-split the trailing shebang line in the kernel so that the script "#!./show_info -a -b -c" will print out

argv[0] = './show_info'
argv[1] = '-a'
argv[2] = '-b'
argv[3] = '-c'

Solaris is quirky. It will field-split, but only includes the first field. So "#!./show_info -a -b -c" will print out

argv[0] = './show_info'
argv[1] = '-a'

FWIW, OpenBSD 5.5, NetBSD 6.1, and FreeBSD 9.0 all behave like Linux. That was surprising, because I could have sworn that either FreeBSD or NetBSD (or both) would field-split the remainder of the shebang line.
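
The three behaviours described above can be summarised in a toy model. This is only a sketch of the observations in this comment, not kernel code: the function name shebang_argv and the style labels are invented here, and the script path that the kernel appends after these arguments is left out, as in the listings above.

```python
def shebang_argv(line, style):
    """Model the leading argv entries an interpreter receives for a shebang line.

    style is a hypothetical label:
      "linux"   - the whole remainder becomes a single argument
                  (also observed on OpenBSD 5.5, NetBSD 6.1, FreeBSD 9.0)
      "osx"     - the remainder is split on white space
      "solaris" - the remainder is split, but only the first field is kept
    """
    if not line.startswith("#!"):
        raise ValueError("not a shebang line")
    fields = line[2:].strip().split(None, 1)
    if not fields:
        raise ValueError("no interpreter named")
    interpreter = fields[0]
    remainder = fields[1].strip() if len(fields) > 1 else ""

    if not remainder:
        extra = []
    elif style == "linux":
        extra = [remainder]
    elif style == "osx":
        extra = remainder.split()
    elif style == "solaris":
        extra = remainder.split()[:1]
    else:
        raise ValueError("unknown style: " + style)
    return [interpreter] + extra


for style in ("linux", "osx", "solaris"):
    print(style, shebang_argv("#!./show_info -a -b -c", style))
```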


How programs get run

Posted Jan 29, 2015 9:50 UTC (Thu) by drysdale (guest, #95971)

Thanks for the clarification & comparisons with other OSes -- I should have made clear that the bundling together of arguments into argv[1] means that multiple interpreter arguments basically won't work.

How programs get run

Posted Jan 29, 2015 17:17 UTC (Thu) by vonbrand (subscriber, #4458)

Please do update the article with this information. It is definitely one to bookmark.

How programs get run

Posted Jan 29, 2015 21:13 UTC (Thu) by wahern (subscriber, #37304)

FWIW, Linux and OS X are the only systems I'm aware of that permit recursive shebang execution. Some systems, like Free/Net/OpenBSD, will recursively search for the binary interpreter, but they won't stack the paths of the intervening interpreters. Instead the binary interpreter is only passed the original file path. (And any trailing shebang arguments in the scripts seem to get dropped altogether.)

That's not germane to how Linux executes binaries. But I have a feeling this page might end up near the top of the Google results (as all good LWN articles do) for shebang-related queries, so it's worth putting out there.
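
As a rough illustration of the stacking versus non-stacking difference described above, here is a toy model. It only encodes what this comment reports; the function name resolve, the example paths, and the max_depth limit are all made up for the sketch, and trailing shebang arguments are ignored entirely.

```python
def resolve(path, interpreters, stack=True, max_depth=8):
    """Return the command line that ends up being executed for `path`.

    `interpreters` maps a script path to the interpreter named on its
    shebang line; any path not in the mapping is treated as a binary.
    stack=True  models the Linux / OS X behaviour (intervening paths kept).
    stack=False models the reported BSD behaviour (only the original path kept).
    max_depth is a hypothetical guard against shebang loops.
    """
    argv = [path]
    depth = 0
    while argv[0] in interpreters:
        if depth == max_depth:
            raise RuntimeError("too many levels of shebang indirection")
        interpreter = interpreters[argv[0]]
        if stack:
            argv = [interpreter] + argv
        else:
            argv = [interpreter, path]
        depth += 1
    return argv


# Hypothetical layout: script "a" names script "b", which names a binary.
chain = {"/tmp/a": "/tmp/b", "/tmp/b": "/usr/bin/interp"}
print(resolve("/tmp/a", chain, stack=True))
print(resolve("/tmp/a", chain, stack=False))
```

In the stacking case the intermediate script /tmp/b stays on the command line, so it still gets a chance to run; in the non-stacking case the binary only ever sees /tmp/a.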

Because shells parse scripts line-by-line, if you can come up with a construct that is both valid shell code and valid code in your other language, you can mix interpreters portably. For example, the following is a mixed shell/Lua script which will locate a Lua interpreter. Because both the locations _and_ interpreter names of Lua differ across systems, even across Linux distributions, and even for the same version of Lua, you can't use the #!/usr/bin/env trick to run your Lua scripts and expect it to work even remotely reliably.

#!/bin/sh
_=[[ # variable assignment in shell, beginning of long string in Lua
IFS=:
for D in ${PATH:-$(command -p getconf PATH)}; do
    for F in ${D}/lua*; do
        # check if it's our preferred version
        if ...; then
            exec "${F}" "$0" "$@"
        fi
    done
done
printf "%s: unable to locate Lua interpreter\n" "${0##*/}" >&2
exit 1
]]
-- begin pure Lua code
print(_VERSION)

I recently published a script, runlua, for portable execution of Lua scripts, which is why all of this stuff is still fresh in my mind.
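
For readers who want to see the search loop outside of shell, here is a minimal Python sketch of the same idea: walk the PATH directories and collect executables whose names start with a given prefix. The function name find_candidates is invented for this example, os.defpath stands in for the `command -p getconf PATH` fallback, and the version check that the shell script elides is left to the caller here as well.

```python
import glob
import os


def find_candidates(prefix="lua", search_path=None):
    """List executable files named prefix* in each PATH directory, in PATH order."""
    if search_path is None:
        # Fall back to Python's built-in default when PATH is unset or empty.
        search_path = os.environ.get("PATH") or os.defpath
    found = []
    for directory in search_path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        pattern = os.path.join(glob.escape(directory or "."), prefix + "*")
        for candidate in sorted(glob.glob(pattern)):
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                found.append(candidate)
    return found


if __name__ == "__main__":
    for candidate in find_candidates():
        print(candidate)
```

The list typically contains things like lua, lua5.1, lua5.3 or luajit depending on the system, which is exactly why a fixed name after #!/usr/bin/env is unreliable.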

How programs get run

Posted Jan 29, 2015 21:56 UTC (Thu) by peter-b (guest, #66996)

GNU Guile has a special "meta switch", a lone backslash on the shebang line, which tells the interpreter to read further arguments from the second line of the file. Everything from "#!" up to the closing "!#" is a block comment to Guile, so those lines are not treated as source code. It seems to work quite well:
#!/usr/local/bin/guile \
-e main -s
!#
(define (main args)
        ;; for-each, unlike map, guarantees left-to-right order for side effects
        (for-each (lambda (arg) (display arg) (display " "))
                  (cdr args))
        (newline))

How programs get run

Posted Feb 5, 2015 8:50 UTC (Thu) by grawity (subscriber, #80596)

Sven Mascheck's website has loads of information regarding OS differences in #! handling.

How programs get run

Posted Nov 25, 2019 9:34 UTC (Mon) by Profpatsch (guest, #130533)

This here is the bible of shebang interpretations: https://www.in-ulm.de/~mascheck/various/shebang/

I have to look through it surprisingly often.

How programs get run

Posted Nov 25, 2019 9:35 UTC (Mon) by Profpatsch (guest, #130533)

Ah, grawity beat me to it (by about 4 years).


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds