
Shell Scripts

Posted May 3, 2010 19:11 UTC (Mon) by nescafe (subscriber, #45063)
In reply to: Shell Scripts by HelloWorld
Parent article: Poettering: Rethinking PID 1

Eh, most of the problem with shell scripts is that it's very easy for people to write ones that suck.



Shell Scripts

Posted May 3, 2010 19:20 UTC (Mon) by HelloWorld (guest, #56129) [Link] (12 responses)

It's easy to write stuff that sucks in any language that lets you do something useful. The problem with shell scripts is that writing one that doesn't suck is *really* hard (of course, this varies with your definition of "suck"; I, for one, consider any shell script sucky that fails to handle file names that start with a - or contain newlines or other whitespace characters).
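For the record, those cases are manageable once you know the idioms. A minimal sketch (the directory and file names here are invented for illustration):

```shell
#!/usr/bin/env bash
# Sketch (names invented): iterate over files whose names start with
# "-" or contain spaces and newlines.
dir=$(mktemp -d)
touch -- "$dir/-dash" "$dir/has space" "$dir/has
newline"

# -print0 / read -d '' pass names NUL-delimited, so word splitting
# and embedded newlines can't corrupt them; "--" stops rm from
# parsing a leading "-" as an option (which matters for relative names).
find "$dir" -type f -print0 |
while IFS= read -r -d '' f; do
    rm -- "$f"
done

left=$(find "$dir" -type f | wc -l)
echo "deleted all, $left left"
rmdir "$dir"
```

The two key idioms are NUL-delimited names from find (newline-safe, no word splitting) and "--" to end option parsing.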

Shell Scripts

Posted May 3, 2010 20:20 UTC (Mon) by nescafe (subscriber, #45063) [Link] (11 responses)

Indeed, and a lot of that is that even experienced users and programmers have vague and misleading ideas of what constitutes proper shell scripting for their environment. I have a habit of going through and fixing them when they are horrible.

People do things like

for foo in `find /path -name someglob`; do
    if [ `cat $foo | grep ^bar &>/dev/null; echo $?` == 0 ]; then
        echo `cat $foo | grep ^bar | awk '{print $2}'`
    fi
done

over several thousand files and then complain about how slow shell code is.

Seriously.

Shell Scripts

Posted May 3, 2010 21:00 UTC (Mon) by HelloWorld (guest, #56129) [Link] (2 responses)

Have you actually seen that in someone's code?

Shell Scripts

Posted May 6, 2010 12:27 UTC (Thu) by nescafe (subscriber, #45063) [Link]

Not that exact snippet, but I see things like that all over the place.

Shell Scripts

Posted May 10, 2010 17:36 UTC (Mon) by jschrod (subscriber, #1646) [Link]

At some of my customers, I see worse shell code regularly. But luckily they write even less Perl code.

Shell Scripts

Posted May 5, 2010 13:54 UTC (Wed) by paulj (subscriber, #341) [Link] (7 responses)

It's a real shame people don't know how to use AWK properly. It's a fairly capable little language. One of the common abuses is piping grep to AWK, since AWK applies regexes itself to every line[1]. Basically, if we can assume the input tends not to be huge, or that most of the input will be acted on, then whenever you see:

grep XYZ | awk ... '{ ... }'

You'd be much better off with:

awk ... '/XYZ/ { .... }'

E.g. your shell example could be done with:

find /path -name someglob | xargs awk '/^bar/ { print $2 }'

or using GNU find's built-in xargs-ish feature (when was that added?):

find /path -name someglob -exec awk '/^bar/ { print $2 }' {} +

This is meant more for the peanut gallery than for you ;) - I was expecting there'd be a rush to offer more optimal one-liners; strangely, there hasn't been. ;)

1. Though, as Padraig Brady has shown me, beyond a certain size of file, there is a benefit to using grep to pre-filter input if you're discarding a sufficient amount of that input, as grep is much faster at processing each line than AWK.
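The equivalence of the two forms is easy to check on a throwaway file (contents invented here), and the awk-only version uses one process instead of two:

```shell
#!/usr/bin/env bash
# Sketch: the grep|awk pipeline vs. awk alone, on invented input.
f=$(mktemp)
printf 'bar one alpha\nbaz two beta\nbar three gamma\n' > "$f"

with_grep=$(grep '^bar' "$f" | awk '{ print $2 }')   # two processes
awk_only=$(awk '/^bar/ { print $2 }' "$f")           # one process
echo "$awk_only"
rm -f "$f"
```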

Shell Scripts

Posted May 5, 2010 14:05 UTC (Wed) by johill (subscriber, #25196) [Link] (1 responses)

sed is faster than grep even for plain grepping, at least it was last I checked.

sed 's/foo/\0/;t;d'
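A quick sanity check of that one-liner against grep, on invented input. (The \0 in the replacement is GNU sed's spelling of the whole match; & is the portable form, used below.)

```shell
#!/usr/bin/env bash
# Sketch: the sed command above as a grep substitute, on made-up input.
# How it works: if s/// matches, t branches past d and the line is
# auto-printed; if it doesn't match, d deletes the line.
input='foobar
bazqux
food'
g=$(printf '%s\n' "$input" | grep foo)
s=$(printf '%s\n' "$input" | sed 's/foo/&/;t;d')
echo "$s"
```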

Shell Scripts

Posted May 5, 2010 16:30 UTC (Wed) by martinfick (subscriber, #4455) [Link]

The beauty of running grep (or sed, if you are so inclined) separately on large data sets is the inherent parallelism made possible by Unix pipes. This is a feature often overlooked by modern programming techniques: the creators of Unix built an elegant, simple parallelism mechanism (much less error- and deadlock-prone than most others) long ago. With two cores, each of those piped commands can easily run in parallel.
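A small, bash-specific illustration of the point: each stage of a pipeline is its own process (visible via $BASHPID), which is what lets the kernel schedule the stages on different cores.

```shell
#!/usr/bin/env bash
# Sketch: both sides of a pipe run as separate processes.
# $BASHPID reports each subshell's own PID, so the two stages
# report two different PIDs.
pids=$( { echo "$BASHPID"; } | { read -r first; echo "$first $BASHPID"; } )
echo "stage PIDs: $pids"
```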

Shell Scripts

Posted May 5, 2010 14:46 UTC (Wed) by k8to (guest, #15413) [Link] (1 responses)

The -exec switch was added around the SysV timeframe; every find implementation has it.

Shell Scripts

Posted May 5, 2010 15:07 UTC (Wed) by paulj (subscriber, #341) [Link]

Look carefully: the exec has a + at the end instead of \;. I've since noticed the man page says it's a POSIX-specified feature, and was added in 4.2.12. FreeBSD seems to have had the feature since at least FreeBSD 5.0 (judging by when it appears in the man pages).

Shell Scripts

Posted May 5, 2010 16:28 UTC (Wed) by fredi@lwn (subscriber, #65912) [Link] (2 responses)

Indeed, awk is sometimes better than the combination:

find | grep | xargs cut ...

or similar. Though the -exec in your last example is, as far as I recall, slower than:

find /foo -name $GLOB -print0 | xargs -0 SOMECOMMAND

That's because with -exec you start another process for each found entry, while xargs passes all entries to the same process, as long as they fit within the maximum command-line length. Hope I got the idea across; sorry for my bad English.
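The difference in process counts is easy to measure. In this sketch (directory contents invented, and `sh -c 'echo "$$"'` standing in for SOMECOMMAND so each spawned shell reports its own PID), the number of unique PIDs is the number of processes started:

```shell
#!/usr/bin/env bash
# Sketch: how many processes does each find -exec variant start?
dir=$(mktemp -d)
touch "$dir"/a "$dir"/b "$dir"/c

# "\;" runs the command once per file: three files, three shells.
per_file=$(find "$dir" -type f -exec sh -c 'echo "$$"' sh {} \; | sort -u | wc -l)
# "+" batches all files into one invocation: one shell.
batched=$(find "$dir" -type f -exec sh -c 'echo "$$"' sh {} + | sort -u | wc -l)
echo "per-file: $per_file processes, batched: $batched"
rm -rf "$dir"
```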

Shell Scripts

Posted May 5, 2010 18:05 UTC (Wed) by paulj (subscriber, #341) [Link] (1 responses)

Yes, you're right about the standard 'find ... -exec ... {} \;'. That's why I said "xargs-ish" and used the apparently little-known 'find ... -exec ... {} +' form of the command. Note carefully the + there; I only discovered it today myself.

Shell Scripts

Posted May 6, 2010 15:48 UTC (Thu) by fredi@lwn (subscriber, #65912) [Link]

Didn't know this one, really useful! Thanks for the hint!


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds