GNU grep's new features (Linux.com)

Posted Jun 1, 2006 19:16 UTC (Thu) by elanthis (guest, #6227)
Parent article: GNU grep's new features (Linux.com)

These are definitely useful. Doing the same thing generally takes a rather more complex sed line, grep/sed combination, or grep/find combination.

On the other hand.. if it _was_ already possible before, do we need it all in one utility? GNU is rather notorious for feature-bloating its software.

My general opinion though is that this is a good thing, and I'm quite happy to have these features available in grep. Anything that saves me some typing and reduces the chances of making a mistake eases my life.
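To make the "more complex sed line" point concrete, here is a rough sketch comparing the two approaches for pulling one field out of a line. The sample log line and variable names are invented for illustration; only `sed -n`, the `p` flag and `grep -o` (`--only-matching`) are real.

```shell
# Made-up sample line; we want just the response time.
line='GET /index.html 200 35ms'

# sed: suppress normal output (-n), capture the field, print only when the
# substitution succeeds. The leading space in the pattern is needed because
# the greedy ".*" would otherwise swallow the first digit.
with_sed=$(printf '%s\n' "$line" | sed -n 's/.* \([0-9][0-9]*ms\).*/\1/p')

# grep -o: print only the part of the line that matched.
with_grep=$(printf '%s\n' "$line" | grep -o '[0-9][0-9]*ms')

printf '%s %s\n' "$with_sed" "$with_grep"
```

Both variables end up holding the same field; the grep form simply has fewer places to get the anchoring wrong.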

do we need it all in one utility? yes please!

Posted Jun 1, 2006 20:53 UTC (Thu) by coriordan (guest, #7544) [Link] (17 responses)

I remember hearing a BSD guy complaining about GNU. He didn't like the number of extra options that GNU ls has. As if it proved his argument, he pointed out that BSD ls was only 18kb while GNU ls was a massive 75kb. Who cares if one tiny binary is about four times the size of the other tiny binary? It's not "bloat", it's the stuff I want so that I can get my work done more easily.

It also reminds me of the Real men use 'ed' mail.

Speaking of nifty free software projects, I put my 10 favourite software tools online today.

do we need it all in one utility? yes please!

Posted Jun 2, 2006 0:42 UTC (Fri) by flewellyn (subscriber, #5047) [Link] (7 responses)

Emacs, of course, is everyone's favorite target for complaining about bloat. Remember "Eight
Megs And Constantly Swapping"? But the funny thing is, Emacs has remained relatively constant
in memory usage over the years, so that nowadays, that eight megs does not look so bad.
Compare it to Eclipse, or Mozilla, or even some individual component programs of GNOME or
KDE, and, well...gee. Emacs suddenly doesn't look so bloated.

eight megs and constantly swapping

Posted Jun 2, 2006 5:27 UTC (Fri) by xoddam (subscriber, #2322) [Link]

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND  
 5915 jmaddox   15   0 14192  11m 8752 S  0.0  1.1   0:56.04 emacs

Fourteen, eight ... I guess libc used to be smaller :-)

do we need it all in one utility? yes please!

Posted Jun 2, 2006 8:16 UTC (Fri) by davidw (guest, #947) [Link] (4 responses)

Yeah, I still can't quite figure out what Eclipse *does* with all that memory. It's an order of magnitude bigger than a fully-loaded Emacs session.

do we need it all in one utility? yes please!

Posted Jun 2, 2006 13:29 UTC (Fri) by micampe (guest, #4384) [Link] (3 responses)

In short, it builds a model of your code, to provide smart completion suggestions, real-time error checking, refactoring and more.

do we need it all in one utility? yes please!

Posted Jun 2, 2006 17:06 UTC (Fri) by carcassonne (guest, #31569) [Link] (2 responses)

"In short, it builds a model of your code, to provide smart completion suggestions, real-time error checking, refactoring and more."

I just got the cedet kit for emacs and although I've not had the time to configure it yet, it provides (the semantic part of it) 'intellisense' completion (that is, based on the code, not the regular completion emacs has based on what you type).

And it does not seem to add any more memory use to emacs.

For refactoring, there's xref for emacs, but that's a commercial product.

As for real-time error checking, I dunno. I guess I'd prefer the actual compiler to check this out. It's pretty good at that.

do we need it all in one utility? yes please!

Posted Jun 2, 2006 22:08 UTC (Fri) by micampe (guest, #4384) [Link] (1 responses)

"I just got the cedet kit for emacs and although I've not had the time to configure it yet, it provides (the semantic part of it) 'intellisense' completion (that is, based on the code, not the regular completion emacs has based on what you type)."

There's more than just intellisense. Intellisense is so 90s.

If that does it for you, fine by me, but I prefer using a tool with a user interface designed in this century, one that doesn't require configuration before it can be used or three years to master (and I still doubt that tool can be considered on par with Eclipse JDT).

"As for real-time error checking, I dunno. I guess I'd prefer the actual compiler to check this out. It's pretty good at that."

It is the actual compiler doing that.

Eclipse (like most other IDEs) is much more powerful than a text editor launching a compiler.

do we need it all in one utility? yes please!

Posted Jun 8, 2006 7:38 UTC (Thu) by lysse (guest, #3190) [Link]

"Intellisense is so 90s."

Likewise, I guess writing code that doesn't assume infinite memory and CPU cycles and actually takes care to conserve its resource usage is so 80s...

"I prefer using a tool with a user interface designed in this century"

Whereas a goodly number of developers prefer using a tool whose UI has been under constant development for 20 years and is customisable to a degree where it works with their muscle memory. Good engineering simply doesn't become obsolete, and I'd rather have an intuitive command set than a pretty picture any day.

do we need it all in one utility? yes please!

Posted Jun 2, 2006 8:47 UTC (Fri) by nix (subscriber, #2304) [Link]

XEmacs has grown, because of all the packages: there's 120MB of Lisp alone shipped with it.

But of course not all of that is *loaded* at once :)

do we need it all in one utility? yes please!

Posted Jun 2, 2006 8:56 UTC (Fri) by sitaram (guest, #5959) [Link] (2 responses)

I went to that link (10 fav software tools), and I noticed that "gpg -c" is a favourite!

I prefer "openssl enc" to "gpg -c" -- it's almost an order of magnitude faster sometimes! Here're some speed ratios from my machine:

Enc Dec Enc+Dec
Blowfish 9.39 6.95 8.2
AES256 6.09 4.05 5.08
CAST5 4.87 2.67 3.7

You may want to evaluate it yourself and consider this if you are doing this often or on large files.

do we need it all in one utility? yes please!

Posted Jun 2, 2006 8:57 UTC (Fri) by sitaram (guest, #5959) [Link]

Rats! Forgot the formatting...

               Enc     Dec     Enc+Dec
Blowfish       9.39    6.95    8.2
AES256         6.09    4.05    5.08
CAST5          4.87    2.67    3.7

why "gpg -c"

Posted Jun 2, 2006 12:40 UTC (Fri) by coriordan (guest, #7544) [Link]

I like the 'gpg -c' functionality because I think most people don't realise that GnuPG can do that. I reckon most people think that you can only use it for public/private key encryption, and that's too complicated for some (at least as a first step).

You know what needs to be done? 'tar' needs to be given support for 'gpg -c', just like it supports gzip and bzip2 compression.

Of course password stuff isn't as secure as public key encryption, but it's great when you don't want your local proxy to keep an unencrypted copy of something, and it's good to encrypt something and tell the password to the recipient when you meet them in person.

I haven't heard of people doing this with 'openssl enc' - I'll look into it.
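Until tar has such an option, a pipe gets most of the way there. The following is only a sketch: the function names and file names are made up, and nothing runs until you call the functions (gpg will then ask for the passphrase).

```shell
# Hypothetical helpers; names are invented for this sketch.

# Pack a directory with tar, gzip it, and encrypt the stream symmetrically.
encrypt_dir() {
    # $1 = directory to archive, $2 = output file, e.g. backup.tar.gz.gpg
    tar -czf - "$1" | gpg -o "$2" -c
}

# Decrypt the archive to stdout and unpack it into the current directory.
decrypt_archive() {
    # $1 = encrypted archive produced by encrypt_dir
    gpg -d "$1" | tar -xzf -
}
```

Usage would be something like `encrypt_dir notes notes.tar.gz.gpg` on one side and `decrypt_archive notes.tar.gz.gpg` on the other.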

trust, GCC, and Ken Thompson's compiler trojan thesis

Posted Jun 2, 2006 14:27 UTC (Fri) by jabby (guest, #2648) [Link] (3 responses)

I read your 10 favorite tools in which you refer to the Ken Thompson article on the C compiler/login trojan in the context of GCC. You seem to be missing his point, though...

Ken makes this very clear: "No amount of source-level verification or scrutiny will protect you from using untrusted code." GCC is a Free compiler for C, written in C and is thus just as vulnerable to this hack as any other self-referential code.

Anyone could download GCC, follow the steps that Ken outlined and eventually install a version on their system that contains the trojan but with no trace in the source code. If that person were an insider at the place that compiles the binaries for your GNU/Linux distribution of choice, it wouldn't matter that you had access to the source code. Once you accept the binary from that trusted source, you are vulnerable. If you were to recompile the compiler from pristine source code with the trojaned gcc binary, you would still get a trojaned gcc!

Admittedly, having an entirely free system helps tremendously in raising the bar of trust, but depending on a wide and far-flung community also means casting a wide net of trust. I trust the Free Software community, but the four freedoms do not prevent this particular hack. It all comes down to trust.

trust, GCC, and Ken Thompson's compiler trojan thesis

Posted Jun 2, 2006 16:11 UTC (Fri) by nix (subscriber, #2304) [Link]

The bar is raised yet more if you initially cross-compile your bootstrap GCC using a completely different compiler, preferably on a different architecture.

It's still not infinitely high, but it's higher.

ok, the longer version then

Posted Jun 2, 2006 20:17 UTC (Fri) by coriordan (guest, #7544) [Link] (1 responses)

I agree with Ken that no one can verify all the code, but access to the source is better than no access to the source, and knowing that everyone has access to the source, and can analyse it in any way they want, and that if one person finds a trojan, they can remove it and publish the patch, is probably as good as it gets.

It's not perfect, and some trust is still required, but that is a fact of life and cannot be avoided. All we can do is aim for "as good as it gets" - and that involves the four freedoms.

When I was writing that paragraph in my blog, I wondered if I should go into the explanation, but I decided against it because it was supposed to be a paragraph about GCC.

ok, the longer version then

Posted Jun 2, 2006 20:32 UTC (Fri) by jabby (guest, #2648) [Link]

I agree. Access to source is a huge advantage. And keeping source code in a version control system goes a long way toward monitoring changes and preventing even the fully baked Ken Thompson exploit.

And your paragraph in the context of GCC is not incorrect. It's absolutely true that Free Software helps to prevent source-borne trojans. Only in the context of the whole ACM article does this argument fall short and, as you say, that was not your aim in your short "top 10" list.

do we need it all in one utility? yes please!

Posted Jun 4, 2006 2:55 UTC (Sun) by vonbrand (subscriber, #4458) [Link] (1 responses)

What happened to the Unix philosophy of small tools that do one thing, and do it well, that can be combined endlessly?

Sure, it is nice to have all this in one package, but on the other hand it is infuriating to see that the "same functionality" (regular expression patterns, processing directories recursively, the list is seemingly endless) is implemented differently in several tools, and what would be handy to combine with other tools sometimes can't be done as it is bound to a specific program.

purity vs. functionality

Posted Jun 5, 2006 12:49 UTC (Mon) by coriordan (guest, #7544) [Link]

Sometimes there is a conflict between the goals of design purity, and giving the user what they expect. I don't think either goal is perfect, so decisions or compromises have to be made.

It's worth noting that having multiple implementations is not as big a problem as "design purity" people sometimes claim it is. Factoring the regex code out into a library and standardising on that (as GNU usually does) greatly reduces problems such as inconsistency.

GNU grep's new features (Linux.com)

Posted Jun 1, 2006 20:53 UTC (Thu) by nix (subscriber, #2304) [Link] (6 responses)

--only-matching's code is 24 lines long. Bloat? Not hardly.

As for grep -P, it's implemented using PCRE, so no bloat at all.

(Of course there will never be a pgrep or `prep' (ugh) as suggested in the article: egrep and fgrep are *already* obsolete, with grep -E and grep -F their preferred forms. Introducing more obsoleteness on top seems... peculiar.)

GNU grep's new features (Linux.com)

Posted Jun 2, 2006 12:00 UTC (Fri) by jond (subscriber, #37669) [Link] (5 responses)

egrep and fgrep are just shell scripts that call grep -E and grep -F on my system (sarge). I found this out the hard way, when I was using fgrep in a busy loop.
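For anyone who hasn't looked inside them, the wrappers amount to little more than the following. This is a sketch using shell functions with made-up names rather than the actual scripts (which would end in something like `exec grep -E "$@"`); the sample text is invented too.

```shell
# Stand-ins for the wrapper scripts, written as functions for this sketch.
my_egrep() { grep -E "$@"; }
my_fgrep() { grep -F "$@"; }

text='a.b
axb'

# -F treats the pattern as a fixed string, so "." only matches a literal dot.
fixed=$(printf '%s\n' "$text" | my_fgrep -c 'a.b')

# Without -F the dot is a regex wildcard, so both lines match.
regex=$(printf '%s\n' "$text" | grep -c 'a.b')

# -E allows alternation without backslashes.
either=$(printf '%s\n' "$text" | my_egrep -c 'a\.b|axb')

printf '%s %s %s\n' "$fixed" "$regex" "$either"
```

In a busy loop, calling `grep -F` directly avoids starting an extra shell for the wrapper script on every iteration.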

GNU grep's new features (Linux.com)

Posted Jun 2, 2006 16:14 UTC (Fri) by nix (subscriber, #2304) [Link] (4 responses)

This is true in the upstream distro, and has been true since at least 2002. (Before that they were three separate binaries with different behaviours; having a single program that acts differently depending on its name is a violation of the GNU Coding Standards, so it wasn't implemented like that except for a brief period in 2002.)

GNU grep's new features (Linux.com)

Posted Jun 4, 2006 2:21 UTC (Sun) by vonbrand (subscriber, #4458) [Link] (3 responses)

...having a single program that acts differently depending on its name is a violation of the GNU Coding Standards, ...

Yet another reason to consider the coding standard to be braindamaged.

GNU grep's new features (Linux.com)

Posted Jun 5, 2006 9:56 UTC (Mon) by nix (subscriber, #2304) [Link] (2 responses)

The reasoning is entirely sensible: it is counterintuitive to have a program act differently simply because you mv'ed it to a different name. Among other things, having grep behave differently simply because you called it egrep *bans* you from making egrep a wrapper script, unless there is some other way to get at egrep's behaviour.

POSIX agreed: hence grep -E and grep -F, and the deprecation of look-at-argv[0] programs.

GNU grep's new features (Linux.com)

Posted Jun 8, 2006 7:41 UTC (Thu) by lysse (guest, #3190) [Link] (1 responses)

Where does that leave BusyBox?

GNU grep's new features (Linux.com)

Posted Jun 8, 2006 12:34 UTC (Thu) by nix (subscriber, #2304) [Link]

busybox isn't specified by POSIX nor is it a GNU project (nor does it even slightly conform to the GNU Coding Standards).

busybox is also a bit of a special case, in that 'size is everything', so the heaps-of-symlinks approach actually makes sense.

(But you don't necessarily need any of the symlinks. If you're running everything from a busybox shell, you can tell it to find busybox commands first regardless of the absence of symlinks.)

The BSDs have a tool called crunchgen which smashes a bunch of programs into one which conditionalizes off argv[0] in much the same way. (Except that busybox is smaller and doesn't penalize the rest of the system by forcing the *default* tools to be little featureless ones.)
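As a rough illustration of what "conditionalizes off argv[0]" means, here is a sketch of name-based dispatch. The function name and the applet table are invented for the example and are not taken from busybox or crunchgen; a real multi-call script would pass "$0" instead of a literal path.

```shell
# Map the name a program was invoked as to the behaviour it should have.
applet_for() {
    case "$(basename "$1")" in
        egrep) echo "grep -E" ;;
        fgrep) echo "grep -F" ;;
        grep)  echo "grep" ;;
        *)     echo "unknown applet"; return 1 ;;
    esac
}

applet_for /bin/egrep
applet_for /usr/bin/fgrep
```

The catch discussed earlier in the thread is visible here: rename or wrap the file and the behaviour changes, which is why `grep -E` and `grep -F` exist as explicit options.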