LCA: Disintermediating distributions

Posted Feb 6, 2008 23:26 UTC (Wed) by vmole (subscriber, #111)
In reply to: LCA: Disintermediating distributions by stevenj
Parent article: LCA: Disintermediating distributions

What makes a tool like autoconf difficult is that the underlying portability problem it solves is quite hard and intrinsically complicated.

Actually, the problems that autoconf "solves" are the *easy* parts of portability. Dealing with the semantic differences between OSes is the hard part. The worst part is that autoconf encourages a really bad programming style of ifdefs and the use of the 6 slightly OS-specific versions of foofunc() rather than just following the dang standard. Henry Spencer told us why that was a bad idea [warning: PDF] 15 years ago, but the GNU people failed to heed him.



LCA: Disintermediating distributions

Posted Feb 6, 2008 23:42 UTC (Wed) by vapier (subscriber, #15768) [Link]

the void you describe is largely addressed by things like the gnulib project.  it takes care
of all the OS-specific issues by checking to see if the function in question is usable on the
host system.  if it isn't, the gnulib version is provided.  thus the code that *you* write is
able to freely assume that the function in question is available.

as for encouraging ifdef's, there are simply some things you cannot solve otherwise.  how do
you propose people integrate optional support for addon libraries ?  perhaps you can support a
wide range of graphic formats, but only if the external library is available ... yes, you can
write autotool code to largely avoid this (make most pieces standalone files that are
optionally compiled), but it doesn't make the issue magically go away due to the nature of the
code (assuming C/C++ here).
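
A minimal sketch of the "optional pieces" idea above, assuming a hypothetical image tool. HAVE_LIBPNG stands for whatever macro the configure script would define when the external library is found, and the names image_format, probe_pnm, probe_png and detect_format are invented for illustration. The PNG probe only checks the file signature and does not call the real library.

```c
#include <stddef.h>
#include <string.h>

/* One entry per supported format (hypothetical structure). */
struct image_format {
    const char *name;
    int (*probe)(const unsigned char *buf, size_t len);
};

/* Always built: needs nothing beyond standard C. */
static int probe_pnm(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[0] == 'P' && buf[1] >= '1' && buf[1] <= '6';
}

#ifdef HAVE_LIBPNG
/* In a real project this would sit in its own source file that the
 * build compiles only when configure found the library. */
static int probe_png(const unsigned char *buf, size_t len)
{
    return len >= 8 && memcmp(buf, "\x89PNG\r\n\x1a\n", 8) == 0;
}
#endif

/* The only place outside the optional code that mentions the macro. */
static const struct image_format formats[] = {
    { "pnm", probe_pnm },
#ifdef HAVE_LIBPNG
    { "png", probe_png },
#endif
};

/* Returns the name of the first matching format, or NULL. */
static const char *detect_format(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < sizeof formats / sizeof formats[0]; i++) {
        if (formats[i].probe(buf, len))
            return formats[i].name;
    }
    return NULL;
}
```

The ifdef does not disappear, but it is confined to the table and the optional code, so callers of detect_format() never see it.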

LCA: Disintermediating distributions

Posted Feb 7, 2008 0:01 UTC (Thu) by vmole (subscriber, #111) [Link]

No, you can't get rid of them entirely, and Mr. Spencer doesn't claim you can. They were added to the language for a reason, after all. But all too much autoconf code has stuff like this:

#ifdef HAVE_INDEX
p = index(s, c);
#elif HAVE_STRCHR
p = strchr(s, c);
#endif

Yeah, that's a really old example, but while the names of the functions change, the style doesn't.

And while I'm at it, why do I still have to sit through messages like

Checking for C89...okay
Checking for stdio.h...found
Checking for strcpy()...found
...

Hey, if it's C89, then ALL THOSE FUNCTIONS AND HEADER FILES ARE THERE. If not, the implementation is broken WAY beyond what autoconf can solve.

And, not to go off on a rant (too late!), why do I have to sit through 3 minutes of autoconf masturbation just to go to a 3 second compile of 300 lines of standard C?

Yes, I know that isn't really autoconf's "fault". It can be used in reasonable ways. But the autoconf culture encourages such bad choices, because 95% of the users don't understand it and just copy other peoples' bad choices.

Feh.

LCA: Disintermediating distributions

Posted Feb 7, 2008 0:28 UTC (Thu) by stevenj (subscriber, #421) [Link]

In my experience with developing many actual programs using autoconf, the checks in the configure script are a direct consequence of porting to some platform or another. ("Crap, MinGW doesn't support gettimeofday, we'll need to check for it and whatever alternative is available.")
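
As a rough sketch of what the C side of such a check can look like: HAVE_GETTIMEOFDAY is assumed to be the macro that the configure check leaves in config.h, wall_seconds is a made-up helper name, and the fallback to time() is only one possible alternative (it has one-second resolution).

```c
#include <time.h>
#ifdef HAVE_GETTIMEOFDAY
#include <sys/time.h>
#endif

/* Wall-clock time in seconds since the epoch. */
static double wall_seconds(void)
{
#ifdef HAVE_GETTIMEOFDAY
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
#else
    /* Standard C fallback for platforms without gettimeofday(). */
    return (double)time(NULL);
#endif
}
```

The rest of the program calls wall_seconds() and never needs to know which branch was compiled.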

Most new code these days assumes C89 support at least, and doesn't check any more for functions like "strcpy". On the other hand, as recently as a few years ago the default Solaris compiler didn't support "const" by default unless you jumped through hoops, and hence a lot of programs had checks for things like that.

It's true that, at some point, older programs might be able to remove checks for things that only break on ancient platforms, but it can be really hard to decide exactly when to remove such checks. It's safer to leave them in and wait the extra seconds than to break the build on an extant architecture.

Regarding the time for configure to run, it's true that the configure script is often slower these days than compiling the program. For large programs, parallel make speeds up things a lot, while parallelizing configure tests is extremely tricky (although it has been discussed a lot by the autoconf developers). For small projects, the configure script may call the compiler more times than the Makefile itself, but long experience has shown that actually trying to compile something in the configure script is by far the most reliable way of performing a feature test. In any case, the number of times that one has to run "configure" is small (most people don't run it at all anymore, since prepackaged binaries from distros are ubiquitous), and it's better to sacrifice a little build time than to sacrifice robustness.

LCA: Disintermediating distributions

Posted Feb 7, 2008 0:52 UTC (Thu) by vmole (subscriber, #111) [Link]

In my experience with developing many actual programs using autoconf, the checks in the configure script are a direct consequence of porting to some platform or another.

My experience says otherwise. Things like checking C++ related stuff (and even Fortran!) in projects that are pure C. And successful checks for C89 followed by checks for individual C89 standardized functions are widespread, almost certainly because there's some autoconf macro that does all of it. It's just stupid and annoying.

LCA: Disintermediating distributions

Posted Feb 7, 2008 1:40 UTC (Thu) by stevenj (subscriber, #421) [Link]

The checks for C++ and Fortran in C-only programs were due to a libtool 1.5 bug that has since been fixed, IIRC.

And it's true that AC_PROG_CC (the check for a C compiler) automatically calls a check to make sure the compiler is in C89 (ISO C90) mode, and if not it tries to find an option to put the compiler in C89 mode. Reliably checking whether the compiler is in C89 mode involves, among other things, checks for stdio.h. These checks were still required fairly recently — e.g. on AIX circa 2003 you had to use "-qlanglvl=extc89" or it didn't handle macro parameters in completely ANSI fashion, and on HPUX the compiler was non-ANSI by default until at least the late 90s. And many of these systems were still running long, long after their release dates (e.g. I heard from Solaris users with a compiler that defaulted non-ANSI as recently as a few years ago).

When autoconf has a default check, there's usually a good reason for it; most developers don't have experience on a wide enough variety of platforms to appreciate this. Try to get a patch accepted into autoconf sometime and you'll see what they have to deal with and why so many complaints about autoconf are founded in ignorance.

(And if your configure time is dominated by the default check for an ANSI compiler, your project is pretty small indeed. As I said, in my experience most configure script times for projects of decent size are dominated by checks that the programmers explicitly included in response to portability problems, or by checks that are invoked by those checks like the case above, which the autoconf developers put in there for good reason. And, really, it's not like this is a huge problem: how often do you run configure scripts, and how many seconds would you be willing to shave off at the risk of losing portability to some machine?)

LCA: Disintermediating distributions

Posted Feb 7, 2008 7:51 UTC (Thu) by aleXXX (subscriber, #2742) [Link]

> In my experience with developing many actual programs using autoconf,
> the checks in the configure script are a direct consequence of porting
> to some platform or another. ("Crap, MinGW doesn't support
> gettimeofday, we'll need to check for it and whatever alternative is
> available.")

Yes, basically the configure checks are necessary.
Autoconf itself wasn't so bad. But then additionally you have to 
understand how it works together with automake and libtool. This is what 
made it really hard for me.

And I'm sure this is also the reason why many programs run much more 
configure checks than they actually need. They take the autotools stuff 
from some existing project and just change it so that it can build their 
program. With this approach you end up with all the configure checks the 
original script had, and maybe later on you have to add some more. IMO it 
is simply too hard to build a simple program using autotools. Who wants 
to learn that just to build hello world ?

Using other build systems, such as cmake, makes this trivial:

add_executable(helloworld hello.c)

That's all, cmake will do a few checks to find the compiler, check that 
it actually works and figure out if it's a 32 or 64 bit compiler, and 
that's all.
I guess for Scons it's similar.

Also autotools don't push you to use some modular style, at least 
in KDE we ended up with a few huge scripts which did a lot of magic. E.g. 
CMake strongly encourages a modular system (while it is of course still 
possible to throw everything into one file, but then you intentionally 
work around its features).

Alex

LCA: Disintermediating distributions

Posted Feb 7, 2008 20:04 UTC (Thu) by vapier (subscriber, #15768) [Link]

the issues you raise i don't really see being "solved" or even really addressed any differently
with other build systems (cmake/scons/whatever).  a build system cannot really compensate for
a coder's inability to write clean code.  what do you see in other build systems that
encourages different style of coding ?

the example you cite could easily be relegated to a header file without making the real code
messy.  while it is an older example, there are many similar situations that could generally
be solved the same way: take care of OS-differences in one location (sep source or header
file) and keep everything else clean.
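
One way the "single location" approach could look for the index/strchr example, as a sketch only: compat_strchr and count_fields are hypothetical names, and the HAVE_INDEX / HAVE_STRCHR macros are assumed to come from the configure script as in the earlier comment.

```c
#include <stddef.h>
#include <string.h>

/* ---- would live in a hypothetical compat.h ---- */
#if defined(HAVE_INDEX) && !defined(HAVE_STRCHR)
#include <strings.h>              /* legacy BSD home of index() */
#define compat_strchr(s, c) index((s), (c))
#else
#define compat_strchr(s, c) strchr((s), (c))
#endif

/* ---- ordinary application code, free of ifdefs ---- */

/* Counts separator-delimited fields; sep must not be '\0'. */
static size_t count_fields(const char *line, int sep)
{
    size_t n = 1;
    const char *p = line;
    while ((p = compat_strchr(p, sep)) != NULL) {
        n++;
        p++;                      /* step past the separator just found */
    }
    return n;
}
```

With this layout, count_fields("a,b,c", ',') yields 3 whichever branch of the header was selected, and the platform difference is mentioned in exactly one place.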

LCA: Disintermediating distributions

Posted Feb 7, 2008 21:22 UTC (Thu) by vmole (subscriber, #111) [Link]

the example you cite could easily be relegated to a header file without making the real code messy.

Bingo. But they're not. Why not? Because people just copy the way previous programmers did things. Maybe the autotools problem is that there are just way too many bad examples out there.

As for other build systems, all I can do is quote the comment immediately above yours, from Alex, apparently a KDE developer: "CMake strongly encourages a modular system (while it is of course still possible to throw everything into one file, but then you intentionally work around its features)." Autoconf makes bad usage patterns as easy (or even easier, on first glance) as good ones.

LCA: Disintermediating distributions

Posted Feb 6, 2008 23:55 UTC (Wed) by stevenj (subscriber, #421) [Link]

There are lots of cases where just following the dang standard is not practical, or not sufficient. For one thing, not all platforms implement the dang standard, and if you don't want to fail completely when this happens you need some workaround. For another thing, in some applications it's extremely useful to support functionality that may not be available on all platforms—for example, SSE instructions or high-resolution timers.

Also, a lot of what autoconf deals with is checking for things in the build environment which are essential but not standardized, such as how to link shared libraries. POSIX threads and OpenMP are two examples of formally standardized libraries that you can depend on, but each compiler and OS has its own command to link with them (see here and here). Or suppose you want to use features from the 1999 ANSI C standard, which has been out for 9 years now but compilers (including gcc) still make you jump through hoops to enable support for it, and of course each compiler has its own hoop (which autoconf will detect).
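
For the C99 case, the code itself can at least tell which dialect the compiler ended up in. The sketch below uses only the standard predefined macros __STDC__ and __STDC_VERSION__; the function name c_dialect is made up, and this says nothing about which compiler flag is needed, which is the part the configure script has to discover.

```c
/* Reports the language level the compiler claims for this translation unit. */
static const char *c_dialect(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return "C99 or later";
#elif defined(__STDC__)
    return "C89/C90";
#else
    return "pre-ANSI";
#endif
}
```

Building the same file with and without the compiler's C99 option and comparing the result is a quick way to see whether the option took effect.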

Also, free-software projects often build upon other projects so as to avoid re-inventing the wheel, and there are lots of extremely useful libraries (from GNU readline to HDF5 to LAPACK to Expat to Boost to...I'm just picking things at random) that are not standardized by any standards body. Part of autoconf's job is to help you detect whether such a library is present and contains the function you want (it may not, e.g. if it is the wrong version).

And heaven help you if you want to link together multiple languages, e.g. you have a C++ program and you want to link Fortran numerical libraries (e.g. LAPACK), without autoconf to help you detect how to do it with your compiler (each one has a different incantation).
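
To illustrate one piece of that incantation, symbol naming: autoconf's AC_F77_WRAPPERS defines an F77_FUNC(name, NAME) macro that expands to whatever mangled name the Fortran compiler uses. The sketch below is an assumption-laden stand-in: the fallback definition guesses the common "lowercase plus trailing underscore" convention, dotprod is a hypothetical routine, and its body is written in C here only so the example links without a Fortran compiler.

```c
/* Normally provided by config.h via AC_F77_WRAPPERS; this fallback is a guess. */
#ifndef F77_FUNC
#define F77_FUNC(name, NAME) name ## _
#endif

#define DOTPROD F77_FUNC(dotprod, DOTPROD)

/* Stand-in for a Fortran routine. A real project would only declare it
 * and link the Fortran object. Fortran passes arguments by reference,
 * hence the pointer to n. */
double DOTPROD(const int *n, const double *x, const double *y)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < *n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* C-friendly wrapper: callers never see the mangled name. */
static double dot3(const double a[3], const double b[3])
{
    int n = 3;
    return DOTPROD(&n, a, b);
}
```

The wrapper is the only code that has to know about the calling convention. Finding the right Fortran runtime libraries to link is a separate problem that this sketch does not touch.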

Also...well, just look at the autoconf documentation for the variety of kinds of things one has to check for. As I said, there's a reason for its popularity, which extends far beyond "the GNU people"...it fills a real need. People who don't understand what it does are doomed to reinvent it badly.

LCA: Disintermediating distributions

Posted Feb 7, 2008 0:12 UTC (Thu) by vmole (subscriber, #111) [Link]

I agree that those are all problems that autoconf/libtool/etc. claim to solve. My experience (which is extensive) is that they don't reliably work, and I spend a *LOT* more time figuring out the problem and fixing it than I did with packages that simply ask me to set a few variables in the beginning of the Makefile.

I'd guess that if all you ever work with is Linux, BSD, and possibly Solaris, these tools do work, mostly. OTOH, those are the really easy ones.

LCA: Disintermediating distributions

Posted Feb 7, 2008 0:39 UTC (Thu) by stevenj (subscriber, #421) [Link]

The autoconf developers go to great lengths to support systems beyond Linux and BSD, and it's simply untrue that the tools break on other systems. I personally work on software that has run for years on everything from HPUX to Tru64 to UNICOS to AIX to MinGW using the autotools.

It's true that many people don't know what to do when configure fails. The usual mistake is to start poring over the configure script (which is essentially object code) rather than RTFM. configure --help gives a clue: most problems can be solved by setting an environment variable on the command line with configure LDFLAGS=... or whatever. The most common problems in my experience are due to libraries installed in nonstandard locations, and in this case there's simply no way around requiring the user to tell you where things are (which in autoconf is done by setting LDFLAGS and CPPFLAGS).

It's also true that some programmers misuse autoconf. e.g. even though the autoconf manual strenuously recommends doing feature tests by actually compiling and linking things, autoconf also provides a macro to get a canonical target name (e.g. i386-linux-gnu) and some programmers take the shortcut of explicitly testing this when they shouldn't. The difficulty is that if you are testing for a feature that does not have a built-in autoconf test, writing a portable feature test is hard, especially if you don't have many platforms to test on—but again, I think this is somewhat intrinsic to the problem and is not really autoconf's fault.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds