LCA: Lessons from 30 years of Sendmail

Posted Feb 5, 2011 5:43 UTC (Sat) by daglwn (subscriber, #65432)
In reply to: LCA: Lessons from 30 years of Sendmail by cmccabe
Parent article: LCA: Lessons from 30 years of Sendmail

> It's not a ridiculous question to ask "will my compile times be longer if
> I port project X to C++." The answer is almost certainly yes.

That's simply not credible. If it were a straight port from C, there is nothing a C++ compiler would do differently from a C compiler that would greatly increase compile time.

> Every large C++ project I've ever worked on has had long compilation
> times.

That's a valid observation but it doesn't indicate any general statements can be made.

> It's a consequence of the design of the language.

That does not follow.

> Every file in a C++ project must do the same work over and over for
> #include directives. A single #define could change the meaning of
> everything.

No, that would be a violation of the ODR. Structures and definitions cannot change after they've been used.

> There is no room for C++ in this world because C++ itself is a layering
> violation.

Your entire argument makes no sense. There is nothing in C or C++ that prevents or encourages any particular design. One can write well defined modules and interfaces in both languages. One can write poorly structured code in both languages.

But C++ provides safety mechanisms that are simply not available in C. RAII is an example.
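
A minimal sketch of the RAII point, assuming a hypothetical `FileHandle` wrapper (the class and `can_open` are illustrative names, not taken from any project discussed here). The destructor releases the resource on every exit path, including exceptions, which is the guarantee C lacks a built-in mechanism for.

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical RAII wrapper around a C FILE*; the name is illustrative only.
class FileHandle {
public:
    FileHandle(const char* path, const char* mode)
        : fp_(std::fopen(path, mode)) {
        if (!fp_) throw std::runtime_error("fopen failed");
    }
    // Runs on every exit path from the owning scope, including exceptions.
    ~FileHandle() { std::fclose(fp_); }

    // Copying would close the same FILE* twice, so forbid it.
    FileHandle(const FileHandle&) = delete;
    FileHandle& operator=(const FileHandle&) = delete;

    std::FILE* get() const { return fp_; }

private:
    std::FILE* fp_;
};

// Returns true if the file could be opened; fclose is never called by hand.
bool can_open(const char* path) {
    try {
        FileHandle f(path, "r");
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Calling `can_open` on a missing path returns false without leaking anything, because the constructor throws before a handle is ever owned.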



LCA: Lessons from 30 years of Sendmail

Posted Feb 5, 2011 21:27 UTC (Sat) by cmccabe (guest, #60281)

> No, that would be a violation of the ODR. Structures and definitions
> cannot change after they've been used.

It's only a violation of the ODR if the objects that are defined differently have global linkage and all of them are not weak symbols.

There are actually a lot of macros that change the behavior of standard headers. _GNU_SOURCE, _BSD_SOURCE, and _SVID_SOURCE are three popular ones.
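
As a rough illustration of a macro changing what a header means, here is a sketch that uses a hypothetical macro `WIDGET_WIDE_IDS` and an inline stand-in for a hypothetical `widget.h` rather than a real system header. The real feature-test macros named above work on the same principle, but their exact effects depend on the C library.

```cpp
// Hypothetical configuration macro, standing in for something like _GNU_SOURCE.
#define WIDGET_WIDE_IDS 1

// ---- contents of a hypothetical "widget.h", shown inline ----
#ifdef WIDGET_WIDE_IDS
typedef long long widget_id;   // seen by translation units that define the macro
#else
typedef short widget_id;       // seen by everyone else
#endif
// ---- end of "widget.h" ----

// The same header text yields a different type depending on the macro.
static_assert(sizeof(widget_id) == sizeof(long long),
              "widget_id follows the macro set before the include");
```

Two translation units that include the same header with different macro settings would see different definitions of `widget_id`, which is why the compiler cannot blindly reuse one unit's parse of a header for another.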

You seem to be confused about how include files work in C and C++. The way they work is that each translation unit (that's .cpp file to you) has to scan through all the files included by that unit, recursively. There are no shortcuts and the compiler cannot cache this work.

I said it was O(n^n) because n^n is the upper bound on the time complexity. Remember that you can include .c or .cpp files. In reality, most projects' compilation times will grow more slowly than this. However, it's still exponential in the number of files, and the compile times seen by real-world projects like WebKit reflect this.

> One can write well defined
> modules and interfaces in both languages. One can write poorly structured
> code in both languages.

I agree. A good programmer can write good code in any language. A bad one can write Vogon poetry in any language.

There are a lot of projects I like and respect that use C++: LLVM, OpenCV, Ceph, WebKit, and many others. C++ will be around for a long time. For new projects, however, I would encourage people to look at newer languages like Google Go. Progress hasn't stood still and we have learned some things since the early nineties. I swear!

LCA: Lessons from 30 years of Sendmail

Posted Feb 7, 2011 19:13 UTC (Mon) by nix (subscriber, #2304)

> You seem to be confused about how include files work in C and C++. The
> way they work is that each translation unit (that's .cpp file to you) has
> to scan through all the files included by that unit, recursively. There
> are no shortcuts and the compiler cannot cache this work.

Except that there are shortcuts and GCC does cache this work, and has for more than fifteen years. (e.g. you can skip even opening files more than once if they are entirely contained in include guards and the guards are not #undefed.)
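
For readers unfamiliar with the pattern, a minimal sketch of an include guard follows. The header name `point.h`, the `Point` struct, and `manhattan` are hypothetical, and the header text is pasted inline twice to mimic a double inclusion. The optimization described above goes one step further than what is shown: once the preprocessor has seen that a file is entirely wrapped in a guard, it need not reopen the file while the guard macro stays defined.

```cpp
// ---- first inclusion of a hypothetical "point.h" ----
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif  // POINT_H

// ---- second inclusion of the same text, e.g. via another header ----
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };   // skipped: POINT_H is already defined
#endif  // POINT_H

// Uses the single definition of Point that survived preprocessing.
int manhattan(const Point& p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
```

Without the guard, the second copy would be a redefinition of `Point` and the translation unit would fail to compile.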

LCA: Lessons from 30 years of Sendmail

Posted Feb 8, 2011 0:26 UTC (Tue) by cmccabe (guest, #60281)

Sigh. I knew I was going to get some grief when I said "there are no shortcuts." :)

It depends on what you call a shortcut, I guess. The header guard optimization is good, but the process as a whole is still O(n^n). Doing slightly more efficient things with file descriptors can't change that.

LCA: Lessons from 30 years of Sendmail

Posted Feb 8, 2011 18:27 UTC (Tue) by nix (subscriber, #2304)

Um, for 'slightly more efficient things with file descriptors' substitute 'almost always avoid parsing the vast majority of headers more than once'.

The exponential explosion you refer to simply does not happen with real code. And if header parsing is slow, GCC supports precompiled headers on common platforms to speed things up. (Yes, you may have to restructure your headers a bit to use them, but if you're compiling slowly enough that you need this feature, that's a small cost.)

LCA: Lessons from 30 years of Sendmail

Posted Feb 11, 2011 9:33 UTC (Fri) by cmccabe (guest, #60281)

See my comment below. Basically, the headers for the types of a class's private data members also need to be #included in that class's header file. So you *cannot* "avoid parsing the vast majority of headers more than once."
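
A small sketch of the dependency being described, using a hypothetical `Mailbox` class in a hypothetical `mailbox.h`: the private member is an implementation detail, yet its type must be complete wherever the class definition is seen.

```cpp
// ---- contents of a hypothetical "mailbox.h" ----
#include <cstddef>
#include <string>
#include <vector>   // needed only because of the private member below

class Mailbox {
public:
    void deliver(const std::string& msg) { messages_.push_back(msg); }
    std::size_t size() const { return messages_.size(); }

private:
    // A by-value member needs the complete type, so every file that
    // includes mailbox.h also pulls in <vector> and <string>.
    std::vector<std::string> messages_;
};
```

A member held only by pointer or reference to a forward-declared type would not need the full definition in the header, so how much of this cost a project pays depends on how its classes are laid out.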

Under ideal conditions, C++ compilation is slow. If you add even a few non-ideal conditions, like programmers who love to define functions in header files "for performance", extensive use of templates, auto-generated anything, or unnecessary cross-module dependencies, it becomes positively glacial.

Unfortunately real-world projects tend to have some or all of these conditions. I'm too lazy to find the reference now, but Google's C++ compile times are said to be measured in hours. And those guys read Effective C++ and know their stuff.

Precompiled headers sound helpful, but only for headers you are including from external libraries. Maybe they would be useful for something like Qt? I haven't used precompiled headers.

LCA: Lessons from 30 years of Sendmail

Posted Feb 19, 2011 0:24 UTC (Sat) by nix (subscriber, #2304)

Er, when I said 'more than once' I meant 'more than once per translation unit', and this is almost universal. This reduces your claimed O(n^n) to, uh, O(nm), where n is the number of translation units and m is the number of headers.
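
The O(nm) bound can be sketched as a small simulation, assuming every header is guarded so that it is parsed at most once per translation unit. The `IncludeGraph` type and both function names are hypothetical, and this models only the counting argument, not how any real compiler is implemented.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical include graph: file name -> files it #includes directly.
typedef std::map<std::string, std::vector<std::string> > IncludeGraph;

// Visit `file` and everything it includes, parsing each guarded header
// at most once for the current translation unit.
static void visit(const IncludeGraph& g, const std::string& file,
                  std::set<std::string>& seen) {
    if (!seen.insert(file).second) return;  // already parsed in this unit
    IncludeGraph::const_iterator it = g.find(file);
    if (it == g.end()) return;              // file includes nothing
    for (std::size_t i = 0; i < it->second.size(); ++i)
        visit(g, it->second[i], seen);
}

// Total header parses over all translation units; never more than n * m
// for n units and m headers.
std::size_t count_header_parses(const IncludeGraph& g,
                                const std::vector<std::string>& units) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < units.size(); ++i) {
        std::set<std::string> seen;
        visit(g, units[i], seen);
        total += seen.size() - 1;  // exclude the .cpp file itself
    }
    return total;
}
```

Even when headers include each other in a cycle, each unit's `seen` set caps the work at one parse per header, so the total cannot exceed the number of units times the number of headers.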

Precompiled headers are useful in any project where you have one great big header that #includes a lot of stuff. This is extremely common.
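
A sketch of the "one great big header" layout, with a hypothetical umbrella header `all.hpp` shown inline and a hypothetical function `total_length` standing in for project code. The GCC invocation in the comment is the usual way to precompile a header, but build flags must match between the precompile step and the later compiles, so treat it as an outline rather than a recipe.

```cpp
// ---- hypothetical umbrella header "all.hpp" ----
// With GCC, compiling the header on its own, for example
//     g++ -x c++-header all.hpp -o all.hpp.gch
// produces a precompiled form that later compiles of files including
// all.hpp can pick up instead of re-parsing the headers below.
#include <cstddef>
#include <map>
#include <string>
#include <vector>
// ---- end of "all.hpp" ----

// ---- a translation unit that would start with: #include "all.hpp" ----
std::size_t total_length(const std::vector<std::string>& words) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < words.size(); ++i) n += words[i].size();
    return n;
}
```

Every source file in the project would include `all.hpp` first, so the expensive parse of the standard headers happens once when the header is precompiled rather than once per translation unit.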

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds