Converting GCC to C++
From: Ian Lance Taylor <iant-AT-google.com>
To: gcc-AT-gcc.gnu.org
Subject: gcc-in-cxx branch created
Date: Tue, 17 Jun 2008 23:01:35 -0700
Message-ID: <m363s746yo.fsf@google.com>
Cc: gcc-patches-AT-gcc.gnu.org
As I promised at the summit today, I have created the branch gcc-in-cxx (I originally said gcc-in-c++, but I decided that it was better to avoid possible meta-characters). The goal of this branch is to develop a version of gcc which is compiled with C++.

Here are my presentation slides in PDF format: http://airs.com/ian/cxx-slides.pdf

I have not yet committed any patches to the branch--at present it is just a copy of the trunk. I will start committing patches soon, and anybody else may submit patches as well. The branch will follow the usual gcc maintainership rules, except that any non-algorithmic maintainer may additionally approve or commit patches which permit compilation with C++.

I have committed the appended patch to htdocs/svn.html.

Ian

Index: svn.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.86
diff -u -r1.86 svn.html
--- svn.html	12 Jun 2008 14:22:37 -0000	1.86
+++ svn.html	18 Jun 2008 06:00:13 -0000
@@ -410,6 +410,14 @@
   <code>[function-specific]</code> in the subject line.  The branch is
   maintained by Michael Meissner.</dd>
 
+  <dt>gcc-in-cxx</dt>
+  <dd>This branch is for converting gcc to be written in C++.  Patches
+  should be marked with the tag <code>[gcc-in-cxx]</code> in the
+  subject line.  This branch operates under the general gcc
+  maintainership rules, except that any non-algorithmic maintainer is
+  additionally permitted to approve changes which permit compilation
+  with C++.  The branch is maintained by Ian Lance Taylor.</dd>
+
 </dl>
 
 <h4>Architecture-specific</h4>
Posted Jun 18, 2008 14:55 UTC (Wed)
by bboissin (subscriber, #29506)
[Link]
Posted Jun 18, 2008 15:23 UTC (Wed)
by seanyoung (subscriber, #28711)
[Link] (43 responses)
Posted Jun 18, 2008 15:31 UTC (Wed)
by johnkarp (guest, #39285)
[Link] (12 responses)
Posted Jun 18, 2008 20:54 UTC (Wed)
by khim (subscriber, #9252)
[Link] (2 responses)
Remember - for a long, long time the C++ compiler was a C++-to-C compiler. Only one thing cannot be efficiently implemented in a C++-to-C compiler: exceptions. Everything else is just as efficient in C as in C++... except templates: where a C++ compiler can generate a series of functions automatically, in C you need horrible preprocessor kludges and/or code duplication. Of course the same capability can easily be abused and lead to slower code!
Posted Jun 19, 2008 9:48 UTC (Thu)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted Jun 19, 2008 14:04 UTC (Thu)
by khim (subscriber, #9252)
[Link]
Cfront defined C++ for almost 10 years. Between 1983 and 1993 Cfront was the C++ compiler, and a lot of the limitations of C++ back then were justified by the need to compile to C. It was the reference implementation; it was used to define "standard C++" (people even cited versions of Cfront to explain what kind of language their compiler supported), etc. Eventually Cfront 4.0 was abandoned, and after a long period of instability (~5 years) we got a consolidated standard and the language was set in stone. But yes, for a long time the needs of a C++-to-C compiler defined the language.
Posted Jun 19, 2008 9:34 UTC (Thu)
by cate (subscriber, #1359)
[Link] (8 responses)
Posted Jun 19, 2008 10:01 UTC (Thu)
by nix (subscriber, #2304)
[Link] (3 responses)
Posted Jun 19, 2008 11:53 UTC (Thu)
by cate (subscriber, #1359)
[Link] (2 responses)
Posted Jun 19, 2008 18:06 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Posted Jun 19, 2008 18:39 UTC (Thu)
by ncm (guest, #165)
[Link]
Posted Jun 19, 2008 15:20 UTC (Thu)
by endecotp (guest, #36428)
[Link] (1 responses)
Posted Jun 26, 2008 12:45 UTC (Thu)
by tbrownaw (guest, #45457)
[Link]
Posted Jun 19, 2008 21:03 UTC (Thu)
by pynm0001 (guest, #18379)
[Link]
Posted Jun 19, 2008 21:10 UTC (Thu)
by pynm0001 (guest, #18379)
[Link]
Posted Jun 18, 2008 15:38 UTC (Wed)
by tetromino (subscriber, #33846)
[Link] (1 responses)
Posted Jun 18, 2008 16:55 UTC (Wed)
by jengelh (subscriber, #33263)
[Link]
Posted Jun 18, 2008 15:59 UTC (Wed)
by hans (guest, #148)
[Link] (27 responses)
Posted Jun 18, 2008 16:06 UTC (Wed)
by endecotp (guest, #36428)
[Link]
Posted Jun 18, 2008 17:00 UTC (Wed)
by rompel (subscriber, #4512)
[Link] (25 responses)
Posted Jun 18, 2008 17:06 UTC (Wed)
by epa (subscriber, #39769)
[Link] (1 responses)
Posted Jun 19, 2008 7:16 UTC (Thu)
by eric.rannaud (guest, #44292)
[Link]
Posted Jun 18, 2008 17:36 UTC (Wed)
by pphaneuf (guest, #23480)
[Link] (14 responses)
One problem with this (among others) is that it becomes very hard to debug. For example, any issue inside the macro gets reported on the line number that invoked it.
Posted Jun 18, 2008 18:11 UTC (Wed)
by gnb (subscriber, #5132)
[Link] (12 responses)
Posted Jun 18, 2008 21:04 UTC (Wed)
by pphaneuf (guest, #23480)
[Link]
Right, but I'll take too much information over too little any given day.
It's kind of funny, but there are filters you can use to make those more readable. I think they're integrating some of these changes in GCC? Or maybe the diagnostic messages mandated in C++0x are going to be better, I don't remember...
Posted Jun 18, 2008 21:21 UTC (Wed)
by ncm (guest, #165)
[Link] (10 responses)
Posted Jun 18, 2008 22:03 UTC (Wed)
by ajross (guest, #4563)
[Link] (9 responses)
Posted Jun 18, 2008 22:28 UTC (Wed)
by pphaneuf (guest, #23480)
[Link] (6 responses)
This is because template parameters are untyped (they are types themselves!), and concepts are basically typing for types. C preprocessor macros have exactly the same problem. Consider this:
You can see that there is no typing for X there, and the same applies to template parameters (except they actually do have a little bit of typing, but not much).
Also, you might note that in C, this compiles with a warning, then crashes at runtime. In C++, it's a compile-time error, because, well, it is one. That's an example of what I was saying earlier: program in C, but call your file foo.cc instead of foo.c, and it will be better. You won't get any of those horrible template diagnostic messages (you aren't using templates!), and you'll in fact get better warnings and errors.
Posted Jun 19, 2008 0:37 UTC (Thu)
by flewellyn (subscriber, #5047)
[Link] (5 responses)
Posted Jun 19, 2008 3:02 UTC (Thu)
by pynm0001 (guest, #18379)
[Link] (3 responses)
Posted Jun 19, 2008 15:09 UTC (Thu)
by hummassa (guest, #307)
[Link] (2 responses)
Posted Jun 19, 2008 18:08 UTC (Thu)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted Aug 24, 2008 11:08 UTC (Sun)
by hummassa (guest, #307)
[Link]
Posted Jun 19, 2008 3:58 UTC (Thu)
by elanthis (guest, #6227)
[Link]
Posted Jun 18, 2008 22:47 UTC (Wed)
by ncm (guest, #165)
[Link] (1 responses)
Posted Aug 7, 2009 19:50 UTC (Fri)
by hummassa (guest, #307)
[Link]
Sure it can. The compiler has access to the source of the template; it will have to generate code based on that source eventually, so it knows what it needs.
Even if the compiler does not "plan ahead", during the instantiation phase it can deduce WHY the instantiation went wrong, instead of spilling out WHAT went wrong with it... and point to the right lines of code, and give meaningful class names instead of "complete" class names (IOW, std::string rather than its full expansion)...
Posted Jun 19, 2008 7:19 UTC (Thu)
by eric.rannaud (guest, #44292)
[Link]
Posted Jun 18, 2008 19:06 UTC (Wed)
by ncm (guest, #165)
[Link] (7 responses)
Posted Jun 18, 2008 20:00 UTC (Wed)
by flewellyn (subscriber, #5047)
[Link] (6 responses)
Try doing that with C macros, or even Lisp macros. You wouldn't use macros for that in (Common) Lisp, that's not what they're for. You'd use CLOS generic functions, which would dispatch on the argument types. Then you get a nice, natural, functional interface with multiple dispatch as a free bonus.
Posted Jun 18, 2008 21:00 UTC (Wed)
by ncm (guest, #165)
[Link] (3 responses)
Posted Jun 18, 2008 21:29 UTC (Wed)
by flewellyn (subscriber, #5047)
[Link] (2 responses)
Posted Jun 18, 2008 21:51 UTC (Wed)
by ncm (guest, #165)
[Link] (1 responses)
Posted Jun 18, 2008 21:58 UTC (Wed)
by flewellyn (subscriber, #5047)
[Link]
Posted Jun 18, 2008 21:10 UTC (Wed)
by ncm (guest, #165)
[Link] (1 responses)
Posted Jun 18, 2008 21:31 UTC (Wed)
by flewellyn (subscriber, #5047)
[Link]
Posted Jun 18, 2008 15:35 UTC (Wed)
by BrucePerens (guest, #2510)
[Link] (1 responses)
Posted Jun 19, 2008 0:28 UTC (Thu)
by qg6te2 (guest, #52587)
[Link]
Posted Jun 18, 2008 16:18 UTC (Wed)
by pphaneuf (guest, #23480)
[Link]
This is good news, I think.
I have very often found bugs in C code by renaming the source file to .cc and using the C++ compiler as a lint tool. Even if you don't change a single line of code (well, other than conflicts with the very few new keywords, and other minor incompatibilities), you generally get a better program with C++, because it lets a few fewer "quick ones" slip by (like casting pointers around for you, or enums really being just ints).
If you also use the standard library to avoid rewriting the 324785124th implementation of a linked list (hmm, I wonder which of the two is more likely to be buggy?), so much the better.
Posted Jun 18, 2008 16:18 UTC (Wed)
by nix (subscriber, #2304)
[Link] (2 responses)
Posted Jun 18, 2008 17:22 UTC (Wed)
by pphaneuf (guest, #23480)
[Link] (1 responses)
This shouldn't be too much of a problem, since TR1 does not affect the compiler, but only the standard library. In addition, many of those are implemented entirely in header files, so it would be very easy to just package them with GCC as needed.
Posted Jun 18, 2008 17:26 UTC (Wed)
by nix (subscriber, #2304)
[Link]
Posted Jun 18, 2008 17:20 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (5 responses)
Posted Jun 18, 2008 21:56 UTC (Wed)
by linuxrocks123 (subscriber, #34648)
[Link] (2 responses)
Posted Jun 18, 2008 22:11 UTC (Wed)
by madscientist (subscriber, #16861)
[Link]
Posted Jun 18, 2008 22:19 UTC (Wed)
by stevenb (guest, #11536)
[Link]
Posted Jun 19, 2008 0:38 UTC (Thu)
by gdt (subscriber, #6284)
[Link] (1 responses)
I don't have a problem with this as a concept, but IMO Ian doesn't fully confront the most difficult issue: bootstrapping. If it is such a difficult problem, then I suppose a side-effect of the proposal will be to change the strategy for bringing up GCC on a new platform from bootstrapping to cross-compiling.
Posted Jun 19, 2008 1:32 UTC (Thu)
by vomlehn (guest, #45588)
[Link]
Posted Jun 18, 2008 17:59 UTC (Wed)
by foo (guest, #1117)
[Link]
Posted Jun 18, 2008 18:24 UTC (Wed)
by ncm (guest, #165)
[Link] (2 responses)
In my 22 years' experience, the language's worst problems are in its C compatibility subset. Thus, whatever "worst language problems" the conversion might encounter are already manifest in the Gcc codebase. Converting to C++ offers a route to avoid them. That point was central to his presentation; 8 of the 17 slides illustrate it. A scrupulous summary would note that instead.
Posted Jun 18, 2008 18:40 UTC (Wed)
by corbet (editor, #1)
[Link] (1 responses)
I was thinking of slide 13, which notes "we would only use features which are worthwhile," and 14, adding "Maintainers will ensure that gcc continues to be maintainable." These are answers to charges that C++ is too slow and too complex. If I've overstated what Ian meant, I apologize.
Posted Jun 18, 2008 20:51 UTC (Wed)
by ncm (guest, #165)
[Link]
Posted Jun 18, 2008 18:45 UTC (Wed)
by pynm0001 (guest, #18379)
[Link] (11 responses)
Posted Jun 18, 2008 18:55 UTC (Wed)
by dwheeler (guest, #1216)
[Link] (2 responses)
It's relatively easy to develop a C compiler that generates running code (may not be efficient, but it runs). It's harder to create a C++ compiler. Thus, there are more C compilers, which can act as a check on the gcc C compiler.
Posted Jun 18, 2008 19:12 UTC (Wed)
by pynm0001 (guest, #18379)
[Link] (1 responses)
Um, fair enough, but this is like using autoconf so that your program can build on 10-year-old AIX machines... it's optimizing for a problem that only precious few people care about, and the other 99% of people who could benefit would instead have to suffer. Which is why we have the explosion in new build systems... :-/

Those who are really worried that Ubuntu has corrupted their g++ binaries can use pcc to compile an older version of gcc, I suppose.

But you leave out one thing. Can an ANSI C compiler build gcc? I'm pretty sure that gcc requires gcc extensions to C to build at this point anyway, so you already need to trust gcc if you use it as your compiler. In addition, if you look at Ian's slides on how things could look, I would claim that a C++ implementation would at least be easier to perform code review on, and even static analysis.

Actually, you could simply build a current g++, place it on read-only media, and use it to build the new C++-based gcc. If it's different from the installed version, then perhaps malware code has been slipped into the compiler as described in Thompson's article. But I don't see how simply having the compiler in C helps in this case. You still need a "safe" version of gcc, and that already compiles C++.
Posted Jun 18, 2008 19:19 UTC (Wed)
by willy (subscriber, #9762)
[Link]
Posted Jun 18, 2008 19:10 UTC (Wed)
by ncm (guest, #165)
[Link] (4 responses)
Posted Jun 18, 2008 19:46 UTC (Wed)
by pynm0001 (guest, #18379)
[Link]
Sounds good then. I just got through reading the slides, and I think there could be great gain by doing nothing more than converting the already-existing object-oriented code (like TARGETS) to appropriate C++ and using the standard C++ containers instead of the various ad-hoc routines that look to be scattered in the code.
Posted Jun 18, 2008 20:55 UTC (Wed)
by jordanb (guest, #45668)
[Link] (2 responses)
Posted Jun 18, 2008 21:16 UTC (Wed)
by ncm (guest, #165)
[Link]
Posted Jun 19, 2008 12:03 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Posted Jun 18, 2008 22:36 UTC (Wed)
by pphaneuf (guest, #23480)
[Link] (1 responses)
The greatest thing is that C code is, almost all of the time, also C++. You don't necessarily have to use classes or templates. In my experience, just having stronger type checking is already a big win over plain C. Then you can do a few easy things like using a vector<Foo> instead of a manually managed array of Foo, and so on, making for that many fewer bugs.
Posted Jun 19, 2008 0:27 UTC (Thu)
by pphaneuf (guest, #23480)
[Link]
If he gets numbers (including source code size) half as nice as what he got with gold, it should be a shoo-in. ;-)
Posted Jun 19, 2008 15:06 UTC (Thu)
by renox (guest, #23785)
[Link]
Posted Jun 18, 2008 20:51 UTC (Wed)
by clugstj (subscriber, #4020)
[Link] (39 responses)
Posted Jun 18, 2008 21:11 UTC (Wed)
by foo (guest, #1117)
[Link] (2 responses)
Posted Jun 19, 2008 5:59 UTC (Thu)
by joib (subscriber, #8541)
[Link] (1 responses)
Posted Jun 19, 2008 7:00 UTC (Thu)
by jengelh (subscriber, #33263)
[Link]
Posted Jun 18, 2008 21:40 UTC (Wed)
by pr1268 (subscriber, #24648)
[Link] (35 responses)
Fat chance. Linus has publicly stated (on the LKML, etc.) his disdain for C++. Not that I'm anti-C++; to the contrary, I'm a die-hard C++ fanboy. But I'm a bigger fan of the philosophy of using the right tool for the right job. While I personally write lots of user-space software in C++, I do think that C is better for certain applications. Including OS kernels. And besides, to defend Linus' attitude, it's also about using the tool you're most comfortable with. As an aside, I like to facetiously describe the C language as "assembly language after a makeover," which has some truth to it (considering Thompson's and Ritchie's motivations for defining the language).
Posted Jun 18, 2008 22:05 UTC (Wed)
by edschofield (guest, #39993)
[Link] (13 responses)
Posted Jun 18, 2008 22:45 UTC (Wed)
by aleXXX (subscriber, #2742)
[Link]
Posted Jun 18, 2008 23:02 UTC (Wed)
by pr1268 (subscriber, #24648)
[Link]
Yup, that's EXACTLY the message I was referring to when I mentioned Linus' well-known attitude towards C++. I was just too lazy to look it up on the LKML archives (and thank you for doing so, BTW!)
Posted Jun 18, 2008 23:22 UTC (Wed)
by pphaneuf (guest, #23480)
[Link]
Most of his claims used to be correct.
There is a certain "mind set" factor to consider, and while I would personally prefer to hack on C++ than C, I understand him wanting to keep a certain type of people away, as sad as this is...
Posted Jun 19, 2008 0:16 UTC (Thu)
by MisterIO (guest, #36192)
[Link] (9 responses)
Posted Jun 19, 2008 1:45 UTC (Thu)
by vomlehn (guest, #45588)
[Link] (7 responses)
Posted Jun 19, 2008 7:18 UTC (Thu)
by MisterIO (guest, #36192)
[Link] (6 responses)
Posted Jun 19, 2008 9:27 UTC (Thu)
by pr1268 (subscriber, #24648)
[Link] (5 responses)
Beauty is in the eye of the beholder. As is ugliness. I guarantee you that C is just as ugly (or pretty) as C++. Define ugly with respect to programming languages. Can anyone show me a PRETTY language? I'm talking about showing some random person on the street a printout of source code and having him/her comment on how aesthetically pleasing it looks (or how it reads). And those programs on the IOCCC whose code is laid out in some image pattern don't count since they'd be mistaken for ASCII art.
Posted Jun 19, 2008 12:07 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Posted Jun 19, 2008 15:46 UTC (Thu)
by MisterIO (guest, #36192)
[Link] (3 responses)
Posted Jun 19, 2008 17:25 UTC (Thu)
by aleXXX (subscriber, #2742)
[Link]
Posted Jun 19, 2008 17:26 UTC (Thu)
by pr1268 (subscriber, #24648)
[Link] (1 responses)
My questions bordered on rhetorical (plus I poured on the sarcasm pretty thick, especially with the "random person on the street" bit). It's okay if you simply prefer C over C++, just as I prefer red over blue. But I still don't see that much difference with the two languages syntactically, although there are vast technical and functional differences. Generally speaking, when I hear/read someone say they prefer one programming language over another, and the two are as similar (to me) as C and C++, then my graduate CS education and critical thinking skills kick into gear, and I often start asking others, "why?". :-)
Posted Jun 19, 2008 20:18 UTC (Thu)
by pphaneuf (guest, #23480)
[Link]
Even though it's not strictly true, for the most part C++ is a superset of C, so there isn't much to gain from going from C++ to C. Whereas there are a number of options that appear if you go the other way, which you obviously don't have to use in every single program (C++ is quite multi-faceted; if you have some time at hand, I would recommend this talk on C++ stylistics, it's quite interesting). But as it turns out, we often want to use things like objects and such (look at the kernel, or GObject, for example).
Posted Jun 19, 2008 21:50 UTC (Thu)
by mattmelton (guest, #34842)
[Link]
Posted Jun 18, 2008 22:45 UTC (Wed)
by pphaneuf (guest, #23480)
[Link] (19 responses)
I actually think C++ would be quite suited to kernel work, but just like when using C, great restraint would have to be shown. As in C, a lot of the standard library wouldn't be available, for example. You'd have to be extremely careful with template instantiations causing bloat. You'd probably have to turn off both RTTI and exceptions. Every use of virtual methods would have to be scrutinized.
But it's more or less similar things that you do when you write a kernel in C. Well, okay, there's no RTTI or exceptions to turn off, but you get the idea...
In fact, if I'm not mistaken, Darwin has at least part of its kernel (the IOKit) written in C++. I've heard good things about the ease of developing drivers for it...
Posted Jun 18, 2008 22:52 UTC (Wed)
by aleXXX (subscriber, #2742)
[Link] (12 responses)
Posted Jun 18, 2008 23:17 UTC (Wed)
by pphaneuf (guest, #23480)
[Link] (2 responses)
The Linux kernel uses a close equivalent of virtual methods in a lot of places, in fact. I'm just saying you shouldn't just pepper them everywhere, you should think a bit about it when designing, which is just a plain good idea, IMHO, especially when working on kernel code!
The way the "virtual methods" are done in the Linux kernel has an interesting characteristic that is not straightforward to simulate with C++: each instance has its own virtual method table, instead of all the instances of a given class sharing a single one. Which particular version of a method is active is part of the state of the object, and a clever implementation could avoid further branching inside the method by adjusting its dispatch table when its state changes. For example, instead of checking for an error at the beginning of a method that cannot be called in an error state, you can skip the check altogether, and when an error occurs, change the pointer to another function that only returns the error.
Thinking about this, this is probably not SMP safe. Oh well. :-)
I presume there are also some other effects on efficiency from doing it this way (memory footprint is larger, but locality is improved, etc), but I cannot say for sure whether one is better than the other, it would need rigorous quantification.
Posted Jun 19, 2008 6:59 UTC (Thu)
by ikm (guest, #493)
[Link] (1 responses)
Posted Jun 19, 2008 14:41 UTC (Thu)
by pphaneuf (guest, #23480)
[Link]
My bad, I've only done a little bit of kernel programming, and somehow remembered the various ops structure as being stored by value, rather than being pointed at.
In that case, it is very much like C++: an object is an instance of struct file, and f_op is the same thing as the C++ vptr.
The way C++ does inheritance is a little bit more efficient than the way the Linux kernel does it when additional data members are needed, effectively appending them at the end of the struct file, where the Linux kernel seems to use a separate allocation pointed at by private_data.
Again, this is not something that C++ would magically do better; it could be done in C as well (in fact, the GObject system does it), but I think it saves typing and boilerplate code the way it is, at the expense of a tiny bit of efficiency. There might also be a preference in the kernel for two small allocations rather than one large one, to avoid higher-order allocations, which could require more than one physical page and become difficult to fulfill? I do not know if this is correct.
Posted Jun 18, 2008 23:19 UTC (Wed)
by pphaneuf (guest, #23480)
[Link] (8 responses)
Oh, I was forgetting...
According to this page, IOKit is in a restricted subset of C++, based on Embedded C++.
Posted Jun 18, 2008 23:39 UTC (Wed)
by aleXXX (subscriber, #2742)
[Link] (5 responses)
Posted Jun 19, 2008 0:24 UTC (Thu)
by pphaneuf (guest, #23480)
[Link] (4 responses)
Posted Jun 19, 2008 1:13 UTC (Thu)
by ncm (guest, #165)
[Link] (3 responses)
Posted Jun 19, 2008 7:05 UTC (Thu)
by ikm (guest, #493)
[Link] (2 responses)
Posted Jun 19, 2008 14:56 UTC (Thu)
by pphaneuf (guest, #23480)
[Link] (1 responses)
Memory for the exception is allocated (in GCC) with __cxa_allocate_exception, which in libstdc++, seems like it tries the regular allocator first, and if that fails, uses an area of memory set aside for emergencies. If that runs out, the program aborts.
The implementation is in the runtime library, though, not in the code emitted by the compiler, so you could certainly have something appropriate for your platform.
Posted Jun 19, 2008 15:19 UTC (Thu)
by ikm (guest, #493)
[Link]
Posted Jun 18, 2008 23:43 UTC (Wed)
by ncm (guest, #165)
[Link] (1 responses)
Posted Jun 19, 2008 0:16 UTC (Thu)
by pphaneuf (guest, #23480)
[Link]
Posted Jun 19, 2008 7:02 UTC (Thu)
by epa (subscriber, #39769)
[Link] (5 responses)
Posted Jun 19, 2008 9:41 UTC (Thu)
by cate (subscriber, #1359)
[Link] (4 responses)
Posted Jun 19, 2008 12:13 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Posted Jun 19, 2008 15:26 UTC (Thu)
by renox (guest, #23785)
[Link] (2 responses)
Posted Jun 20, 2008 8:44 UTC (Fri)
by epa (subscriber, #39769)
[Link] (1 responses)
Posted Aug 14, 2008 23:47 UTC (Thu)
by cortana (subscriber, #24596)
[Link]
Posted Jun 19, 2008 22:28 UTC (Thu)
by brianomahoney (guest, #6206)
[Link] (3 responses)
Posted Jun 20, 2008 2:51 UTC (Fri)
by pynm0001 (guest, #18379)
[Link] (2 responses)
Posted Jun 20, 2008 8:14 UTC (Fri)
by brianomahoney (guest, #6206)
[Link] (1 responses)
Posted Jun 24, 2008 20:06 UTC (Tue)
by ncm (guest, #165)
[Link]
Posted Jun 20, 2008 18:13 UTC (Fri)
by jfj (guest, #37917)
[Link] (1 responses)
Posted Jun 24, 2008 20:27 UTC (Tue)
by ncm (guest, #165)
[Link]
Posted Jun 22, 2008 14:14 UTC (Sun)
by rwmj (subscriber, #5474)
[Link] (9 responses)
I've just read 100+ comments, and I think only one has suggested using a language which is actually suited to the task at hand. If you're writing a compiler, use a programming language designed for writing compilers - for example, ML or its derivatives.

Rich.
Posted Jun 22, 2008 22:48 UTC (Sun)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted Jun 25, 2008 8:53 UTC (Wed)
by rwmj (subscriber, #5474)
[Link]
Yeah, but GCC-in-ocaml would be a big job, basically a complete rewrite.
Sure, I absolutely wasn't suggesting this, just moaning that people aren't talking about good languages (in general).

What might work, though, would be to implement some sort of macro system in C. NOT C++ templates (which are a completely on-crack "macro" / "metaprogramming" system), but something designed along the lines of defmacro or camlp4.

With such a macro system, small parts of gcc could be rewritten for brevity and code consistency. Basically, whenever you have a common pattern of repeated code, you try to abstract it into a useful general-purpose macro (if appropriate).

Rich.
Posted Jun 24, 2008 20:16 UTC (Tue)
by ncm (guest, #165)
[Link] (6 responses)
Posted Jun 25, 2008 8:48 UTC (Wed)
by rwmj (subscriber, #5474)
[Link] (5 responses)
I have actually hacked on gcc, just in case you thought I was some sort of noob who knows nothing about its internals.

C++ templates are insane. Please learn something about ML before making such silly statements.

Rich.
Posted Jun 25, 2008 23:15 UTC (Wed)
by nix (subscriber, #2304)
[Link]
Posted Jun 26, 2008 7:42 UTC (Thu)
by ncm (guest, #165)
[Link] (3 responses)
Posted Jun 26, 2008 8:29 UTC (Thu)
by rwmj (subscriber, #5474)
[Link] (2 responses)
FC++ is a great example of why just because something can be done, it shouldn't be done. Here is their definition of 'map' (not including the huge amount of supporting boilerplate you need to make things like lists and lambdas work):

Compare that to the (non-tailrec) implementation in OCaml:

To be honest, I couldn't get g++ to compile the examples from that website. I got plenty of delicious C++ error messages though, eg:

So yes, insane.

If this is the current state of "industrial strength" programming, I'll stick with my toy languages, thanks.

Rich.
Posted Jun 26, 2008 23:40 UTC (Thu)
by ncm (guest, #165)
[Link] (1 responses)
Posted Jun 27, 2008 7:23 UTC (Fri)
by rwmj (subscriber, #5474)
[Link]
OK, now try writing a syntax extension to C++ to hide the useless extra complexity.

I wrote bitmatch for OCaml (examples at that site). The code inside bitmatch (q.v.) is indeed as grotty as the C++ code. However, the user of bitmatch never sees this. They just get to use an elegant, simple matching syntax, and if they make an error, they get short, descriptive errors.

You see, there's a right way and a wrong way to implement macros, and C++ definitely implements them really badly.

I really don't care if there's an ISO standard for a language. Provided there's a single free (as in speech) implementation, then I'm guaranteed all the same assurances and freedoms as an ISO standard (and arguably more).

Rich.
Converting GCC to C++
They could just fork LLVM then ;)
Sometimes C++ is faster
I don't want to be dogmatic about this, but I do admit a great affinity for plain C. Things like templates might very well be a good thing.

However, whenever I read statements like "Sometimes C++ is faster" I just don't get it. What can C++ do which produces faster code? What am I missing?
Sometimes C++ is faster
IIRC, if you need inheritance and virtual functions, it's much faster to use C++'s builtins rather than to kludge your own in C. Also, the C standard library string functions can be pretty inefficient.
Virtual functions have the same speed
Virtual functions have the same speed
For a long, long time? I think the gap was something like three years between Cfront and G++.
It might have felt like a long time then, but it really wasn't. :)
10 years is a long time in computing
Sometimes C++ is faster
Virtual functions are a lot slower than the direct C equivalent. Because of class extensibility and multiple inheritance, the virtual table is not stored together with the data, but accessed through an additional pointer. So to call a virtual function, the CPU must follow two pointers, which is cache- and prefetch-inefficient.

Ideally, a C++ compiler optimizing at program level (not unit level) should know whether a class is extended, and should allocate some part of the virtual table together with the data (maybe there is also a pragma directive for this). In this trade-off, C++ chooses the slower but more extensible method.

Anyway, I think he said faster because the STL is optimized (probably ugly code, but well tested and anyway not mixed with program code), which is not the case for normal C++ programming. But we are talking about compiler developers, who should know C++ and ugly optimizations very well, so for GCC I don't think the STL will give significant improvements.

C++ also has a linker problem: dynamic linking is a lot slower (because of the number of functions, but also because of the long mangled names, which differ only at the end).

IMHO the readability and extensibility of the code matter more than fast code (except for Gentoo users ;-)).
Sometimes C++ is faster
My understanding is that what you say is true in modern compilers only for virtual functions defined in virtual base classes. Non-virtual-base-class virtual function lookup requires only a single pointer dereference, from a fixed offset in the VMT, as does lookup of a virtual function defined in a class that has virtual bases, but not itself defined in that virtual base class or one of its ancestors.

Virtual base classes are rare. This is one reason why. (Another reason is that, well, the designs that require virtual bases are generally either involuted or icky.)
Sometimes C++ is faster
I'm not so sure that I understand your comment. My concern is about where compilers put the VMT. If it is split from the data (as in my (old) understanding, and as I see on Wikipedia), every object requires a pointer to the VMT, so there are two dereferences.

One solution would be to put some part of it (i.e. the first 10 virtual function pointers) into the object data, but because that costs more memory (think event-driven programming), I don't think it is the default.

Still, I like the C++ idea, if it brings more development, especially on optimization at the program level (and not only at the unit level).
Sometimes C++ is faster
Oh, yeah, you have to jump to the VMT first, so two dereferences it is. Sometimes there are *more* than two, though (and sometimes there are none, as when the compiler can determine the static type of the instance: this happens more often than you might expect).
Usually C++ is faster
Lots of falsehoods, misconceptions, and old myths to fix here.

First, the STL containers and algorithms don't use virtual functions at all. Virtual functions aren't used much in modern C++ code, particularly in places where it matters how fast they are or aren't. This isn't so much because people worry much about how fast they are, but just because they are not often especially useful in low-level code where speed matters.

Second, whatever overhead is associated with a virtual function call has very little to do with how many memory accesses occur, and everything to do with pipeline stalls. If the speed of a virtual function call matters, it's in a loop, and all the relevant memory is in cache. You might lose a cycle or two loading a memory word, then, but you lose a dozen or two branching through a pointer because the execution pipeline can't look ahead past that branch. If you used a function pointer in C, you would suffer precisely the same stall. (There are further complications I won't get into here.)

So, as rules of thumb: (1) if it matters how fast a virtual function call is, compared to a regular function call, you're probably doing it wrong; and (2) a virtual function call is architecturally equivalent to calling through a C function pointer. How much do you use function pointers in C? That's about how much you should use virtual functions in C++. (If you're more used to Java... may heaven grant mercy on your soul.)

In practice, the time spent doing virtual function calls has no measurable effect on the speed of a well-conceived C++ program.
Sometimes C++ is faster
What you say is true, but it's mitigated in a few ways:
- None of this matters if the actual class of the object is known. It only matters if you're
accessing a derived class via a base class pointer.
- If you're accessing several virtual members, the dereference to get the virtual function
table only needs to be done once.
- Virtual methods are rare in C++ (compared to e.g. Java). I don't believe there are any at
all in the standard library, for example.
- Using less memory per object will improve your cache hit rate when you have more than a few
objects.
Here's a C++ benchmark. "----" indicates a new file, to prevent too much being optimised
away (assume a shared header supplies each file with the declarations it needs):
struct base {
virtual void foo() = 0;
};
struct derived: public base {
void foo() {
f1();
}
};
----
int main()
{
for (int i=0; i<100000000; ++i) {
base* p = f2();
p->foo();
}
}
----
void f1() {
}
base* f2() {
static derived d;
return &d;
}
And here's a C version (well, actually it's still C++, but it's using a function pointer in
the struct rather than a virtual function; the same shared-header assumption applies):
struct base {
typedef void(*foo_t)();
const foo_t foo;
base(foo_t f_): foo(f_) {}
};
----
int main() // same as C++ version
{
for (int i=0; i<100000000; ++i) {
base* p = f2();
p->foo();
}
}
----
void f1() {
}
struct derived: public base {
derived(): base(&f1) {}
};
base* f2() {
static derived d;
return &d;
}
Comparing these programs (x86, gcc 4.3.1, -O3) I find that the "C" version is about 20%
faster, or around 6 nanoseconds per call, on this 1 GHz VIA C3 machine. That fraction would
clearly drop if you actually did something inside the virtual function.
Sometimes C++ is faster
> - Virtual methods are rare in C++ (compared to e.g. Java). I don't believe there are any at
> all in the standard library, for example.
std::streambuf::xs{get,put}n and several other protected streambuf functions. Not something
you'll usually work with directly, but used by all the i/o streams.
Sometimes C++ is faster
> virtual function are a lot slower than the direct C equivalent.
How would you do in C then? Because however it is, it can be done that way in C++.
More seriously though, you are saying that C++ does something like:
this->vtbl->virtual_call(params); right?
I would agree, but how would you do this any faster in C? Recognize that you may not cast the
object pointer to a derived class, the compiler can't (necessarily) know that the conversion
is valid, and you could cast it in C++ if the programmer happens to know more than the
compiler.
i.e. if you were to do:
derived_virtual_impl(klass, params) in C, you could do:
obj->Derived::impl(params) in C++ and call the appropriate implementation directly.
Is there some neat-o technique I'm missing?
Sometimes C++ is faster
Sorry about replying twice but I missed this one too:
> C++ has also linker problem: dynamic linking is a lot slower (number of
> function, but also because long names, which differentiate only at the
> end)
This is annoyingly true. However, this has pretty much been rectified.
With recent binutils (I think 2.17.50 or later?) a new hash table format is used for ELF
dynamic binaries which significantly reduces the number of string comparisons that must be
done in order to resolve a symbol. Of course long symbol names still take a while to
resolve, so it is important to reduce the number of symbols used to the minimum necessary.
If you have recent binutils you may already be able to see the fruits of this: use readelf -d
to check whether a binary or a .so has a GNU_HASH section.
g++ since 4.0 has supported visibility for symbols, including symbols defined as part of C++
class and template generation, which takes care of that problem. The problem is that C++
libraries must specifically support it, but support for that is growing.
prelinking provides benefits on top of that. There are patches from Michael Meeks floating
around to allow a flag called -Bsymbolic which apparently also helps, but I don't really know
what it does beyond that, and it does not look like it will make it into binutils anyways.
But there's lots of work being done to rectify these problems, so it looks like we're
*almost* getting to the point where we can have our cake and eat it too. :)
Sometimes C++ is faster
For many functions that are both in glibc and g++'s STL, the STL version is faster.
Sometimes C++ is faster
This was not true for GNU's libstdc++ 3.3. Thankfully they fixed it.
Sometimes C++ is faster
I agree with the other responses, and I'll add a concrete example: qsort. The qsort function
that is in the standard C library takes a void pointer and a callback as a parameter so that
the same algorithm can be used for multiple types. The void pointer can point to an array of
chars, ints, etc. and the callback is used to compare two members of the array. But that
callback adds overhead, which often leads developers to hand-code a quicksort function for
each type as an optimization.
In the C++ Standard Template Library, there is also a sort function, but since it uses
templates, the algorithm can be used regardless of the underlying type. So the C++ compiler
can optimize the template-based sort to be as efficient as the hand-coded function. Or that's
the theory, anyway.
Sometimes C++ is faster
Try this:
http://theory.stanford.edu/~amitp/rants/c++-vs-c/
for numbers supporting this analysis.
Of course you could implement qsort as a C macro and get the same advantage. See here for example. No doubt the template implementation is cleaner, but not all that much. Once again, C++ provides some nice syntactic sugar, but no fundamental advantage.
Sometimes C++ is faster
However, do you really fancy writing all of your algorithms as macros? Of course impressive
hacks are possible as one-offs.
Sometimes C++ is faster
It is not really a one time feat: OpenSSL has a good example of this for
type-checked hash tables: see lhash(3ssl) or
http://www.openssl.org/docs/crypto/lhash.html
Sometimes C++ is faster
True, but compilation errors in C++ template use (the alternative being
considered here) frequently aren't models of clarity either.
Sometimes C++ is faster
C++0x to save the day
C++0x, the next Standard C++, will support "concepts", whose chief benefit is to allow library
writers to make the compiler produce comprehensible error messages for library usage errors.
Therefore, this problem is temporary.
C++0x to save the day
This is a joke, right? The solution to "this language's error messages suck" is "make it the
library author's job to fix it". I await such a feature with bated breath. Truly, this will
revolutionize programming. Just think about how we've all been suffering in modern languages
with their readable and non-configurable errors! Woe!
Pathetic. I'm sorry, but C++ has jumped the shark at this point. The language is now
self-parody. It appeals not to productive hackers, but to gadget freaks and language theory
nerds.
And yeah, the snark is flowing freely in this post. But I stand by every word of it.
C++0x to save the day
#include <stdio.h>
// Ignore that this is unsafe, it's just an example.
#define mymacro(X) printf(X)
int main() {
mymacro(42);
}
C++0x to save the day
I see the problem: C++ creates a metalanguage to describe types in the language. But the
metalanguage itself lacks types, so you create a meta-metalanguage to describe types in the
metalanguage. What happens when you want to have types in concepts? A
meta-meta-metalanguage?
This is an argument in favor of reflection.
C++0x to save the day
Concepts are not types, and are not a panacea, as what they test for will have to be kept in
sync with what the template class actually requires. But acting like it's some kind of
infinite recursive type is foolhardy and leads me to believe you haven't looked at concepts
yet. ;)
Or to put it another way, why are people inundated in metametametainfo nowadays, what with the
explosion of features that take advantage of metainfo about our information?
But why are we complaining about this anyway? Unwieldy compiler error messages *are* a
problem, and if it were as simple as changing the compiler a bit, that would have already
been done. Even scripts that try to condense the error messages down a bit after the fact
don't help. Concepts allow you to fail fast when compiling.
error: foo.cpp:15 -- Cannot instantiate template<class T> Foo with T = Widget because Widget
is not Iterable (concept defined in foo.h:17).
is much more useful than an error message talking about missing copy constructors or something
in Widget. Concepts are optional, but you only have to put in a bit of work to add them to
your library and they work on everything which uses that template. It doesn't need changes in
client code to derive from IEnumerable or something crazy, which is good if you can't change
your class hierarchy, or simply don't want to derive from a class just to prove you've
implemented some specific functions.
Careful use of concepts will allow a program to be able to avoid having to use adaptor classes
to interface with a template as well, as instead of having to rename functions you can use a
concept_map to tell the template what function to call in class Foo to get the effect of
push_back, for instance. So when the template calls T::push_back, the correct Foo::append()
would get called instead.
So yes, you may have to do extra work, but it's optional. If you don't use concepts then you
get what you have now, if you do use concepts your programs can get better, easier to read and
easier to build. I think of this like the visibility support added to g++ in 4.0. It was a
pain to figure out but it has lead to great speedups so I'm not upset that they developed it.
I'm actually quite pleased.
I have to disagree:
> But why are we complaining about this anyways? unwieldy compiler error
> messages *are* a problem, and if were as simple as simply changing the
> compiler a bit that would have already been done. Even scripts that
> try to condense the error messages down a bit after the fact don't
> help. Concepts allow you to fail-fast when compiling.
This is wrong in many levels; I don't know where to start... so, I'll
reply to your example:
> error: foo.cpp:15 -- Cannot instantiate template<class T> Foo with T =
> Widget because Widget is not Iterable (concept defined in foo.h:17).
The correct error message would be IMHO:
error: foo.cpp:15 -- cannot instantiate template<class T> Foo with T =
Widget because neither Widget::operator++(int) nor ::operator++(Widget&,int)
exists. Foo<T> is defined at foo.h:20 and uses said operator at foo.h:25.
This is feasible because when Foo<T> was defined, it already had the
information that T needed T::operator*() or ::operator*(T&) and
T::operator++(int) or ::operator++(T&, int), just because foo.h:25 is:
while( *t++ ) { do_something_with(t); }
I understand that "Widget is not Iterable" is even easier than "Widget::
operator*() not defined", so I still think that concepts are a good idea
as documentation facilities, but I can't agree that good error
messages "were impossible to attain before concepts".
Just for the kicks, I tried to see what would be the error message in
such a case:
> um.cc: In static member function 'static int Foo<T>::foo(T) [with T =
> std::basic_string<char, std::char_traits<char>, std::allocator<char>
> >]':
> um.cc:16: instantiated from here
> um.cc:8: error: no 'operator++(int)' declared for postfix '++', trying
> prefix operator instead
> um.cc:8: error: no match for 'operator++' in '++x'
um.cc:8 is the usage of *t++ in Foo<T>::foo(T); actually, the message
contains almost every single useful bit of information, but in a garbled
way: it says that the problem is the usage of a postfix ++ in line 8,
when instantiated on line 16. But it forgets to say what type it's
instantiated to... this can't be hard to correct IMHO.
I have to disagree:
The [with T=...] bit says what the type the template is being instantiated
with is. (Ow. Nasty center-embedding. Sorry.)
Many days later...
The lines where the error occurred do name the type, in casu "std::string".
C++0x to save the day
Reflection happens at run time, or requires an even more complex meta-language (e.g.
meta-C++) in order to handle things at compile time.
Concepts are, for all intents and purposes, just a simple way of specifying type constraints.
Standard C++ allows you to have very specific type matches (unsigned long, Foo*, etc.) or very
generic types (typename anything), while Concepts add the ability to describe a type that
follows some particular contract (anything that can be added together, anything with a
set(string,int) method, anything with a default constructor, etc.).
However, the unwieldy error messages produced by C++ really aren't directly related to the
lack of Concepts in the current language. It's mostly just the compilers explicitly choosing
to be too verbose.
Say your code looks like this:
typedef vector<int> vint;
vint numbers;
numbers.add("foo");
You're going to get a huge error including things like std::basic_vector<int>::blah::blargh
etc. etc. What is happening is that the compiler is substituting in the actual type in the
AST/IR instead of the type specified by the user.
A compiler like Clang (eventually) will not have this issue. Instead of spitting out the base
template type and every child in the chain of derived templates, it can just say "vint
(typedef std::vector<int>) has no method matching ::add(char*)" and then maybe spit out the
other overloaded versions of ::add() a la GCC.
GCC and the other C++ compilers spit out huge error messages for templates because the
compiler authors just didn't care enough (or didn't think enough) to support cleaner error
messages. Nothing in C++ itself makes those cleaner messages impossible, or even all that
difficult.
C++0x to save the day
C++ appeals to serious writers of useful libraries, and users of those libraries.
There's nothing freaky here. The compiler can't know, by itself, that a template argument must
model a numeric type. Concepts allow the library writer to say so. Then, when a user tries to
use the template on something that isn't numeric, the compiler can say so right there, instead
of complaining that some expression down in the function body is ill-formed.
As noted above, macros in C have the same problem, but hardly anybody tries to do anything
ambitious with C macros; far fewer do it successfully, in part because of precisely this sort
of problem. In essence, a C++ library can extend the compiler, and compiler extensions need
their own error messages. Concepts gives library writers a practical way to provide them, to
describe the error in terms of the library interface, not in terms of details of the library
implementation, which is all a compiler could conceivably do by itself.
C++0x to save the day
instead of
std::basic_string<char, std::char_traits<char>, std::allocator<char>>
Sometimes C++ is faster
The debugging problem (function appears on one line) can be dealt with:
pipe the output of cpp to indent(1) and do a little bit of processing on
the preprocessors line markers (like: # 1 "/usr/include/stdint.h"). Gdb
now sees the function on several lines, and it's much easier to use the
debugger on them.
Sometimes C++ is faster
The fundamental advantage (inlined templates over macros) is that the compiler knows what's
going on. First, it can provide type checking, which with good design can prevent many usage
errors. Second, it can provide type-based dispatching, enabling different algorithms to be
used for different data structures. The optimal sort for a three-element array is
fundamentally different from that for one of unknown size. The optimal sort for a linked list
is conceptually similar to that for an array, but an entirely different code sequence.
Using template type matching, you can present the same interface for all these cases, and
generate appropriate code for each. Try doing that with C macros, or even Lisp macros.
You might provide macros for all the cases, under different names, but you probably wouldn't,
because you could only use them locally. Templates may be packaged cleanly and safely enough
to be generally usable, justifying more attention to their creation and distribution.
C++ may be faster in other places. In a deep call chain, where each caller must check and
dispatch on the return value of each call, using exceptions well instead can make the code
both faster and much more compact and readable. A taste of the difference may be seen in
numeric codes that rely on checking for Inf and NaN at the end of a computation; imagine
checking for all those conditions after each multiply!
Sometimes C++ is faster
Sorry, f., we're talking here about speed, hence compile-time dispatching. If you don't care
about runtime cost, all kinds (not to say "sorts") of notational convenience are easy to
support.
Sometimes C++ is faster
In CL, if you want compile-time dispatching, you just declare the types of the method
arguments using DECLARE. Then the compiler sees that and "hard-wires" the method dispatch.
Sometimes C++ is faster
Does CL support overloading? Or would you need to give the functions DECLAREd with different
argument types different names?
Sometimes C++ is faster
You would DECLARE the types of the arguments in the calling function. Then (in some
implementations, anyway) the compiler would see that the arguments to the generic function are
of a particular type, and do the dispatch at compile-time. No changes to the called GF
required.
Sometimes C++ is faster
I guess I should mention that this sort of compile-time dispatching is natural, without
macros, in ML variants and in Haskell. Such languages can be good for coding compilers, given
an implementation with a GC that respects cache locality. Recoding Gcc in Eager Haskell, it
might take years before you got anything useful, but you'd learn a lot, and have a better
sense of what the successor to Haskell should look like.
Sometimes C++ is faster
Mmh, well, I was hardly recommending that anybody rewrite GCC in Common Lisp. While I'm sure
someone could do it, the result would be a hell of a lot harder to bootstrap. :-)
C++ is sometimes faster to write, and our most expensive resource is not the CPU. Remember that g++ is a full compiler, not something that outputs C. It has the opportunity to generate code that you couldn't specify exactly the same way in C. Also, it might generate faster code for some operations than your usual hand-coded alternative in C.
Converting GCC to C++
Personally I find writing in C++ to be considerably faster in many cases (i.e. not just sometimes). This is especially true when using the standard library as well Boost libraries (many of which are making their way to the next C++ standard, due in 2009).
Converting GCC to C++
Stuff in std::tr1
I find myself wondering just how many non-GCC compilers have e.g. std::tr1::unordered_map...
it can't be terribly common yet.
(This isn't a terrible problem because we could cross-compile bootstrap compilers for GCC's
targets and use those for future bootstrapping, much as is done currently with GNAT.)
Stuff in std::tr1
True. I thought of the header file thing five seconds after hitting 'publish'... :/
Converting GCC to C++
I don't have a problem with this as a concept, but IMO Ian doesn't fully confront the most
difficult issue (IMO): bootstrapping. Although I can see where it's painful for the GCC
maintainers, it's very, very nice to not need anything fancier than a simple C compiler in
order to build GCC itself.
On the other hand, maybe this effort will convince the GCC folks to come to grips with
cross-compiled/encapsulated C++ and the issues it has (just for example, we have a
-static-libgcc flag, but no -static-libstdc++ ...)
Converting GCC to C++
If my understanding is correct, GCC uses GCC language extensions all over the place anyway,
and doesn't support being bootstrapped by anything other than GCC itself. Therefore, I don't
think they'd be losing much portability with regard to bootstrapping; they're not portable as
it is.
Converting GCC to C++
I don't believe this is true. Or rather, the build for GCC today builds a version of the C
compiler only, called "xgcc", and that bootstrapping compiler is used to build all the other
compilers in the collection (including the final "gcc" C compiler).
However, as far as I'm aware you do NOT need GCC to build the bootstrapping compiler.
It appears that this proposal would change the bootstrapping compiler to be C++ (xg++ or
similar) instead of C--and that implies a LOT more about the properties of the host system.
Either you have to have a very capable C++ compiler already installed on the system, or you
have to restrict your use of C++ in at least the bootstrapping section of GCC to a generic
subset.
Even though Ian says he wants to use a sensible subset of C++, the kinds of things he wants to
use (templates for example) are the kinds of things that have only relatively recently been
implemented in a sane manner among a majority of compilers.
I guess the "initial bringup" could be something like, build GCC 4.3 (or whatever the last
version written in C was) for your target, then use that to build GCC 5.0 written in C++.
Once you have that you can use it to build subsequent versions. Annoying but you only have to
do it once. That restricts you to using whatever C++ support appears in GCC 4.3 (or whatever)
but that seems like it should be pretty safe.
Still, I think there's more trouble here than Ian's slides indicate.
Converting GCC to C++
Bootstrapping is *by* *definition* a compiler compiling itself. But the first step (in this
case: building a gcc to build gcc, the so-called stage0 compiler) can be done with any ISO C90
compiler. So bootstrapping really doesn't limit portability.
Converting GCC to C++
It could be a good thing if the strategy changes from bootstrapping to cross-compiling, as it
implies more use of cross-compiling. Such a change seems likely to drive improvements to
cross-compiling, a subject near and dear to many in the embedded world.
Converting GCC to C++
One of the nicest outcomes of such a change
would be for GNU toolchain developers to feel
the pain of how bad gdb is at debugging c++
and perhaps feel motivated to improve it.
Jon, I didn't find any justification in the slides for the addition, "while, with luck, avoiding the C++ language's worst problems." If you don't personally like C++, that's your right and your handicap. Please don't project it onto Ian.
Snark
> Jon, I didn't find any justification in the slides for the addition, "while, with luck, avoiding the C++ language's worst problems."
Snark
Thanks, Jon, for the apology, but I'd rather see the summary corrected. The charges Ian
answered in those slides were assertions, by others, that he was contradicting. We should not
take them as his own statements.
As an aside, he also didn't suggest that luck would be needed. He seems completely confident
that use of "features which are [not] worthwhile" and any unmaintainably complex constructs
can be deliberately eschewed. Since a chief goal is to replace unmaintainably complex C with
maintainable C++, I agree with his confidence.
Converting GCC to C++
As long as he doesn't go overboard this could be a really good thing, as
his experience with implementing the gold linker in C++ shows. C++ is
really the only modern language now designed to be a systems programming
language.
However gcc is a very large program... I wonder if he'll be able to move it
over to C++ with a clean design at the same time as gcc proper is being
developed?
There is a downside: countering the "Trusting Trust" attack (as made well-known by Ken Thompson) becomes harder. There _IS_ a counter to this attack, but it requires having a second C++ compiler that isn't subverted in the same way.
Problems: Fewer alternative C++ compilers, so harder to test against 'Trusting Trust' attack
> Can an ANSI C compiler build gcc?
Yes. Indeed, until a few years ago, GCC could be compiled with pre-ANSI compilers.
Thankfully, functions now have prototypes.
On the trusting trust issue, there's nothing to stop you starting with your own trusted C
compiler, compiling gcc/g++ 4.3, then using those to compile gcc/g++ 5.0.
Converting GCC to C++
The conversion of Gcc will be fundamentally different from gld -> gold. He will start by just
compiling Gcc with g++. Then he will re-write parts that have been problems in C to use
safer, more powerful C++ constructs.
Converting GCC to C++
It seems to me that trying to do it piecemeal line-by-line is a good way to end up with "C
code implemented in C++", which is a pretty good way to get the worst of both worlds. Anyway,
his idea seems to be to use RAII for the objects so that he can then start using exceptions.
But wouldn't he have to have destructors for *everything* before he uses any exceptions?
Otherwise he'll have old C stuff leaking all over the place whenever an exception causes it
to skip the free call.
So there'd be a huge front-end cost to this no matter what. And given that it seems like the
argument for using C++ (that it's compatible with C) ends up being quite a bit weaker. GCC's
already got a ton of Ada code in the GNAT front end, perhaps he should look at remaking the
backends in Ada and making the front ends all self-hosting. Having to rewrite each component
would at least discourage the appeal of mushing the new C++ code into the old C design.
Converting GCC to C++
It's easy to speculate, but since he's actually doing the work, we'll see. If it's not really
better, it won't be merged.
Converting GCC to C++
'Start using exceptions' is probably quite a bit less important than 'start using typechecked
STL structures in place of rather horrible macros without type-checking'.
Converting GCC to C++
> C++ is really the only modern language now designed to be a systems programming language.
Uh? Only because you don't know the other ones...
The D language is also designed to be a systems programming language.
And there is Lisaac (I don't know much about it except that it has a syntax I dislike):
http://isaacproject.u-strasbg.fr/
Converting Linux to C++
Maybe after he finishes this task he can go talk to Linus. I think the Linux kernel would
benefit enormously from the use of an OO language. There are a lot of interfaces and
inheritance being done in Linux.
Converting Linux to C++
I think it's fair to say that this would be
culturally impossible among the core kernel
maintainers. I'm very interested to know if
the same will be true for the gcc folks.
Converting Linux to C++
From what I understand most of the core gcc maintainers would prefer to use C++; the problem
is convincing the steering committee (in practice, RMS).
Converting Linux to C++
Given enough developers, all steering committees are ... "shallow" ;-) .
Remember EGCS?
Converting Linux to C++
Here is Linus's opinion on C++:
http://thread.gmane.org/gmane.comp.version-control.git/57...
Converting Linux to C++
Yes, Linus usually has strong opinions.
Sometimes I wonder a bit how he gets away with that.
Anyway, in the last few years I haven't noticed any portability problems with
the STL. CMake uses the STL and builds with more or less all C++ compilers on all
platforms you can imagine, including e.g. the HP-UX C++ compiler and MSVC 6,
both of which are usually somewhat problematic:
http://www.cdash.org/CDash/index.php?project=CMake
Alex
Linus' comment on C++ in LKML
Converting Linux to C++
I completely agree with Linus!
Converting Linux to C++
I don't agree with Linus, but understand where he is coming from. I have seen a number of
examples where he has changed his mind after others have worked to educate him as well as
modified their own approaches to fix legitimate issues that he raised. This kind of
constructive head-butting is the norm in the kernel and makes it a much better piece of work.
Converting Linux to C++
How could this change? Take, for example, the most basic of his criticisms of C++: that C++
is an ugly language! I absolutely agree, and I don't see that changing.
Ugly programming language
C starts to get seriously ugly if you try to do in C the sorts of things you can do in C++
with almost no work at all: whether it be dynamically overridable functionality à la virtual
methods, or replaceable datatypes à la STL, the *best* you can hope for is a mess of horrific
macros, and 'look in this header in peril of thy soul'.
Ugly programming language
Did you really have to state the obvious? It's obvious that I was talking about my personal
taste! And no, I wasn't talking about beauty like a picture; I think that C is a really
beautiful language for me as a programmer, and C++ is an absolutely horrible language for me
as a programmer.
Ugly programming language
This is not possible, since almost all C code also builds, with minor
modifications, as C++, and as such is also C++ code.
Alex
Ugly programming language
Converting Linux to C++
I don't agree with Linus. This stance has held Linux driver development back... We have no
IOKit!!!
Converting Linux to C++
I think so too. When avoiding RTTI and exceptions, and using templates
carefully, I don't see why C++ should be slower than C.
While calling a virtual function is slower than calling a function
directly, functions are usually virtual for a purpose. I.e. if you
didn't have virtual functions, you would need some extra logic to choose
one of several plain C functions, e.g. a switch on a type field, or a
function pointer. Comparing a call through a function pointer, or a
function call inside a switch, to a virtual function call, there
shouldn't be much difference left.
E.g. the kernel of the embedded real-time OS eCos
(http://ecos.sourceware.org) is written in C++, and here the developers
surely cared about performance, memory footprint, predictability of
execution times etc.
About the IOKit: I didn't write a driver for it, but I used the API. The
API is C, and it uses a COM (?) object model, which makes it not too easy
to deal with. IOW, libraw1394 from Linux was much easier to use than the
quite overengineered (also plain C) API of OSX.
Alex
Converting Linux to C++
> The way the "virtual methods" are done in the Linux kernel have an interesting
characteristic that is not straightforward to simulate with C++: each instance has its own
virtual method table, instead of all the instances of a given class sharing a single one.
What do you mean by this? There's usually only one table per driver, declared static, some
"*_operations" struct with an "*ops" name. Each instance then stores its own pointer to this
table.
As far as I saw, the kernel uses vtables pretty much the same way a C++ compiler does; correct
me if I'm wrong.
Converting Linux to C++
Converting Linux to C++
Converting Linux to C++
Embedded C++ is weird IMO.
They removed namespaces. Why? There is no overhead involved, except
longer symbol names to resolve when loading shared libs, but I guess most
of the embedded systems EC++ is targeted at are just one static firmware
image, with no shared libs supported.
Why did they remove the new-style casts? OK, for dynamic_cast it's
clear, but the others would have been perfectly fine.
Why no templates? If used carefully, they don't have to bloat the code
size. And without them, having nice container classes is not possible.
Removing RTTI and exceptions is ok for embedded systems, other things
like removing multiple inheritance could be considered ok, but removing
the features mentioned above doesn't make any sense to me.
Alex
Converting Linux to C++
As ncm says in another reply, this is mostly taken from the poor state of C++ compilers in the
mid-90s, which also seems like it is where C++ still is in the mind of Linus, so, if anything,
this is a common mistake.
RTTI, exception support and global initialization are pretty much the only three things that
are significantly costly.
Converting Linux to C++
RTTI and exceptions only cost some ROM space, which is rarely constrained badly even on very
small embedded systems. "Global initialization", likewise, amounts to a bit of extra code to
run at startup, so ROM again, but only if you use it. RTTI and exceptions consume ROM space
whether the program uses them or not, so compilers often have a switch to tell the linker to
leave them out. A smarter linker, with some helpful annotations by the compiler, could leave
out the unused bits by default.
Anyway, well-designed code that uses exceptions is often smaller, even with the annotations,
than the code you have to write to avoid them. Faster, too.
Converting Linux to C++
Doesn't exception handling involve heap allocations? The amount of RAM is usually quite
constrained and in many cases a setup without any sort of dynamic memory is preferred.
Converting Linux to C++
Converting Linux to C++
Thanks, I did not know that. I might give exceptions a try in the next embedded project, then.
Converting Linux to C++
That's disappointing. "Embedded C++" is a subset where they decided what to leave out not by
any objective criterion of usefulness or efficiency, but just on whether it was implemented in
really old compilers, and (therefore) known to people who stopped learning anything about 1991
or so. Netscape did about the same thing in their coding guidelines, for what could have been
practical reasons in the mid-'90s, but are actively idiotic today.
Converting Linux to C++
I agree. Back in the 90s, things changed all the time, and even when the language was finally
standardized, it took a while before compliant compilers were widely available, but now it's
just silly. If you leave out exceptions, C++ can be converted to C, so you could probably run
C++ code even on embedded platforms with only C support, with a small detour (this could also
be a way to bootstrap GCC).
Lazy initialization of global variables could be a bit iffy (doesn't it use special ELF
sections?), so I could see that being taken out.
Converting Linux to C++
Given that Linux already has tables of function pointers all over the place, I don't think
virtual functions in C++ would be any more overhead.
I would like to eliminate integer overflow bugs by defining a safe integer class and using it
everywhere. This is much harder to do in C.
Converting Linux to C++
Virtual functions are defined with two pointers:
the pointer to the virtual table, and then the pointer to the function.
The advantage in C is that virtual functions can be changed
dynamically (see the VFS in the kernel), whereas in classes they are
defined at initialization.
Second point: no, you don't eliminate integer overflow bugs!
You make them more visible (so with fewer security problems),
but you don't eliminate the bug.
Converting Linux to C++
If you use a range-checked integer type everywhere, you convert a severe security hole
(integer overflow) into a less severe one (DoS), perhaps only a DoS of the attacker himself,
because you can detect the overflow and respond to it, rather than the overflow (possibly
indirectly) corrupting some other piece of state.
Converting Linux to C++
Sure, but 'fail fast' is a time-honored tradition for making things work well in embedded
development: by making sure that errors are detected as soon as possible (instead of remaining
silent), your program becomes much easier to debug and maintain.
IMHO, ints with overflow exceptions should be the default. They could even be implemented
efficiently if CPUs trapped on overflow instead of just setting a flag in the condition code
register: the compiler would set the CPU to the correct state at startup so that, on overflow,
the CPU jumps to a function generating an integer exception; checked integer computations
would then run as fast as unchecked ones.
[Sure, exception generation would be slow, if only because normally no exception should be
generated, so the exception-generation code would most probably have to be loaded from disk,
but this isn't a real issue, only a (small) design constraint.]
Further, it is possible to statically guarantee no overflow exceptions if you specify the allowable range for integers, giving the programmer a quiet life (no possibility whatsoever that the operation will overflow) and allowing more efficient code (no need to check for overflow at runtime).
Converting Linux to C++
int a range 0 .. 100;
int b range 5 .. 10;
int c range 0 .. 20;
c = b + b; // no chance of overflow: result is in 10 .. 20
c = a / b; // also guaranteed to be in range: 0 .. 20
c = check_overflow(a / 2); // can overflow and throw exception
c = a / 2; // rejected by compiler as unsafe, be explicit
This sort of thing is quite possible with C++ templates. Given the number of security holes and other bugs caused by overflow and underflow, it's amazing that using plain ints is still the 'ordinary' way to write code.
Converting Linux to C++
Do you have any more links to explain how to use templates for this kind of thing? It looks
very useful and interesting.
Converting GCC to C++
While I agree that gcc, in C, is a rather complex codebase, I do not
accept that re-writing it in something else, eg C++, is any sort of panacea,
and reject the notion that imposing one of the paradigms, in this case OO,
is any real answer. There is much to be done in the Open Source tool chain,
ie support for automatic parallelization, static code analysis (eg Coverity),
backward debugging and code visualization, which is rather more important and
immediately useful. We could also use better optimisations for common
architectures as clock speed increases are slowing.
I know some OO-trained developers have a very hard time with non-OO
designs BUT I submit there are many things that have higher priority; still,
since this is open source there is no reason why Ian should not try to do
this.
In my view code-visualizers are more widely applicable and useful.
Converting GCC to C++
> While I agree that gcc, in C, is a rather complex codebase, I do not
> accept that re-writing it in something else eg C++, is any sort of panacea
Who said this was meant as a cure-all? Have you even read the slides or the comments?
> and reject the notion that imposing one of the paradigms, in this case
> OO, is any real answer.
There is waaaay more than OOP in C++. Again, you'd know if you had looked at the slides:
Ian plans to turn the current object-oriented design in C into its C++ equivalent, and to
replace a set of ad-hoc collections written in macros with the C++ template equivalent.
Where they go from there is up to them of course, but no one is "imposing" a structure on
anything yet.
> There is much to be done in the Open Source tool chain
> ie support for automatic paralliztion, static code analysis
> (eg Coverty), backward debugging and code visualization is rather
> more important, and immediately useful.
OpenMP (parallelization) is already implemented in gcc, and gcc can attempt some
auto-vectorization anyway. The field of problems this is useful for is not exactly
super-wide, however. I don't see how you expect gcc to fix the other three issues. Static
analysis and code visualization require a good parser, but that's already a solved issue right
now (although perhaps the gcc parser can be factored out at some point). Or do you expect the
gcc developers to drop what they're doing to write some backward debugging patches for gdb?
> I know some OO trained developers have a very hard time with non
> OO designs
Is this seriously the hold up that people have with C++? It hasn't been "C with Classes"
since like 1984 or something like that. There's already OOP design in the current C-based
gcc, and that is not nearly all that C++ is useful for.
Converting GCC to C++
Thank you for your reply.
The basic point I sought to make was that recoding in C++ was unlikely
to bring great gains in functionality, and that is where the tool chain
needs most work, but this is the Open Source world so if Ian wants to work
on this it is __his__ choice.
Secondly, C++ is a complex language which you often see misused by
programmers who have drunk too much OO kool-aid.
Finally, the other things do need much work: we need auto-parallelization
(see the Berkeley HPC paper); code analysis in the compiler leads to
better diagnostics and optimisation; and extended support for the
registers and extended instruction sets of x86_64 could lead to major
performance gains.
Desktop HPC is already important to CAD/simulation applications.
Converting GCC to C++
Curiously enough, C++ has turned out to be essential to making desktop HPC workable. The
VSIPL++ library enables users to write array expressions just as they appear in the textbooks,
and automatically distributes computations to available processors. It uses OpenMPI (or
certain similar libraries) underneath, but almost entirely conceals their operation. Users'
only concessions to computational reality are in designating array organization -- row-major,
column-major, tiled -- and inserting the occasional copy from an array organized one way to
another organized differently. This is a major advance.
What is most significant is that VSIPL++ achieves all this with no assistance from
non-portable compiler extensions; it's pure Standard C++. Essentially, C++ is powerful enough
that library writers can write their own compiler extensions, portably.
Of course VSIPL++ itself doesn't help Gcc, directly, but the power of the language can be
brought similarly to bear on compilation problems.
gcc core
Right now, gcc-core is a C package for an -excellent- C compiler. And it is possible to build
a full toolchain, system (even with graphics and full X11/GTK) just with C programs.
This move, for one, will bring C++ into the core (and the standard libraries).
Will it still be possible to compile gcc so that it only supports the C frontend?
I dunno, a C compiler in C++ doesn't seem like such a good idea. It's not like the project is
starting now; most of the hard stuff is already there in gcc, in good old C, and it works fine.
Maybe Ian should rewrite gold in C...
Anyway. This is just a branch. I believe (since "The FSF doesn't like C++ and FSF doesn't
write the code", according to Ian, and "Google likes C++ and google pays Ian"), we will have a
fork and it will need some real advantages in the gcc-cxx branch before we are convinced that
it *is* worth making it the default/official gcc branch.
Does anybody know the size of the core binary in standard gcc and in gcc-cxx?
gcc core
Gcc and Gcc-cxx binary sizes are about the same, because the code is about the same. All the
"good old C" remains, and will remain, just compiled with the C++ compiler. As noted in the
slides, though, Ian is incrementally replacing the "bad old C". There is plenty of "bad old
C" in Gcc, ripe for replacement, but the real goal is to make it easier and safer to improve
the program with new features and bug fixes.
If you think gold should be re-implemented in C, go ahead and give it a try. Ian is busy.
This is depressing
This is depressing
Yeah, but GCC-in-ocaml would be a big job, basically a complete rewrite.
This is `GCC with stuff now implemented using annoying macros moved to C++
constructs instead', an incremental improvement.
This is depressing
Not so
If you're writing a toy compiler, a language designed for writing compilers is a good idea.
If you're writing a real compiler, you are better off with the most powerful production
language you can find, because you will encounter every kind of problem.
Gcc is very far from a toy. Fortunately, C++ is the most powerful production language yet
devised. (Its template system, incidentally, was designed to match ML's capabilities.) You
may expect the Gcc code to use that power increasingly as the uglier, less maintainable parts
are replaced with clean C++.
Not so
Not so
The C++ template system was explicitly modelled on ML's pattern-matching.
Stroustrup says as much in D&E, and others who know have confirmed it.
(The *syntax* is insane, agreed, or at the very least really ugly.)
Not so
Not only that, there is a library you can download, just a set of ".h" files, that allows you
to write Haskell programs (with amazingly small difference in syntax from official Haskell) in
C++, and compile them with any standard C++ compiler.
See http://www.cc.gatech.edu/~yannis/fc++/fcpp-lambda.pdf
http://www.cc.gatech.edu/~yannis/fc++/
This would have been impossible if C++ templates were not isomorphic to Haskell, thus to ML.
The conventional technical description of C++ template syntax is that it is "unfortunate". It
gets substantially better in C++0x. Obviously one can do better when starting with a clean
slate.
Not so
template <class F, class List> struct Map;

template <class F> struct Map<F, NIL> {
    typedef NIL Result;
    static inline Result go( const NIL& x ) { return x; }
};

template <class F, class H, class T> struct Map<F, CONS<H,T> > {
    typedef CONS<typename F::template Go<H>::Result,
                 typename Map<F,T>::Result> Result;
    static inline Result go( const CONS<H,T>& x ) {
        return Result( F::template Go<H>::go( x.head ),
                       Map<F,T>::go( x.tail ) );
    }
};
let rec map f = function
[] -> []
| x::xs -> f x :: map f xs
../FC++.1.5/list.h: In member function fcpp::IRef<fcpp::impl::Cache<T> > fcpp::impl::ListHelp<T, F,
fcpp::impl::List<T> >::operator()(const F&) const [with T = long long int, F = Hamming]:
../FC++.1.5/list.h:95: instantiated from fcpp::impl::List<T>::List(const F&) [with F = Hamming, T =
long long int]
tw_hamming.cc:44: instantiated from here
../FC++.1.5/list.h:434: error: dependent-name fcpp::impl::Cache::CvtFxn
is parsed as a non-type, but instantiation yields a type
../FC++.1.5/list.h:434: note: say typename fcpp::impl::Cache::CvtFxn if
a type is meant
../FC++.1.5/full.h: At global scope:
../FC++.1.5/full.h:227: warning: fcpp::<unnamed>::bind1and2and3of3
defined but not used
../FC++.1.5/prelude.h:1465: warning: fcpp::<unnamed>::NOTHING
defined but not used
../FC++.1.5/prelude.h:1501: warning: fcpp::<unnamed>::empty defined
but not used
Not so
You get the error messages because the code presented in the paper is not meant for
compilation, but for exposition. Anyway, that's underlying implementation; if you looked in
the source code of your OCaml compiler you would find code no prettier. The FC++ library
amounts to a compiler extension, one not possible in languages weaker than C++.
Code to implement Map by *using* what is found in the header files doesn't look like that; it
more closely resembles the OCaml code you present, which indeed is the purpose of the FC++
library. But my point is not to promote FC++. I mentioned it only as concrete demonstration
that C++ is indeed in the same expressive family as Haskell and the MLs, with unfortunate
syntax but with industrial-grade support and defined by an ISO Standard.
ISO Standard C++0x will have support for "concepts", the equivalent of Haskell's "type
classes"; type inference, making it unnecessary to spell out elaborate type names; and an
explicit lambda construct. These make it much easier to apply techniques found in FC++
without depending so much on libraries to package them.
Not so
