GCJ - past, present, and future
The traditional Java platform is clearly not an ideal system, especially when combined with the traditional GNU system, but it is not too bad. The essential features seem to be good ones. Lots of Free Software is already written in the Java programming language so a free system compatible with the Java platform would be convenient for many hackers. GCJ is an extension of GCC and facilitates integration with other languages supported by GCC. GCJ 4, part of GCC 4.0, adds more features to easily integrate programs written using the GCJ development environment with the rest of the GNU platform while being even more compatible with the traditional Java platforms then previous releases. GCC 4.0 is scheduled to be released around April 15.
GCJ design history
Originally GCJ was designed as a “radically traditional” compiler for the Java programming language. It is an AOT (Ahead Of Time) compiler which automatically uses every GCC optimization available during compile time for a given architecture and produces binaries or (shared) libraries for the given platform. These programs run at full native speed without needing any interpreter or JIT (Just In Time) compilation. GCC is available for a large number of architectures and platforms so compiling directly to native code using the GCC back-ends makes programs written with GCJ much more portable then the traditional (proprietary) Java platform. This radically traditional approach makes all normal GNU tools like GDB available to the programmer writing code in the Java programming language just like when programming in any other language supported by GCC.
Thereafter, support for generating and interpreting byte code .class and .jar files was added. This made GCJ more compatible with traditional applications written in the Java programming language that are compiled to byte code. GCJ can be used in various modes:
- Compile and link .java source files to binaries, .o or .so files.
- Compile and link .class or .jar byte code files to binary.
- Compile .java source files to .class byte code
files (
gcj -C). - Interpret .class or .jar byte code files during
runtime (
gij).
The byte code interpreter is included as part of the standard runtime libgcj and can be used by programs to switch between interpreting byte code and executing natively compiled code on demand. So not all of the program has to be completely interpreted or completely compiled ahead of time at the same time.
To facilitate integration with code written in other languages, GCJ defines the CNI (Compiled Native Interface). CNI makes it easy to mix and match code and classes written in C, C++ and Java by allowing you to write some methods of a class in C++ and to catch and throw exceptions directly to and from parts of the program written in different languages. GCJ also support the more traditional JNI (Java Native Interface) for using code written in C from your programs.
Anthony Green posted the original design document for GCJ from 1998.
Drawbacks of the GCJ 3.x approach
GCJ 3.x provides a good “better than Java” development environment that allows tight integration with the rest of the GNU platform. But it has disappointed some traditional Java programmers. The possibility to mix and match native code with byte code in the compiler and libgcj runtime makes GCJ very flexible. But falling back to interpreting byte code doesn't really take full advantage of the whole “radically traditional” approach. Especially programs using advanced byte code based class loader tricks used to work slowly because they fell back to using the interpreter during runtime.
There are GCJ extensions to add support for using natively compiled
code all the time. But programs had to be adapted to use these
extensions. Instead of using .jar files containing byte code
definitions of new classes programs would have to use a new URL scheme
(gcjlib:) for their URLClassLoader uses. The
first “Fast Free
Eclipse” port to GCJ was done this way. The source code of
the plugin loading mechanism was adapted to search for natively
compiled plugins in shared library .so files besides ordinary
.jar byte code files. There was even a moderately popular
project, rhug, that
maintained a lot of patched versions of traditional free Java programs
that were adapted to gcj's view of the world. But these
patches were almost never adopted upstream and the maintenance of
these forks took a lot of time. So the benefits of the GCJ approach
were only seen by programs written explicitly for it, but not by
traditional Java programs.
One of the main goals of the GCJ 4 effort was to bring all the advantages of the “radically traditional” approach to any program written in the Java programming language without needing any application-level changes.
GCJ 4 enhancements
Probably the most visible enhancement of GCJ 4 comes from merging the libgcj runtime with the GNU Classpath core class library project. By collaborating with other free runtimes like the traditional kaffe environment and around 20 other projects, GCJ 4 is able to offer a core class library comparable to JDK 1.3 or 1.4. The collaboration of all these projects on a common core library implementation means that a lot of the libraries needed by applications, except for advanced Swing, Corba and sound usage, are available out of the box. Kaffe, for example, is being used by the Apache project to track the build of most of the jakarta projects using their Gump auto-builder.
The other big change is the addition of the -findirect-dispatch switch to the compiler. Using that option causes GCJ to generate native code for classes and methods that follow the precise same binary compatibility rules as described in the Java Language Specification. This means that native compiled code can now be used everywhere, even in the most tricky class loader situations, where previously the program would fall back to interpreted byte code. At the 2004 GCC Summit Tom Tromey and Andrew Haley described this new binary compatibility ABI for GCJ in more detail.
The new binary compatibility (BC) ABI makes it possible to transparently
compile programs to native code using gcj
-findirect-dispatch without having to change the application
source code or even the build process. To map byte code to GCJ
compiled native code, GCJ 4 introduces gcj-dbtool. This tool
is used by the packager during deployment of the application or
library to create a database mapping the bytecode of a class to the
native code during runtime. Programs can use different databases
using the gnu.gcj.precompiled.db.path system property. The
databases make it possible to create a cache of all native compiled
code that can be shared by different programs installed on the system.
The How
to BC compile with GCJ GCC wiki page has examples.
This approach is used by the native Eclipse packages in Fedora Core 4. No changes to the eclipse code base are necessary anymore and, after the project is bootstrapped, all resulting .jar files are BC compiled. To almost completely automate this process, Thomas Fitzsimmons created java-gcj-compat. A collection of wrapper scripts, symlinks and jar files that provide a Java-SDK-like interface to this new GCJ 4 tool set.
Future plans
The -findirect-dispatch switch can currently only be used for byte code and not in combination with CNI (JNI is already supported). This limitation currently prevents parts of the core class libraries from being BC compiled. Lifting this restricting will facilitate more integration with GNU Classpath.
With GCJ and GDB a programmer can step through native C, C++ and Java source code using the same tool. Traditional Java developers are more used to JDWP (Java Debugging Wire Protocol) for debugging their applications. Eclipse comes with built-in support for JDWP. Work is in progress to provide JDWP debugging support for the different execution mechanisms. This code will also be shared with the GNU Classpath project.
Benchmarks show that GCJ is comparable (sometimes faster, sometimes slower) to traditional execution mechanisms for Java programs. But GCJ currently doesn't really take advantage of the new GCC 4.0 Tree SSA optimizer framework. For 4.1 the GCJ developers hope to add a couple of GCJ specific optimizations.
Tom Tromey is currently working on GCJX, a new GCC frontend that will include support for the new 1.5 language additions, such as generics. And the GNU Classpath project has a separate branch for the core class libraries that depend on the new 1.5 language additions.
Escaping the Java Trap
GCJ 4 is the result of seven years of work by a large and active community of Free Software hackers. This new version is complete enough to replace most interesting uses of the proprietary Java platform. It adds a whole new set of core libraries and adds some new features to help integration with the rest of the GNU platform. Upcoming versions of some GNU/Linux distributions will use GCJ 4 to provide much more Java-based Free Software, including Eclipse, Jonas, OpenOffice.org 2, Tomcat and the Jakarta libraries. There is also a great deal of free software to integrate with traditional GNU/Linux distributions provided by the JPackage project. Both Debian and Fedora are working with the jpackage hackers to support more of these packages “out of the box”.
All this doesn't mean that we have escaped the Java trap yet. As pointed out by Richard Stallman in “Free But Shackled - The Java Trap” we have to actively work together to keep code safe and free. It looks like the main target projects for GCJ 4 (Apache Jakarta, Eclipse and OpenOffice.org 2), have all reacted positively to the feedback and patches provided to support free alternatives to the Java platform. The fact that the changes requested were for making the projects more portable ("don't use undocumented com.sun internal classes") rather than requests to dramatically change the code, (core) libraries used or build infrastructure has helped a lot. But the above projects were already free software projects at heart. It remains to be seen if other more traditional java projects will adapt so easily to support GCJ 4 out of the box.
| Index entries for this article | |
|---|---|
| GuestArticles | Wielaard, Mark |
