Alternative compiler suites.
GCC is the proverbial 900-pound gorilla in compilers. I mean what's not to like about it? It's free. It's open-sourced. It mostly follows relevant standards (at least for C and C++). It has a lot of support.
Well how about this for reasons not to like it? It's huge and bloated. It's a bitch to maintain and it's a pain in the ass to extend. And it's not actually all that great code generation-wise. It isn't a terrible compiler by any means, but it isn't a great one either. Its aging code base (and its notoriously inflexible owner) is showing its age, and as a result it really needs an overhaul or a replacement.
The LLVM project offers one possible replacement. I'll let you look at the web page for details, but in effect LLVM is a "low-level virtual machine" (get it?) that serves as an intermediate representation for a compiler, combined with a host of tools for code generation, optimization, JIT, etc. in a lovely crunchy-on-the-outside, gooey-on-the-inside package. And LLVM also happens to come with a port of GCC that replaces the aging, rigid, incomprehensible GCC back-end with the LLVM tool suite. The purported result of all this is the ability to get all of LLVM's lovely capabilities with the de facto standard C/C++ compiler front-end for most F/OSS (and many commercial!) products.
After experimenting for a while with LLVM (going way back to, like v1.3 or something -- a long time ago, in other words) I came to the conclusion starting around v1.9 that LLVM was mature enough to use for serious work. With 2.1 I found my first use for it: I installed LLVM-2.1 with LLVM-GCC-4.0.1 and used this to compile Erlang/OTP R11B-5 after having used a standard compile with GCC 4.1. I derived two impressions from this:
- The actual build of Erlang seemed to take less time than before.
- The resulting binaries seemed snappier somehow -- more responsive.
This week I found the other opportunity. I noticed that both LLVM and Erlang/OTP had moved on from the versions I had installed (to 2.2 and R12B-1 respectively), so I spent some time over two days upgrading and installing both. After a bit of an adventure getting LLVM2.2/GCC4.2 working (all of it was my fault, not the developers') and a test run building Erlang/OTP R12B-1, I was ready. I decided my next test run would be Ruby 1.9.
Now why Ruby 1.9 instead of Erlang? Well, first, I've been itching to upgrade my Ruby install for a while. Second, unlike Erlang, Ruby comes with a pretty comprehensive set of tests. I'd be able to not only time the builds (which aren't really all that important, all things considered) but I'd also be able to stress-test the resulting executables in that Ruby's tests all use, well, Ruby to run. So off I went and downloaded the Ruby source trunk with Subversion. After that, I followed these steps to get repeatable results.
- In the ruby-1.9 trunk I fired up autoconf to get a configuration script.
- I then took a snapshot of this by turning the SVN checkout into a darcs repository.
- I made two local branches -- ruby-1.9-llvm and ruby-1.9-gcc -- and corrected for darcs' obnoxious habit of losing permissions. (The configure script and one tool script needed to be changed back to executable.)
- In ruby-1.9-llvm I typed the command CC=llvm-gcc ./configure and made sure that everything was properly configured.
- In ruby-1.9-gcc I just typed ./configure and made sure everything was properly configured.
- Then, in each directory sequentially, I typed time make and waited for completion.
- After this, again in each directory sequentially, I typed time make test and waited for completion.
- Finally I typed in each directory time make test-all and waited for completion.
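The steps above can be scripted for anyone who wants to repeat the runs. This is a minimal sketch in Python, not the procedure behind the numbers in this post (those came from typing the commands by hand). The helper names timed_run and benchmark_tree are made up for illustration, the directory names are the two branches described above, and the user/sys figures assume a Unix-like system where os.times() reports child CPU time.

```python
import os
import subprocess
import time


def timed_run(cmd, cwd=".", extra_env=None):
    """Run cmd in cwd and return (real, user, sys, exit_code), times in seconds.

    Rough stand-in for the shell's `time`: user/sys are the CPU seconds
    accumulated by child processes while cmd was running.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    before = os.times()
    start = time.perf_counter()
    proc = subprocess.run(cmd, cwd=cwd, env=env)
    real = time.perf_counter() - start
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system, proc.returncode


def benchmark_tree(tree, configure_env=None):
    """Configure one source tree, then time make, make test and make test-all."""
    _, _, _, code = timed_run(["./configure"], cwd=tree, extra_env=configure_env)
    if code != 0:
        raise RuntimeError(f"configure failed in {tree} (exit status {code})")
    results = {}
    for target in ([], ["test"], ["test-all"]):
        label = " ".join(["make"] + target)
        # A failing target is recorded, not fatal: the test suites are
        # expected to report some failures.
        results[label] = timed_run(["make"] + target, cwd=tree)
    return results
```

Usage would look like benchmark_tree("ruby-1.9-llvm", {"CC": "llvm-gcc"}) followed by benchmark_tree("ruby-1.9-gcc"), run one after the other so the two builds never compete for the machine.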
The GCC make had the following time results:
    real 3m56.694s
    user 3m3.799s
    sys 0m12.461s
The LLVM-GCC make had these:
    real 3m34.526s
    user 2m42.502s
    sys 0m12.081s
This puts GCC at taking 10/13/3% more real/user/system time respectively for the same task: building Ruby 1.9.
The GCC make test gave these results:
    FAIL 7/806 tests failed
    real 0m22.548s
    user 0m10.741s
    sys 0m2.976s
The LLVM-GCC make test gave these instead:
    FAIL 7/806 tests failed
    real 0m21.186s
    user 0m10.245s
    sys 0m2.624s
Note that the fail count is the same in both cases and eyeballing the output has both failing in the same place and in the same way. (It is worth noting that I've never had a Ruby distribution I've compiled pass the test suite -- there's always a few fails.)
Again, a quick calculation shows that GCC-generated code takes about 6/5/13% more real/user/system time than LLVM-GCC-generated code does.
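For anyone who wants to check the arithmetic, here is a small sketch that recomputes the percentages from the time output quoted above. The function names seconds and percent_more are illustrative only. The timings are copied from this post.

```python
import re


def seconds(stamp):
    """Convert a `time`-style stamp such as '3m56.694s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def percent_more(slower, faster):
    """How many percent more time `slower` took compared with `faster`."""
    return (seconds(slower) / seconds(faster) - 1.0) * 100.0


# (GCC, LLVM-GCC) pairs as reported by `time` in the runs above.
RUNS = {
    "make": {
        "real": ("3m56.694s", "3m34.526s"),
        "user": ("3m3.799s", "2m42.502s"),
        "sys": ("0m12.461s", "0m12.081s"),
    },
    "make test": {
        "real": ("0m22.548s", "0m21.186s"),
        "user": ("0m10.741s", "0m10.245s"),
        "sys": ("0m2.976s", "0m2.624s"),
    },
}

for run, timings in RUNS.items():
    for kind, (gcc, llvm_gcc) in timings.items():
        print(f"{run:10s} {kind:5s} GCC took {percent_more(gcc, llvm_gcc):.1f}% more")
```

These are single runs on one machine, so differences of a few percent (the sys figures in particular) could easily be noise.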
That is the good news.
The bad news comes with the make test-all results. Less than two minutes into the comprehensive test suite the LLVM-GCC version of Ruby 1.9 dies with the following message: "Illegal instruction (core dumped)". Later it tells me the test failed with "error 132". This is, as you can see, not a very useful message since it's not really helping me locate where the error is. It's just taunting me with "there was an error but we're not going to help you solve it". Meh. It's F/OSS. What can you expect? Professionalism?
This does not happen with the GCC build. It chugs merrily along, passing (and failing!) more tests, until it hits this speed bump about eight and a half minutes in: "./home/michael/Development/Ruby/ruby-1.9-gcc/test/ruby/test_m17n.rb:668: [BUG] Segmentation fault". This message is followed by a whole bunch of numerical vomit on the screen purporting to be a stack trace.
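The "error 132" from the LLVM-GCC run is a little less cryptic than it looks. Shells conventionally report a process killed by signal N as exit status 128 + N, and 132 - 128 = 4 is SIGILL on Linux and other common Unix systems, which matches the "Illegal instruction" message. A small sketch of that decoding follows. The helper describe_exit_status is illustrative only and not part of any tool mentioned here. A program can also exit with a status above 128 on purpose, so treat the result as a strong hint, not proof.

```python
import signal


def describe_exit_status(status):
    """Interpret a shell/make exit status using the 128 + signal convention."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = "unknown signal"
        return f"probably killed by signal {signum} ({name})"
    return f"exited with status {status}"


print(describe_exit_status(132))
```

Since a core was dumped, the usual next step would be to load it into a debugger, for example gdb path/to/ruby path/to/core followed by bt for a backtrace. The paths here are placeholders.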
So that's where things stand as of now. LLVM-GCC is showing promise as a serious contender for compilation, but the current GCC port still seems to have a few bugs to iron out. I'll be trying to see what it is about that one test that's making core dumps. It may be a quick fix to Ruby or a slower change to LLVM-GCC. Either way, I'm mostly pleased with the outcomes, especially since the LLVM version and the GCC version both fail the detailed test suite equally spectacularly -- just in different locations.
