Re: [Snowball-discuss] Local variables for snowball

From: Olly Betts (olly@survex.com)
Date: Mon Sep 11 2006 - 18:53:05 BST


On Mon, Sep 11, 2006 at 01:50:26PM +0100, Olly Betts wrote:
> I did try some performance measurements, but I seem to get a lot of
> variation in timings on this box even between runs of the same code
> (I suspect because it's an Athlon 64 which underclocks when idle and the
> clockspeed varying daemon doesn't respond to the start of the test
> program in exactly the same way each time).

I've been playing with valgrind's cachegrind tool. It effectively
measures estimated cycle counts and simulated cache behaviour for
the code, so at least you get a repeatable answer, even if it's
somewhat fictitious (but then the real answer will depend on the CPU
and cache configuration anyway).
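
For anyone wanting to repeat this sort of measurement, a run might look
roughly like the sketch below. This is not necessarily the exact
invocation used here: voc.txt and out.txt are placeholder file names,
and it assumes stemwords is built in the current directory and takes
its usual -l/-i/-o options.

```sh
# Run the stemmer under cachegrind (voc.txt and out.txt are placeholders).
valgrind --tool=cachegrind ./stemwords -l english -i voc.txt -o out.txt

# Summarise the per-function counts from the file cachegrind writes,
# which is named cachegrind.out.<pid> (<pid> = process ID of the run above).
cg_annotate cachegrind.out.<pid>
```

The per-function breakdown from cg_annotate is the sort of output that
shows which functions dominate.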

This seems to confirm that any speed-up from local variables is dwarfed
by other factors, for the English stemmer at least. In particular,
around a third of the time is spent in find_among_b(), so that's a
good candidate for optimising (if only I understood it!)

I notice the runtime functions are currently compiled without
optimisation, which isn't good if we spend more than 1/3 of
our time in them. I think they get built by an implicit make
rule, so adding "-O2" to CFLAGS in GNUmakefile will do the job
but perhaps there's some reason why it's not there?

Incidentally, the explicit -O4 for the generated code is a bit odd -
there are no optimisation levels for GCC above -O3, so it just gets
treated as -O3.

If you really want to optimise well under GCC, it might be worth
investigating automatically building the code with -fprofile-generate
then running the sample word lists through, and then rebuilding with
-fprofile-use. Trying this by running the English word list through
stemwords gets me a 3.5% reduction in the estimated cycles from
cachegrind.
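
As a rough sketch of that three-step cycle (not the actual build
commands): SOURCES is a hypothetical shell variable standing in for the
real list of C files, -Iinclude is an assumed include path, and voc.txt
is a placeholder for a sample word list.

```sh
# 1. Build an instrumented binary.
gcc -O2 -fprofile-generate -Iinclude -o stemwords $SOURCES

# 2. Run it on representative input; this writes .gcda profile data files.
./stemwords -l english -i voc.txt -o /dev/null

# 3. Rebuild from the same sources, letting GCC use the collected profile.
gcc -O2 -fprofile-use -Iinclude -o stemwords $SOURCES
```

The profile is only as good as the training input, so the word lists
fed in at step 2 want to resemble real use.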

Incidentally, I've also now patched the Java code generator to support
local variables (it was easier to do than the C one actually).

Cheers,
    Olly


