Re: [Snowball-discuss] Mismatch between vocab.txt and output.txt

From: Olly Betts (olly@survex.com)
Date: Mon Oct 14 2002 - 14:27:01 BST


On Mon, Oct 14, 2002 at 04:18:27AM -0600, Martin Porter wrote:
>
> >The first disagreement for finnish is that the stemmer produces
> >"aachenin" but output.txt contains "aachen".
>
> I think I don't understand. Are you saying the stemmer actually stems
> aachenin to aachenin while the file output.txt implies that it stems
> aachenin to aachen?

Exactly.

> My Finnish stem.c stems aachenin to aachen, and it is the same as the one on
> the website, which is the same as the one in the tarball on the website (I
> downloaded both to check.)

I generated my stemmers from the ".sbl" sources, but the differences from
the finnish stem.c on the website are just in the function names. Most
odd - I'll see if I can work out what's going on.
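
One way to narrow this down is to run every word in vocab.txt through the
stemmer and report the first line where the result differs from output.txt.
A minimal Python sketch, assuming the two files are line-aligned (one word
per line); first_mismatch and the stem callable are hypothetical names for
illustration, not part of the Snowball distribution:

```python
from itertools import zip_longest


def first_mismatch(words, expected, stem):
    """Return (line_no, word, expected_stem, actual_stem) for the first
    disagreement between stem(word) and the expected stem, or None if
    every line agrees.

    `stem` is any callable mapping a word to its stem (an assumption: it
    stands in for whatever wrapper calls the generated stemmer).
    """
    pairs = zip_longest(words, expected)
    for line_no, (word, want) in enumerate(pairs, start=1):
        if word is None or want is None:
            # One list ran out first, so the files are not line-aligned.
            return (line_no, word, want, None)
        got = stem(word)
        if got != want:
            return (line_no, word, want, got)
    return None


def read_lines(path, encoding="utf-8"):
    """Read a word list, one entry per line, without trailing newlines.
    The encoding is an assumption; adjust it to match the data files."""
    with open(path, encoding=encoding) as handle:
        return [line.rstrip("\n") for line in handle]
```

With a stand-in stemmer that returns its input unchanged, calling
first_mismatch(["aachen", "aachenin"], ["aachen", "aachen"], lambda w: w)
reports line 2, which mirrors the aachenin/aachen disagreement above.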

Cheers,
    Olly


