Hello there (o:
I'm porting the Scandinavian stemmer algorithm
to Perl. You can fetch it from:
http://www.unixmonks.net/~ask/Stemmer-Norwegian-0.3.tar.gz
There is one thing I can't understand, though.
In the description of the algorithm you say:
> R2 is not used: R1 is defined in the same way as in the German
> stemmer.
And on the German page, it says:
> R1 and R2 are first set up in the standard way (see 3.1), but then R1
> is adjusted so that the region before it contains at least 3 letters.
Where is "3.1"? :-)
If you unpack that tarball and run it against diffs.txt:
% perl stemmer.pl diffs.txt | wc -l
you'll see that 120 out of 20628 differ.
Why???
I'd guess this has something
to do with this bit of Snowball code:
> $p1 = limit
> goto v gopast non-v setmark p1
> try ($p1 < 3 $p1 = 3)
What does this do?
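My best guess so far, written as a small Python sketch (not the Perl port itself, just to pin down my reading): p1 starts at the end of the word, then moves to just after the first non-vowel that follows a vowel, and is finally pushed up to 3 if it landed earlier. The function name r1_start, the 0-based offsets and the vowel set "aeiouyæåø" are my own assumptions, not taken from the Snowball source.

```python
# Assumed vowel set for Norwegian; not taken from the Snowball source.
VOWELS = set("aeiouyæåø")


def r1_start(word, vowels=VOWELS, min_prefix=3):
    """Return the assumed 0-based offset where R1 begins.

    Hypothetical reading of:
        $p1 = limit
        goto v gopast non-v setmark p1
        try ($p1 < 3 $p1 = 3)
    """
    limit = len(word)
    p1 = limit                      # $p1 = limit

    # goto v: advance until the cursor sits in front of a vowel
    i = 0
    while i < limit and word[i] not in vowels:
        i += 1
    if i == limit:
        return p1                   # no vowel found: p1 stays at limit

    # gopast non-v: advance until just past the next non-vowel
    while i < limit and word[i] in vowels:
        i += 1
    if i == limit:
        return p1                   # no non-vowel after the vowel
    p1 = i + 1                      # setmark p1

    # try ($p1 < 3 $p1 = 3): keep at least 3 letters before R1.
    # Words shorter than 3 letters are not treated specially here.
    if p1 < min_prefix:
        p1 = min_prefix
    return p1


if __name__ == "__main__":
    for w in ("beautiful", "opal", "tre"):
        start = r1_start(w)
        print(w, start, repr(w[start:]))
```

If that reading is right, a word like "opal" would get p1 = 2 from the standard rule and then be bumped to 3, so R1 would be just "l".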
Thanks:)
--
Ask Solem Hoel | GAN Media
+47 48054613 | +47 22707439
www.unixmonks.net | www.gan.no/media
_______________________________________________
Snowball-discuss mailing list
Snowball-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/snowball-discuss
This archive was generated by hypermail 2.1.3 : Thu Sep 20 2007 - 12:02:40 BST