[Snowball-discuss] The Norwegian stemmer algorithm

From: Ask Solem Hoel (ask@gan.no)
Date: Tue Nov 27 2001 - 13:19:30 GMT


Hello there (o:

I'm porting the Scandinavian stemmer algorithm
to Perl. You can fetch it from:

http://www.unixmonks.net/~ask/Stemmer-Norwegian-0.3.tar.gz

There is one thing I can't understand, though.
In the description of the algorithm you say:

> R2 is not used: R1 is defined in the same way as in the German
> stemmer.

And on the German page, it says:

> R1 and R2 are first set up in the standard way (see 3.1), but then R1
> is adjusted so that the region before it contains at least 3 letters.

Where is "3.1"? :-)

If you unpack that tarball and run it against diffs.txt:
% perl stemmer.pl diffs.txt | wc -l
you'll see that 120 out of 20628 differ.

Why???

I'd guess this has something
to do with the Snowball thingie:

> $p1 = limit
> goto v gopast non-v setmark p1
> try ($p1 < 3 $p1 = 3)

What does this do?
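
My best guess so far, written as a small Python sketch so the steps are
explicit. It assumes that "goto v" moves to the first vowel, that
"gopast non-v" moves to just after the next non-vowel, that "limit" is
the end of the word, and that the vowels are a e i o u y æ å ø. The
name r1_start is made up for this example, and I may well be misreading
the snippet:

```python
# Hypothetical reading of the three Snowball lines; not the official algorithm.
VOWELS = set("aeiouyæåø")  # assumed Norwegian vowel set


def r1_start(word: str) -> int:
    """Return the index where R1 would begin under this reading."""
    limit = len(word)
    p1 = limit                      # $p1 = limit

    i = 0
    # goto v: advance to the first vowel
    while i < limit and word[i] not in VOWELS:
        i += 1
    if i == limit:                  # no vowel found, p1 stays at limit
        return p1

    # gopast non-v: advance to just after the next non-vowel
    while i < limit and word[i] in VOWELS:
        i += 1
    if i == limit:                  # no non-vowel after the vowel
        return p1
    i += 1

    p1 = i                          # setmark p1
    if p1 < 3:                      # try ($p1 < 3 $p1 = 3)
        p1 = 3                      # may exceed a very short word; R1 is then empty
    return p1


if __name__ == "__main__":
    for w in ("opp", "avis", "skrike", "tre"):
        print(w, r1_start(w), repr(w[r1_start(w):]))
```

If that reading is right, "avis" would get p1 = 2 from the scan and then
be pushed to 3, so R1 would be just "s".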

Thanks:)

-- 
/ Ask Solem Hoel        | GAN Media             \
: +47 48054613          | +47 22707439          :
\ www.unixmonks.net     | www.gan.no/media      /



