RE: [Snowball-discuss] porter2 question

From: Reetz, Wendy (wreetz@greenapple.com)
Date: Fri Oct 04 2002 - 14:19:02 BST


Martin,

Ah, I missed that one, thanks.

No, I hadn't looked for a PHP version on the Snowball site. I had
already finished the first algorithm before I visited the site; I
really only went there in search of a set of words and their
appropriate stems to use as test data.

I will look into it, though.

Thanks,

Wendy

-----Original Message-----
From: Martin Porter [mailto:martin_porter@softhome.net]
Sent: Friday, October 04, 2002 8:51 AM
To: Reetz, Wendy
Cc: snowball-discuss@lists.tartarus.org
Subject: RE: [Snowball-discuss] porter2 question

> ... no 'OU' stripping in the 4th
>step any longer. Was that intentional?
>
>Wendy

Yes. In the old stemmer, -s is removed from -ous early on, so -ou- is
removed later to compensate. In the new stemmer, -s is not removed
after -u- (cactus, ferrous, locus, etc.), so -ous survives as an ending
until step 4.
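
The difference described above can be sketched roughly as follows. This
is only an illustration of the plural-handling step (step 1a) in each
stemmer, not the Snowball source: the function names old_step_1a and
new_step_1a are hypothetical, and the later steps (including step 4,
where -ou or -ous is finally removed) are left out.

```python
VOWELS = set("aeiouy")


def old_step_1a(word):
    """Sketch of step 1a in the original Porter stemmer.

    A bare final -s is always removed, so -ous becomes -ou here and
    has to be cleaned up by a later step.
    """
    if word.endswith("sses"):
        return word[:-2]          # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]          # ponies -> poni
    if word.endswith("ss"):
        return word               # caress -> caress
    if word.endswith("s"):
        return word[:-1]          # cats -> cat, ferrous -> ferrou
    return word


def new_step_1a(word):
    """Simplified sketch of step 1a in the newer (porter2) stemmer.

    Words ending in -us or -ss are left alone, so -ous is still
    intact when the later suffix-removal steps run.
    """
    if word.endswith("sses"):
        return word[:-2]
    if word.endswith(("ied", "ies")):
        # -> i if preceded by more than one letter, otherwise -> ie
        return word[:-2] if len(word) > 4 else word[:-1]
    if word.endswith(("us", "ss")):
        return word               # cactus, ferrous, locus keep the -s
    if word.endswith("s"):
        # Remove -s only if a vowel occurs before the letter
        # that immediately precedes the -s.
        if any(ch in VOWELS for ch in word[:-2]):
            return word[:-1]
    return word


for w in ("ferrous", "cactus", "locus", "gaps"):
    print(w, old_step_1a(w), new_step_1a(w))
```

With the old rule "ferrous" leaves this step as "ferrou", which is why
the old stemmer needs a later -ou removal; with the new rule it leaves
as "ferrous", so only -ous has to be handled later.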

There has been some PHP work on the stemmers by "dark panda". You can
find the relevant correspondence if you put "php" in the front page
search box, but maybe you have found it already.

Martin


