RE: [Snowball-discuss] porter2 question

From: Reetz, Wendy (wreetz@greenapple.com)
Date: Fri Oct 04 2002 - 13:38:01 BST


Martin,

Thanks for the info. I'll take the 'b' out of my list. :-) I'm about to
do the exceptions. I love this algorithm, very cool! The only other
difference I noticed is that step 4 no longer strips 'OU'. Was that
intentional?

Wendy

-----Original Message-----
From: Martin Porter [mailto:martin_porter@softhome.net]
Sent: Friday, October 04, 2002 5:38 AM
To: Reetz, Wendy
Cc: snowball-discuss@lists.tartarus.org
Subject: Re: [Snowball-discuss] porter2 question

Wendy,

You have spotted an error! 'b' should not be in the valid_LI list, and I
have now removed it. (See the updated page on the snowball website.)

This does not affect the working of the Snowball program, however: the
'among' expression goes for the longest string, so 'bli' was overriding
'li' when 'li' was preceded by a 'b'. In other words, 'b' in the
valid_LI list was redundant.
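
A minimal sketch of that longest-match behaviour, written in Python
rather than Snowball. It covers only the two step 2 rules discussed
here ('bli' and 'li') and skips the R1 region check that the full
algorithm applies, so it is an illustration under those assumptions,
not the real stemmer. The function name step2_li_sketch and its
valid_li parameter are hypothetical.

```python
# Letters that may precede 'li' for it to be removed (Porter2 valid li-ending).
VALID_LI = set("cdeghkmnrt")


def step2_li_sketch(word, valid_li=VALID_LI):
    """Apply only the 'bli' and 'li' rules, longest suffix first.

    Simplification: the R1 region condition of the full algorithm is omitted.
    """
    # Try suffixes longest first, as 'among' does; the first suffix that
    # matches decides the outcome and shorter suffixes are never consulted.
    for suffix in sorted(("bli", "li"), key=len, reverse=True):
        if not word.endswith(suffix):
            continue
        if suffix == "bli":
            return word[:-3] + "ble"
        # suffix == "li": delete it only after a valid li-ending letter.
        stem = word[:-2]
        if stem and stem[-1] in valid_li:
            return stem
        return word
    return word


# 'bli' wins over 'li', so adding 'b' to the valid list changes nothing.
print(step2_li_sketch("possibli"))                    # possible
print(step2_li_sketch("possibli", VALID_LI | {"b"}))  # possible
print(step2_li_sketch("quickli"))                     # quick
```

Because a word ending in 'bli' is always claimed by the 'bli' rule, the
'li' rule never sees a preceding 'b', which is why the extra entry had
no effect.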

Thank you for your help,

Martin

At 02:28 PM 10/3/02 -0400, Reetz, Wendy wrote:
>I just finished implementing the porter stemmer in PHP, then discovered
>porter2. So, I'm upgrading. :-)
>
>On step 2, though, you have added ' bli --> ble ' as well as 'li
>preceded by 'b' --> remove li '
>
>So, for words ending in 'bli', which is more correct: removing the li
>or changing the i to e?
>
>I know you say to take the longest first, but is that the intended
>action on 'bli' vs. 'li' preceded by a b?
>



This archive was generated by hypermail 2.1.3 : Thu Sep 20 2007 - 12:02:43 BST