Hi folks,
This is what the Porter2 definition
(http://snowball.tartarus.org/english/stemmer.html) says about part of
Step 4:
> Search for the longest among the following suffixes, and,
> if found and in R2, perform the action indicated.
>
> ... (removed the non-relevant part of step 4)
>
> ion
> delete if preceded by s or t
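As a rough illustration, the quoted "ion" rule on its own could be sketched in Python as below. This is only a sketch under stated assumptions: the function name `apply_ion_rule` and the `r2_start` parameter (the index where R2 begins) are hypothetical, and the longest-suffix search over the other step 4 suffixes is deliberately left out.

```python
def apply_ion_rule(word: str, r2_start: int) -> str:
    """Delete a final 'ion' if it lies in R2 and is preceded by 's' or 't'.

    Hypothetical helper: r2_start is the index at which R2 begins.
    Only the 'ion' entry of step 4 is modelled; the other suffixes and
    the "longest match wins" search are not handled here.
    """
    suffix = "ion"
    if not word.endswith(suffix):
        return word
    suffix_start = len(word) - len(suffix)
    in_r2 = suffix_start >= r2_start
    preceded_by_s_or_t = suffix_start > 0 and word[suffix_start - 1] in ("s", "t")
    if in_r2 and preceded_by_s_or_t:
        return word[:suffix_start]
    return word


if __name__ == "__main__":
    # With R2 starting at index 6 ("tion"), the rule fires on "unquestion".
    print(apply_ion_rule("unquestion", 6))
```

Applied in isolation to "unquestion" with R2 starting at index 6, this sketch removes "ion".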
When I feed the word "unquestionably" to my stemmer, it returns "unquest",
while the provided sample list of stemmed words shows it being stemmed
to "unquestion" (as does
http://snowball.tartarus.org/demo.php?words=unquestionably).
When step 4 kicks in, this is what the word looks like:
u n q u e s t i o n
    |       |
    |       R2------
    R1--------------
According to the Porter2 definition described on the site, "ion" should be
removed because it's preceded by a "t", and "ion" is located in R2.
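For reference, the region boundaries in the diagram can be reproduced with a short Python sketch. It assumes the definition's wording that R1 is the region after the first non-vowel following a vowel and that R2 is the same rule applied inside R1, with a, e, i, o, u and y as vowels. The function names are hypothetical, and the special-case words listed in the definition and the handling of consonantal y are not modelled.

```python
VOWELS = set("aeiouy")


def region_start(word: str, start: int = 0) -> int:
    """Index just past the first non-vowel that follows a vowel.

    The search only considers letters at or after `start`; if no such
    non-vowel exists, the region is empty and len(word) is returned.
    """
    for i in range(start + 1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return i + 1
    return len(word)


def r1_r2(word: str) -> tuple[int, int]:
    """Return the start indices of R1 and R2 (simplified, no special cases)."""
    r1 = region_start(word)
    r2 = region_start(word, r1)
    return r1, r2


if __name__ == "__main__":
    word = "unquestion"
    r1, r2 = r1_r2(word)
    print(word[r1:], word[r2:])
```

For "unquestion" this gives R1 = "question" and R2 = "tion", which matches the diagram.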
Have the step 4 rules been changed, or has the provided dictionary/stemmed
list (and demo) not been updated for the Porter2 method? What should I do?
Thanks
Best regards,
Håvard Lindset
This archive was generated by hypermail 2.1.3 : Thu Sep 20 2007 - 12:02:45 BST