Hi there:
I think I have found a bug, but I can't find any instructions for reporting it on http://snowball.sourceforge.net/
In the Spanish stemmer, the description of step 3 ("residual suffix") says:
e é delete if in RV, and if preceded by gu in RV delete the u
I can't read Snowball, but I think the implementation is more like "if preceded by 'gu', delete the u, where 'u' must be in RV, but 'g' can be outside RV".
I've tested the ANSI C stemmer and checked the example file spanish/diffs.txt: the word "pague" is stemmed to "pag", which means that both the 'e' and the 'u' have been removed. However, the 'g' of that "gu" is not in RV.
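To make the two readings concrete, here is a rough Python sketch of just the e/é rule. It is not taken from the Snowball source: the function names (rv_start, residual_e_strict, residual_e_loose) are made up for illustration, and RV is computed from my reading of the RV definition on the Spanish stemmer page, so treat it as an assumption rather than the actual implementation.

```python
# Illustrative sketch only; names are hypothetical and this is not the Snowball code.
VOWELS = set("aeiouáéíóúü")


def rv_start(word):
    """Index at which RV begins, per my reading of the RV definition."""
    n = len(word)
    if n < 2:
        return n
    if word[1] not in VOWELS:
        # Second letter is a consonant: RV starts after the next following vowel.
        for i in range(2, n):
            if word[i] in VOWELS:
                return i + 1
        return n
    if word[0] in VOWELS:
        # First two letters are vowels: RV starts after the next consonant.
        for i in range(2, n):
            if word[i] not in VOWELS:
                return i + 1
        return n
    # Consonant-vowel case: RV starts after the third letter.
    return min(3, n)


def _strip_final_e(word, rv):
    """Delete a final e/é if it lies in RV; return None when the rule does not apply."""
    if word[-1:] not in ("e", "é") or len(word) - 1 < rv:
        return None
    return word[:-1]


def residual_e_strict(word):
    """Reading A (the description as written): the whole 'gu' must be in RV."""
    rv = rv_start(word)
    stem = _strip_final_e(word, rv)
    if stem is None:
        return word
    if stem.endswith("gu") and len(stem) - 2 >= rv:
        stem = stem[:-1]  # drop the 'u'
    return stem


def residual_e_loose(word):
    """Reading B (what I suspect happens): only the 'u' must be in RV."""
    rv = rv_start(word)
    stem = _strip_final_e(word, rv)
    if stem is None:
        return word
    if stem.endswith("gu") and len(stem) - 1 >= rv:
        stem = stem[:-1]  # drop the 'u'
    return stem


word = "pague"
print(word[rv_start(word):])    # ue
print(residual_e_strict(word))  # pagu
print(residual_e_loose(word))   # pag
```

Under these assumptions RV for "pague" is "ue", so the strict reading leaves "pagu" while the loose reading gives "pag", which is what diffs.txt shows.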
Am I correct?
Thanks,
Ruben
This archive was generated by hypermail 2.1.3 : Thu Sep 20 2007 - 12:02:41 BST