Re: [Snowball-discuss] Problems eliminating stop words

From: richard@lemurconsulting.com
Date: Fri Sep 22 2006 - 18:00:53 BST


On Fri, Sep 22, 2006 at 06:21:00PM +0200, Alfredo Favenza wrote:
> I have a problem eliminating stop words using the Italian stemmer
> (Java version).
> In the output text file I notice that the algorithm doesn't eliminate
> stop words like il, lo, la, gli and others.
> Can someone help me with this problem?

The stemmers do not perform stop word removal.

We provide files of suggested stop words (see
http://snowball.tartarus.org/algorithms/italian/stop.txt for the Italian
one), but these are not integrated into the stemming algorithms. It is up
to you to write code to perform stop word removal separately.
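
As a rough illustration of what that separate step might look like in
Java, here is a minimal sketch. It is not part of the Snowball
distribution: the class name StopWordFilter and its methods are made up
for this example, and it assumes the stop word file has one word at the
start of each line, with anything after a "|" treated as a comment.
Check the file you download before relying on that layout.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Snowball: holds a stop word set and
// filters token lists against it.
final class StopWordFilter {
    private final Set<String> stopWords = new HashSet<>();

    // Builds the set from the lines of a stop word file.
    // Assumed layout: one word at the start of a line, optionally
    // followed by a "|" comment; blank and comment-only lines are skipped.
    StopWordFilter(Iterable<String> lines) {
        for (String line : lines) {
            int bar = line.indexOf('|');
            String word = (bar >= 0 ? line.substring(0, bar) : line).trim();
            if (!word.isEmpty()) {
                stopWords.add(word.toLowerCase(Locale.ROOT));
            }
        }
    }

    // Returns the tokens that are not stop words, in their original order.
    // Comparison is case-insensitive; the kept tokens are returned unchanged.
    List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

You would read the lines of stop.txt (for example with
java.nio.file.Files.readAllLines), build one StopWordFilter, and run
your tokens through filter() before handing the survivors to the
stemmer. Filtering first matters because the list contains unstemmed
words, so a stemmed token may no longer match its entry.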

-- 
Richard


