Hi Snowball developers,
1. many thx for this nice piece of code.
2. I have an improvement suggestion for the Java Snowball Stemmer: If you
want to use a SnowballStemmer you have to prepare something like this:

    String name = ...;
    Class stemClass = Class.forName("org.tartarus.snowball.ext." +
        name + "Stemmer");
    SnowballProgram stemmer = (SnowballProgram) stemClass.newInstance();
    Method stemMethod = stemClass.getMethod("stem", new Class[0]);
    stemMethod.invoke(stemmer, new Object[0]);

My question: Why doesn't the SnowballProgram class have an abstract stem()
method like

    public abstract class SnowballProgram {
        ...
        /**
         * Every derived <CODE>SnowballProgram</CODE> has to implement this
         * method to initialize the appropriate stem algorithm.
         */
        public abstract boolean stem();
        ...
    }

This will reduce the code shown above (and also makes it faster, reflection
is slow!):

    String name = ...;
    Class stemClass = Class.forName("org.tartarus.snowball.ext." +
        name + "Stemmer");
    SnowballProgram stemmer = (SnowballProgram) stemClass.newInstance();
    stemmer.stem();

This is also the standard design pattern "Abstract Factory" [Gamma et al, 1996],
which Java itself uses implicitly.
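
To make the proposal concrete, here is a minimal, self-contained sketch of how loading a stemmer by name could look once stem() is declared abstract. Everything in it is an assumption for illustration: the SnowballProgram shown is a reduced stand-in, not the real org.tartarus.snowball class, and DemoStemmer and StemmerLoader are hypothetical names. The class is still located by name, but stem() itself is an ordinary virtual call.

```java
// Hypothetical stand-in for org.tartarus.snowball.SnowballProgram,
// reduced to what this sketch needs.
abstract class SnowballProgram {
    protected String current = "";

    public void setCurrent(String word) { current = word; }
    public String getCurrent() { return current; }

    /** The proposed addition: every derived stemmer must implement stem(). */
    public abstract boolean stem();
}

// Toy stemmer (not a real Snowball algorithm): drops one trailing "s".
class DemoStemmer extends SnowballProgram {
    @Override
    public boolean stem() {
        if (current.length() > 1 && current.endsWith("s")) {
            current = current.substring(0, current.length() - 1);
        }
        return true;
    }
}

// Hypothetical helper: finds the class by name, then calls stem() directly.
class StemmerLoader {
    static SnowballProgram load(String name) {
        // Stands in for the "org.tartarus.snowball.ext." prefix: reuse whatever
        // prefix (package or enclosing class) SnowballProgram has in this sketch.
        String base = SnowballProgram.class.getName();
        String prefix = base.substring(0, base.length() - "SnowballProgram".length());
        try {
            Class<? extends SnowballProgram> stemClass =
                Class.forName(prefix + name + "Stemmer").asSubclass(SnowballProgram.class);
            return stemClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No stemmer named " + name, e);
        }
    }

    static String stemWord(String name, String word) {
        SnowballProgram stemmer = load(name);
        stemmer.setCurrent(word);
        stemmer.stem(); // plain virtual call, no getMethod()/invoke()
        return stemmer.getCurrent();
    }
}
```

With the toy stemmer above, StemmerLoader.stemWord("Demo", "cats") returns "cat". Reflection is still needed once per lookup to find the class; only the per-word stem() call becomes a direct method call.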
3. I would suggest renaming the stemmer classes according to ISO country
(http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html)
and language (http://www.loc.gov/standards/iso639-2/englangn.html) codes, so
that I can identify them via a java.util.Locale object, e.g.:
germanStemmer.java ==> de_DE.java or de_DE_Stemmer.java
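
As a rough sketch of what the renaming would enable, the snippet below derives a stemmer class name from a java.util.Locale. It assumes the de_DE_Stemmer naming variant from the example above and the package prefix used earlier in this mail; LocaleStemmerNames and classNameFor are hypothetical names, and no such classes exist in Snowball today.

```java
import java.util.Locale;

// Hypothetical helper: builds the class name a Locale would map to
// under the proposed <language>_<COUNTRY>_Stemmer naming scheme.
class LocaleStemmerNames {
    // Package prefix taken from the snippet earlier in this mail.
    static final String PACKAGE = "org.tartarus.snowball.ext.";

    static String classNameFor(Locale locale) {
        String language = locale.getLanguage(); // ISO 639 code, e.g. "de"
        String country = locale.getCountry();   // ISO 3166 code, e.g. "DE", or ""
        if (country.isEmpty()) {
            // Language-only locale: assume a language-only class name.
            return PACKAGE + language + "_Stemmer";
        }
        return PACKAGE + language + "_" + country + "_Stemmer";
    }
}
```

For Locale.GERMANY this gives "org.tartarus.snowball.ext.de_DE_Stemmer". The resulting name could then be passed to Class.forName() in the same way as in item 2.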
Pls let me know what you think.
Best regards
Jens