Snowball: Quick introduction
You can use this site at a number of levels:

- You can look at the stemming algorithm definitions themselves, and use them as templates for coding your own versions of stemmers in the computer language of your choice.
- You can use the various ANSI C and Java stemmers in programs of your own, without bothering yourself with the Snowball system that generated them. To do that, download either the C or the Java version of the libstemmer library, and follow the instructions contained in the README files within these tarballs. The tarballs also contain simple example programs which allow you to run the stemmers from the command line.
- You can get involved in Snowball itself. This is particularly worthwhile if you want to adjust the stemmers or develop new stemmers. A typical reason for adjusting the stemmers is that you are working with a different encoding of accented letters from the ISO Latin I encoding assumed in most of the scripts here. Then you need to make your own version of the Snowball compiler and work with the Snowball scripts.
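As an illustration of the first option, coding a stemmer by hand from an algorithm definition, here is a minimal sketch in Python of a single step (step 1a) of the original Porter algorithm. It is not output of the Snowball compiler and not code from this site; the function name `porter_step_1a` and the rule table `STEP_1A_RULES` are assumptions made for the example. Later revisions of the English stemmer refine these rules, so treat it as a sketch of the approach only.

```python
# Step 1a of the original Porter algorithm, written as ordered suffix rules.
# Rules are listed longest suffix first; the first match is applied and the
# remaining rules are skipped.
STEP_1A_RULES = [
    ("sses", "ss"),  # caresses -> caress
    ("ies", "i"),    # ponies   -> poni
    ("ss", "ss"),    # caress   -> caress (left unchanged)
    ("s", ""),       # cats     -> cat
]


def porter_step_1a(word: str) -> str:
    """Apply Porter step 1a to a lower-case word (hypothetical helper)."""
    for suffix, replacement in STEP_1A_RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word  # no rule matched, so the word is returned as-is


if __name__ == "__main__":
    for w in ["caresses", "ponies", "ties", "caress", "cats"]:
        print(w, "->", porter_step_1a(w))
```

A full stemmer chains many such steps, most of them with extra conditions on the remaining stem, which is the detail the algorithm definitions spell out.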