I did that search a few months back, and doing it again today, I
think I made a mistake. I found a page like this:
Which is on tartarus, but isn't necessarily snowball-related, at
As far as Japanese stemming goes, I can contribute linguistic
knowledge and pseudocode, but I don't have any experience writing
stemmers, and I don't 'speak' Snowball. Would anyone else out there
be interested in collaborating on a stemmer for Japanese?
In other words, I could probably brute-force one, but it would not be
rational or efficient.
On Jan 26, 2007, at 4:04 AM, Martin Porter wrote:
> I don't know of particular work in this area, but am broadly aware of
> the problems, which are (a) segmentation of text into words and (b)
> normalisation, of which something like stemming forms a part. The
> place to go for solutions is no doubt Japan itself. There are commercial
> solutions in the West though, with proprietary software from companies
> like Inxight and Teragram. Among all the major languages, Japanese
> presents the worst problems.
> I don't believe the Snowball site says anywhere that stemming doesn't
> matter for Japanese. Can you point to where you found this?
>> Does anyone know of any work being done on a Japanese stemmer? I
>> searched around this site, found a reference that said stemming
>> didn't matter for Japanese (err, ah...), but that was about it.
>> I'm not even sure where to go to look for rules on stemming Japanese.
>> Micah Bly