Re: [Snowball-discuss] Japanese stemmer?

From: Martin Porter (martin.porter@grapeshot.co.uk)
Date: Fri Jan 26 2007 - 10:04:20 GMT


Micah,

I don't know of particular work in this area, but am broadly aware of
the problems, which are (a) segmentation of text into words and (b) word
normalisation, of which something like stemming forms a part. The place
to go for solutions is no doubt Japan itself. There are commercial
solutions in the West though, with proprietary software from companies
like Inxight and Teragram. Among all the major languages, Japanese
presents the worst problems.
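
As a rough illustration of problem (a), here is a minimal Python sketch
that splits unspaced Japanese text wherever the script changes (kanji,
hiragana, katakana). This is not a Snowball algorithm or anything the
commercial products are known to do; the function names and the sample
sentence are assumptions made up for the example, and the code point
ranges cover only the basic Unicode blocks.

```python
def script_of(ch):
    """Classify one character by its basic Unicode block (illustrative only)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    return "other"


def naive_segment(text):
    """Split text at every change of script.

    Hypothetical helper: a crude heuristic, not a real word segmenter.
    """
    chunks = []  # list of (script, run of characters in that script)
    for ch in text:
        script = script_of(ch)
        if chunks and chunks[-1][0] == script:
            # Same script as the previous character: extend the current run.
            chunks[-1] = (script, chunks[-1][1] + ch)
        else:
            # Script changed: start a new run.
            chunks.append((script, ch))
    return [run for _, run in chunks]


# "I went to Tokyo", written without spaces as is normal in Japanese.
print(naive_segment("私は東京に行きました"))
# → ['私', 'は', '東京', 'に', '行', 'きました']
```

The heuristic cuts the single verb form 行きました into the kanji 行 and
the hiragana ending きました, so it gets the word boundary wrong at
exactly the point where problem (b), normalising inflected forms, would
have to begin.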

I don't believe the Snowball site says anywhere that stemming doesn't
matter for Japanese. Can you point to where you found this?

Martin

> Does anyone know of any work being done on a Japanese stemmer? I
> searched around this site, found a reference that said stemming
> didn't matter for Japanese (err, ah...), but that was about it.
>
> I'm not even sure where to go to look for rules on stemming Japanese.
>
> Micah Bly


