[Snowball-discuss] Mobile phone implementation of the English Stemmer

From: Alexandra Elizabeth Duncan (aed02@doc.ic.ac.uk)
Date: Thu Aug 28 2003 - 17:55:01 BST


Hi,
I wrote a couple of emails to this mailing list back in July - I am an
MSc student studying Computer Science at Imperial College, London. I
have just about completed my thesis/project which has been concerned
with writing a mobile phone translator (English to French and French to
English). Please excuse the length of this email, but I thought you
might be interested in the work I have done using the Porter algorithm.

Very briefly, there is a small dictionary of words stored as part of the
application on the mobile phone. A user inputs a word to be translated
and the application returns the translation if the word is found in the
phone dictionary. If the word is not in the dictionary, the application
queries a remote dictionary and returns the translation.
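
To illustrate the lookup flow described above, here is a rough sketch in Java. It is not the project's code: `PhoneTranslator` and `RemoteDictionary` are hypothetical names, the remote query is left as an interface rather than a real network call, and it uses modern Java collections that a 2003 phone runtime may not have offered.

```java
import java.util.Map;

/** Hypothetical stand-in for the remote dictionary service. */
interface RemoteDictionary {
    /** Returns the translation, or null if the word is unknown. */
    String lookup(String word);
}

/** Sketch of a local-first lookup with a remote fallback. */
class PhoneTranslator {
    private final Map<String, String> localDictionary;
    private final RemoteDictionary remote;

    PhoneTranslator(Map<String, String> localDictionary, RemoteDictionary remote) {
        this.localDictionary = localDictionary;
        this.remote = remote;
    }

    /** Tries the on-phone dictionary first and only then asks the remote one. */
    String translate(String word) {
        String key = word.toLowerCase();
        String local = localDictionary.get(key);
        if (local != null) {
            return local;
        }
        // Not stored on the phone: fall back to the remote dictionary.
        return remote.lookup(key);
    }
}
```

With a local map containing `cat -> chat`, calling `translate("Cat")` is answered from the phone, while any other word is passed to whatever `RemoteDictionary` implementation is supplied.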

Given the constrained system requirements of mobile phones, I have had
to work at compressing the words to be stored on the phone. For this I
used the Porter algorithm and the Java implementation from the website.
The words that make up the dictionary are stemmed and stored on the
phone. When the user inputs a word, that word is then stemmed (using
the Java implementation modified slightly for the mobile phone) and then
matched against the stemmed words in the dictionary.
By doing this, I was able to get about 25% compression on the English
words I had.
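
The stem-then-match idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `Stemmer`, `ToySuffixStemmer` and `StemmedDictionary` are hypothetical names, and the toy stemmer only strips a few suffixes as a stand-in for the Porter algorithm, which has many more rules.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stemmer interface; the real project used the Porter stemmer. */
interface Stemmer {
    String stem(String word);
}

/** Toy stand-in that strips a few common suffixes. NOT the Porter algorithm. */
class ToySuffixStemmer implements Stemmer {
    private static final String[] SUFFIXES = {"ing", "ed", "s"};

    public String stem(String word) {
        String w = word.toLowerCase();
        for (String suffix : SUFFIXES) {
            // Keep at least three characters so short words are left alone.
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }
}

/** Dictionary keyed on stems, so inflected forms share a single entry. */
class StemmedDictionary {
    private final Stemmer stemmer;
    private final Map<String, String> entries = new HashMap<>();

    StemmedDictionary(Stemmer stemmer) {
        this.stemmer = stemmer;
    }

    /** Stores the translation under the stem of the English word. */
    void add(String englishWord, String translation) {
        // Only the first translation seen for a given stem is kept.
        entries.putIfAbsent(stemmer.stem(englishWord), translation);
    }

    /** Stems the user's input and matches it against the stored stems. */
    String lookup(String userInput) {
        return entries.get(stemmer.stem(userInput));
    }

    int size() {
        return entries.size();
    }
}
```

In this sketch, adding "walk", "walked" and "walking" produces one stored entry, and a lookup of "walks" matches it. The saving comes from storing shorter keys and fewer of them; the cost is that the inflection of the input word is lost before the match.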

I only implemented stemming for the English words, so only the English
words are compressed. I did try to implement the French stemmer, but I
found it was too large for the mobile phone and more complicated than
the English one.

I would like to say thank you for the excellent and informative website
- it has been of great use to me in the past 3 months.

I was also wondering if you know of anyone who has implemented the
stemmer on a mobile phone. If not, this would lend my project a bit of
extra kudos, I have to say!

I will be finalising the code and writing the actual thesis in the next
2 weeks. If anyone is interested in the work that I have done on it,
please let me know as I would be more than happy to supply the code
and/or the report.

Thank you once again
Alex Duncan


