I'm sure the reason it was not done is that the group is so small. What you
would certainly need to do is to check for the ending -los, and not remove
the -s in that case. If you take the sample vocabulary provided with German,
you then get the following residual list:
ambros amos autos bartholomaios büros chaos credos fotos
haemorrheos heros hos infos jethros jos lebensmittelembargos
migros moos mythos pharaos platos salomos studios theophrastos
25 words in all. -s could be removed with benefit or without harm from all,
or almost all, of these words. There is some overlap here with your own word
list.
Thank you for pointing this out. I will review the German algorithm at some
point in the future, and possibly incorporate your suggestion.
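The check described above (remove a final -s after "o", but leave words ending in -los alone) might be sketched as follows. This is only an illustration in Python, not part of the German stemmer itself: the function name strip_os_plural is hypothetical, and the sketch ignores the region conditions and other steps a full stemmer would apply.

```python
def strip_os_plural(word: str) -> str:
    """Hypothetical helper: drop a final -s after 'o', except in -los words."""
    lowered = word.lower()
    # Words ending in -los (los, haarlos, ...) keep their -s.
    if lowered.endswith("los"):
        return word
    # Any other word ending in -os loses the final -s.
    if lowered.endswith("os"):
        return word[:-1]
    # Everything else is left untouched.
    return word


if __name__ == "__main__":
    for w in ["autos", "büros", "studios", "haarlos", "los"]:
        print(w, "->", strip_os_plural(w))
```

Under this rule "autos", "büros" and "studios" lose the -s, while "haarlos" and "los" are unchanged.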
>I'm wondering if there is a good reason for the German stemmer not to
>suffix strip the s in words ending in 'os'.
>Autos, kinos, echos, büros, silos, pianos, etc.
>Here are some words you can consider.
>Albatros, apropos, chaos, epos, kosmos, gros, rigoros, grandios, los, haarlos.
>All I can think of will be pretty much ok suffix stripped.
This archive was generated by hypermail 2.1.3 : Thu Sep 20 2007 - 12:02:47 BST