Re: [Snowball-discuss] Stemming 'communing' and 'communed'

From: Martin Porter (martin.porter@grapeshot.co.uk)
Date: Thu Mar 29 2007 - 12:39:46 BST


You are right: the spec is not clear on this, and I will have to alter
it (I'll try to do so in the next few days). One way of looking at it
is that in commun-, gener-, the first vowel is treated as a consonant,
so that cXmmun, gXner become short words. Another way of looking at it
is to say that a short word is defined as something ending with a
short syllable entirely outside R1.

So perhaps R1 and R2 should be defined first, and then shortness
defined in terms of R1.

(Although the spec makes no use of R1 in defining 'short', the snowball
script uses R1 to determine whether something is short, so there is a
connection.)
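
To make the second formulation concrete, here is a minimal Python sketch
(not the Snowball script itself). It assumes the usual definitions from
the English stemmer description: vowels are a e i o u y, R1 starts after
the first non-vowel that follows a vowel, and a short syllable is either
non-vowel + vowel + non-vowel (the last not w, x or Y) or a two-letter
word of vowel + non-vowel. The function names and the use_exceptions
flag are hypothetical, only the commun- and gener- prefixes mentioned in
this thread are handled, and the marking of consonantal y as Y is left
out.

```python
VOWELS = set("aeiouy")
# Only the two prefixes discussed in this thread (an assumption of the sketch).
SPECIAL_PREFIXES = ("commun", "gener")


def r1_start(word, use_exceptions=True):
    """Return the index where R1 begins; len(word) means R1 is empty."""
    if use_exceptions:
        for prefix in SPECIAL_PREFIXES:
            if word.startswith(prefix):
                # With the exception, R1 is whatever follows the prefix.
                return len(prefix)
    for i in range(1, len(word)):
        # First non-vowel that directly follows a vowel.
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return i + 1
    return len(word)


def ends_in_short_syllable(word):
    """Simplified short-syllable test on the end of the word."""
    n = len(word)
    if n >= 3:
        c1, v, c2 = word[-3], word[-2], word[-1]
        if (c1 not in VOWELS and v in VOWELS
                and c2 not in VOWELS and c2 not in "wxY"):
            return True
    if n == 2:
        # Vowel at the beginning of the word followed by a non-vowel.
        return word[0] in VOWELS and word[1] not in VOWELS
    return False


def is_short(word, use_exceptions=True):
    """Short = ends in a short syllable that lies entirely outside R1."""
    r1_is_empty = r1_start(word, use_exceptions) >= len(word)
    return ends_in_short_syllable(word) and r1_is_empty
```

Under these assumptions, is_short("commun") and is_short("gener") are
True, while with use_exceptions=False both are False, because R1 is then
"mun" and "er" respectively. An ordinary case such as is_short("hop") is
True either way.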

The problem of course arose because these exceptions were added as an
afterthought.

Thanks for pointing this out,

Martin


