Lemma versus lexeme
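A lemma is the form of a word that you find as an entry in the dictionary. A lexeme is a unit of meaning, and can be more than one word. More exactly, a lexeme is the set of all forms that have the same meaning, while the lemma is the particular form that is chosen by convention to represent the lexeme. For example, "run", "runs", "ran" and "running" are forms of the same lexeme, and "run" is its lemma.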
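The relationship between a lexeme and its lemma can be sketched as a small data structure. This is only an illustration: the class name `Lexeme`, the helper `lemma_of` and the hand-written word lists are assumptions made for this example and are not taken from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    """A set of word forms that share one meaning (illustrative model)."""
    lemma: str         # the form chosen by convention to represent the lexeme
    forms: frozenset   # every form of the lexeme, including the lemma itself

    def __contains__(self, word):
        # Case-insensitive membership test: is `word` a form of this lexeme?
        return word.lower() in self.forms


# Hand-written examples; a real dictionary would hold many thousands of these.
MOUSE = Lexeme(lemma="mouse", forms=frozenset({"mouse", "mice"}))
RUN = Lexeme(lemma="run", forms=frozenset({"run", "runs", "ran", "running"}))


def lemma_of(word, lexemes):
    """Return the lemma of the first lexeme that contains `word`, else None."""
    for lexeme in lexemes:
        if word in lexeme:
            return lexeme.lemma
    return None
```

With these definitions, `lemma_of("mice", [MOUSE, RUN])` gives `"mouse"`, and a word that belongs to neither lexeme gives `None`.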
Morphology
In English, the lemma of a noun is the singular: e.g., mouse rather than mice. In languages with gender, the headword of regular adjectives and nouns is usually the masculine singular. If the language also has cases, the lemma is often the masculine singular nominative.
Difference between stem and lemma
In computational linguistics, a stem is the part of the word that never changes even when different forms of the word are used. A lemma is the base form of the word. For example, from "produced", the lemma is "produce", but the stem is "produc-". This is because the same stem is found in other words such as "production". When sound (phonology) is taken into account, the definition of the unchangeable part of the word is not so useful. "Produced" and "production" share the spelling "produc-", but this part is pronounced differently in the two words.
Some lexemes have several stems but one lemma. For instance, "to go" (the lemma) has the stems "go" and "went". Here, the past tense "went" comes from a different verb, "to wend". Its "-t" suffix is equivalent to "-ed".
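The difference between a stem and a lemma can be shown with a small sketch. It is a toy example and not the method of any real tool: the suffix list, the table `FORM_TO_LEMMA` and the functions `toy_stem` and `toy_lemma` are all made up for this illustration. Real stemmers and lemmatizers use many more rules and a much larger dictionary.

```python
# Toy suffix list, checked in order (an assumption made for this example).
SUFFIXES = ("tion", "ing", "ed", "es", "e", "s")

# Hand-written table from word forms to lemmas. Irregular forms such as
# "went" must be listed, because no suffix rule can turn them into "go".
FORM_TO_LEMMA = {
    "produced": "produce",
    "produces": "produce",
    "producing": "produce",
    "went": "go",
    "goes": "go",
    "gone": "go",
}


def toy_stem(word):
    """Cut off the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def toy_lemma(word):
    """Look the form up in the table; unknown words are returned unchanged."""
    word = word.lower()
    return FORM_TO_LEMMA.get(word, word)


for w in ("produced", "production", "went"):
    print(w, toy_stem(w), toy_lemma(w))
```

In this sketch "produced" and "production" both get the stem "produc", but their lemmas stay different ("produce" and "production"). The stemmer cannot link "went" to "go", while the lookup table can.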
References
- Nation, Paul & Waring, Robin 1997. Vocabulary size, text coverage and word lists. In Schmitt, Norbert & McCarthy (eds) Vocabulary: description, acquisition and pedagogy. Cambridge University Press, p9. ISBN 978-0-521-58551-4
- "Natural Language Toolkit — NLTK 3.4 documentation". www.nltk.org.