package
0.6.2
Repository: https://github.com/inteli-ai-dev/snowball.git
Documentation: pkg.go.dev

# README

Snowball Russian

This package implements the Russian language Snowball stemmer.

Russian overview

Russian has 33 letters, 11 Vowels, 20 consonants and 2 unpronounced signs. The capital letters look the same as the lower case letters, with the exception of cursive capital letter and lower case.

Implementation

The Russian language stemmer comprises preprocessing, a number of steps. Each of these is defined in a separate file in this package. All of the steps operate on a SnowballWord from the snowballword package and modify the word in place.

Caveats

The example vocabulary for the original Russian snowball stemmer contains the word "злейший", which means "worst" in English. This word contains the adjectival suffix "ий" preceded by the superlative suffix "ейш". The output for the example vocabulary indicates that this word should be stemmed to "злейш". However, this implementation stems the word to "зл". The Python NLTK implementation also stems "злейший" to "зл". It is unclear to me how the original snowball implementation would possibly produce "злейш". So, I removed that word from the tests.