Lemmatization algorithm?
Is there a lemmatization algorithm (albeit simplified and inaccurate) for the Russian language or a library for the JVM?
If stemming is not enough and you need more precision, look at AOT (aot.ru): it has ready-made lemmatization algorithms, open source code, and a description of the theory.
The Russian morphology analyzer for Lucene may also suit you: code.google.com/p/russianmorphology/
Lucene itself is not required.
// boilerplate omitted
private LuceneMorphology luceneMorphRus;
private String str = "Красивая";
// better to wrap this in a singleton: the operation is expensive!
luceneMorphRus = ResourceLoader.getLuceneRussianMorphology();
List<String> wordInfo = luceneMorphRus.getMorphInfo(str);
// analyze wordInfo
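The snippet's comment about wrapping the expensive load in a singleton could be done roughly like this. It is only a sketch: the `Lazy` class is a hypothetical helper, not part of the russianmorphology library, and it simply runs a factory once on first use and then returns the same instance.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not from any library): creates an expensive object
 * on first access and returns the same instance on every later call.
 */
final class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> factory;
    // volatile so that a fully built value is visible to other threads
    private volatile T value;

    Lazy(Supplier<? extends T> factory) {
        this.factory = factory;
    }

    @Override
    public T get() {
        T result = value;
        if (result == null) {
            // double-checked locking: only one thread runs the factory
            synchronized (this) {
                result = value;
                if (result == null) {
                    result = factory.get();
                    value = result;
                }
            }
        }
        return result;
    }
}
```

With the code above it might be used as `static final Lazy<LuceneMorphology> MORPH = new Lazy<>(ResourceLoader::getLuceneRussianMorphology);` and then `MORPH.get().getMorphInfo(str)`, assuming the `ResourceLoader` method declares no checked exceptions; if it does, the factory lambda would need to catch and rethrow them.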