Algorithms
skim1776, 2012-12-02 15:04:36

Lemmatization algorithm?

Is there a lemmatization algorithm for Russian (even a simplified, imprecise one), or a library for the JVM that does it?


3 answer(s)
becks, 2012-12-02

If stemming does not suit you and you need more precision, look at AOT (aot.ru): it has ready-made lemmatization algorithms, open source code, and a description of the theory.

MrMig, 2012-12-02

The Russian morphology analyzer for Lucene may be enough for you: code.google.com/p/russianmorphology/
It can be used without Lucene itself.

// boilerplate omitted

private LuceneMorphology luceneMorphRus;
private String str = "Красивая";

// better to wrap this in a singleton: the operation is expensive!
luceneMorphRus = ResourceLoader.getLuceneRussianMorphology();
List<String> wordInfo = luceneMorphRus.getMorphInfo(str);

// analyze wordInfo

becks, 2012-12-02

So, something like the Porter stemmer.
Here is an example in Java:
www.algorithmist.ru/2010/12/porter-stemmer-russian.html
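
To give a rough idea of what a stemmer of this kind does, here is a minimal sketch in Java. It is not the code from the linked article and not the full Porter (Snowball) algorithm for Russian: the class name SimpleRussianStemmer and the short list of endings are assumptions made for illustration. It only strips a reflexive suffix and then the longest matching ending, and only inside the part of the word that follows the first vowel.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical, heavily simplified suffix stripper for Russian.
// It illustrates the idea of stemming, not the complete Porter/Snowball rules.
final class SimpleRussianStemmer {
    private static final String VOWELS = "аеиоуыэюя";

    private static final String[] REFLEXIVE = {"ся", "сь"};

    // Small hand-picked sample of endings (an assumption, far from complete).
    private static final String[] ENDINGS = sortedByLengthDesc(
        // adjective endings
        "ими", "ыми", "его", "ого", "ему", "ому", "ая", "яя", "ое", "ее",
        "ые", "ие", "ый", "ий", "ой", "ую", "юю", "ых", "их", "ым", "им",
        // verb endings
        "ать", "ять", "ить", "еть", "ешь", "ишь", "ет", "ит", "ют", "ут",
        "ят", "ат", "ла", "ли", "ло",
        // noun endings
        "ами", "ями", "ов", "ев", "ам", "ям", "ах", "ях", "ом", "ем",
        "а", "я", "о", "е", "ы", "и", "у", "ю", "ь", "й"
    );

    private SimpleRussianStemmer() {}

    /** Returns a crude stem: lowercases, then strips at most one reflexive suffix and one ending. */
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT).replace('ё', 'е');
        int rv = indexAfterFirstVowel(w);
        if (rv < 0) {
            return w; // no vowel at all: nothing to strip
        }
        String head = w.substring(0, rv); // never modified
        String tail = w.substring(rv);    // suffixes may only be removed from here
        tail = stripFirstMatch(tail, REFLEXIVE);
        tail = stripFirstMatch(tail, ENDINGS);
        return head + tail;
    }

    private static int indexAfterFirstVowel(String w) {
        for (int i = 0; i < w.length(); i++) {
            if (VOWELS.indexOf(w.charAt(i)) >= 0) {
                return i + 1;
            }
        }
        return -1;
    }

    // Removes the first suffix in the array that the text ends with, if any.
    private static String stripFirstMatch(String text, String[] suffixes) {
        for (String suffix : suffixes) {
            if (text.endsWith(suffix)) {
                return text.substring(0, text.length() - suffix.length());
            }
        }
        return text;
    }

    // Longer endings must be tried first so that "ыми" wins over "и".
    private static String[] sortedByLengthDesc(String... suffixes) {
        String[] copy = suffixes.clone();
        Arrays.sort(copy, (a, b) -> Integer.compare(b.length(), a.length()));
        return copy;
    }

    public static void main(String[] args) {
        String[] samples = {"Красивая", "красивого", "красивыми", "домами", "умываться"};
        for (String s : samples) {
            System.out.println(s + " -> " + stem(s));
        }
    }
}
```

With this sketch "Красивая", "красивого" and "красивыми" all reduce to "красив". Note that the result is a stem, not a dictionary form: to get a real lemma such as "красивый" you still need a morphological dictionary, which is what the libraries mentioned in the other answers provide.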
