Machine learning
Urukhayy, 2019-02-17 14:57:55

Are there freely available computational linguistics tools for analyzing sentences in Russian?

Are there freely available tools (in any programming language) that can analyze Russian sentences of varying complexity and transform them into some kind of model or ordered structure? Translation programs, for example, somehow analyze sentences and distinguish the meanings of phrases.
Ideally the tool would also determine what the entities do and how they relate, for example:
"A man gave his girlfriend 50 boxes of sweets in the bathhouse at the country house."
The algorithm must determine that:
- The girlfriend was in the bathhouse when the gift was given, and the bathhouse is at the country house (the bathhouse object is a field of the country house object)
- The boxes contain sweets (the boxes are an array)
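
To make the desired output concrete, here is a minimal sketch of the target structure for the example sentence. It only illustrates the data model described above and is not the output of any existing tool; every class and field name (`Bathhouse`, `CountryHouse`, `Box`, `GiftEvent`, `build_example`) is a hypothetical name chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bathhouse:
    # People who are inside the bathhouse at the time of the event.
    occupants: List[str] = field(default_factory=list)


@dataclass
class CountryHouse:
    # The bathhouse object is a field of the country house object.
    bathhouse: Bathhouse


@dataclass
class Box:
    # What a single box contains.
    contents: str


@dataclass
class GiftEvent:
    giver: str
    recipient: str
    boxes: List[Box]        # the boxes are an array
    location: CountryHouse  # where the event takes place


def build_example() -> GiftEvent:
    """Hand-build the structure an analyzer would ideally produce."""
    bathhouse = Bathhouse(occupants=["girlfriend"])
    country_house = CountryHouse(bathhouse=bathhouse)
    boxes = [Box(contents="sweets") for _ in range(50)]
    return GiftEvent(
        giver="man",
        recipient="girlfriend",
        boxes=boxes,
        location=country_house,
    )


event = build_example()
print(len(event.boxes), event.location.bathhouse.occupants)
```

The structure is filled in by hand here; the question is whether a tool can derive it from the sentence automatically.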


4 answers
Andrew, 2019-02-18
@Urukhayy

There is no ready-made solution, but you may be interested in the following projects:

  • Dostoevsky: a sentiment analysis library
  • Natasha: a library for finding and extracting named entities (named-entity recognition) from Russian texts. At the moment it handles references to persons, dates, and amounts of money.
  • Yargy: an Earley parser written in pure Python that uses Russian morphology for fact extraction
  • razdel: a rule-based library for splitting Russian text into tokens and sentences

As a follow-up:
https://github.com/yandex/tomita-parser
SyntaxNet (link to Habr) is a TensorFlow-based library that uses a neural network to determine syntactic dependencies. It currently supports 40 languages, including Russian.
UPD (03/17/2020):
  • Az.js: an NLP library for the Russian language
  • isanlp: natural language processing tools for English and Russian (POS tagging, syntax parsing, SRL, NER, language detection, etc.)
  • russiannames: Russian name parsers, gender identification, and processing tools
  • rulemma: a lemmatizer for Russian texts

Stalker_RED, 2019-02-17
@Stalker_RED

It looks like you are looking for a so-called "factograph"; the English term is "Information Extraction".
For Russian, as far as I understand, there are either commercial systems or fairly rough student projects. If you find an open-source project that understands Russian, that would be great.

McBernar, 2019-02-17
@McBernar

Facebook has released free pre-trained word vectors for many languages.
But they are not suitable for extracting facts, only for comparison and clustering.
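
To show what "comparison" means here, below is a rough sketch of comparing words by the cosine similarity of their vectors. The three-dimensional vectors are made up for illustration and are not taken from any released model (real pre-trained vectors have hundreds of dimensions); `cosine_similarity` and `most_similar` are hypothetical helper names.

```python
import math
from typing import Dict, List

# Toy vectors invented for this example; they are NOT real pre-trained values.
VECTORS: Dict[str, List[float]] = {
    "bathhouse": [0.9, 0.1, 0.0],
    "sauna": [0.8, 0.2, 0.1],
    "sweets": [0.0, 0.1, 0.9],
}


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word: str, vectors: Dict[str, List[float]]) -> str:
    """Return the other word whose vector is closest to the given word."""
    return max(
        (other for other in vectors if other != word),
        key=lambda other: cosine_similarity(vectors[word], vectors[other]),
    )


print(most_similar("bathhouse", VECTORS))
```

This tells you that two words are close in meaning, but nothing about who gave what to whom, which is why vectors alone do not solve fact extraction.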

Developer, 2019-02-17
@samodum

Nothing like this is publicly available.
