How to get a list of words frequently found in a text?
I have a text. I need to parse it and output a list of the words that occur in it, together with the number of occurrences of each. The search also has to be "smart": it should take word forms into account and report every word in its base (dictionary) form, for example the infinitive for verbs.
What is this procedure called, and is there a library for it? The language is not important, but PHP/Python/Java/Scala are preferred.
Have a look at Sphinx (sphinxsearch.com/).
Reducing a word form to its normal (dictionary) form is called normalization, also known as lemmatization; it is a task of morphological analysis. AOT (aot.ru) can also handle it well. For a GOOD search, you need to use a search engine (Sphinx and others). Sphinx returns per-word statistics along with the results.
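To make the idea concrete, here is a rough Python sketch of the count-after-normalization step. It is not how Sphinx or AOT work internally. The names `TOY_LEMMAS`, `toy_normalize` and `word_frequencies` are made up for illustration, and the tiny lookup table only stands in for a real morphological dictionary; in practice you would pass in a normalizer backed by an actual morphology library.

```python
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical toy table standing in for a real morphological dictionary.
TOY_LEMMAS: Dict[str, str] = {
    "ran": "run",
    "runs": "run",
    "running": "run",
    "cats": "cat",
}


def toy_normalize(word: str) -> str:
    """Return the dictionary form if the toy table knows it, else the word as-is."""
    return TOY_LEMMAS.get(word, word)


def word_frequencies(
    text: str,
    normalize: Callable[[str], str] = toy_normalize,
) -> List[Tuple[str, int]]:
    """Count words by normalized form, most frequent first."""
    # Lowercase, then take runs of letters (Unicode-aware, no digits or underscores).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    # Count the normalized form of each token rather than the raw token.
    counts = Counter(normalize(token) for token in tokens)
    return counts.most_common()


if __name__ == "__main__":
    for word, count in word_frequencies("The cat runs. Cats ran, and the cat is running."):
        print(word, count)
```

The normalizer is passed in as a parameter so that the toy table can be replaced by a real lemmatizer without touching the counting code.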