go
lucifer_jr, 2020-10-02 17:49:11

How to find the most frequently occurring word without taking into account word forms?

My idea: take a word, say "deer", and check whether it is already in the map by going through the keys and looking for a common substring. If more than, for example, 3 characters match, I treat it as the same word and increment the counter for that key; if not, I add the word to the map as a new key. Then I move on to the next word.

But here a problem arises: how can I be sure that two words are the same just because more than 3 characters of a substring match? That is not guaranteed at all. What should I do?
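
To make the problem concrete, here is a minimal sketch of the heuristic described above. It is simplified to compare leading characters only (a common prefix rather than an arbitrary substring), and the names `commonPrefixLen` and `sameWordByPrefix` are hypothetical, chosen just for this illustration.

```go
package main

// commonPrefixLen returns how many leading runes a and b share.
func commonPrefixLen(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	n := 0
	for n < len(ra) && n < len(rb) && ra[n] == rb[n] {
		n++
	}
	return n
}

// sameWordByPrefix is the heuristic from the question (an assumption-laden
// simplification): two words count as forms of one word when they share
// more than minShared leading characters.
func sameWordByPrefix(a, b string, minShared int) bool {
	return commonPrefixLen(a, b) > minShared
}
```

With a threshold of 3, `sameWordByPrefix("universe", "university", 3)` reports a match even though these are different words, while `sameWordByPrefix("cat", "cats", 3)` reports no match because only 3 characters are shared. Any fixed threshold gives both kinds of error, which is exactly the doubt raised above.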


1 answer(s)
alfss, 2020-10-03
@lucifer_jr

Take a look at https://github.com/kljensen/snowball (a stemming library for Go).
That is the direction you should be looking in.
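
A rough sketch of how stemming fits the original map-based plan: instead of comparing substrings, each word is reduced to a stem and the stem is used as the map key. To keep the example dependency-free, the stemmer is passed in as a function, and `naiveStem` is only a toy placeholder, not a real stemming algorithm. The names `Stemmer`, `naiveStem` and `MostFrequent` are hypothetical. With the library above, a small wrapper around its `snowball.Stem(word, "english", true)` call could presumably be passed instead; check its README for the exact signature.

```go
package main

import (
	"strings"
	"unicode"
)

// Stemmer maps a word form to a normalized key (its stem).
type Stemmer func(word string) string

// naiveStem is a toy stand-in for a real stemmer (assumption for this sketch):
// it lowercases the word and strips one trailing "s" from words longer than
// 3 bytes. It is only meant for simple ASCII English input.
func naiveStem(word string) string {
	w := strings.ToLower(word)
	if len(w) > 3 && strings.HasSuffix(w, "s") {
		return w[:len(w)-1]
	}
	return w
}

// MostFrequent splits text on non-letter characters, counts words by stem,
// and returns the most frequent stem with its count. Ties are broken by the
// lexicographically smaller stem so the result is deterministic.
func MostFrequent(text string, stem Stemmer) (string, int) {
	counts := make(map[string]int)
	words := strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	for _, w := range words {
		counts[stem(w)]++
	}
	best, bestN := "", 0
	for s, n := range counts {
		if n > bestN || (n == bestN && s < best) {
			best, bestN = s, n
		}
	}
	return best, bestN
}
```

For example, calling `MostFrequent("Cats chase a cat. The cat sleeps; dogs bark.", naiveStem)` from `main` groups "Cats" and "cat" under the key "cat". The map lookup stays a plain exact-key lookup, so there is no need to scan all keys for partial matches.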
