C++ / C#
p4p, 2016-10-24 12:35:12

How do I compare strings in C# and get the result as a percentage?

Say we have the following text:
Talk to me about trifles, Talk to me about
eternity. Let, like a child, on your hands Lie flowers born in the spring.


I need to compare it with the string "talk to me like a child" and find out the percentage of matches. In other words, I need to decide whether the given string is a "description" of the text. What is the best way to do this kind of comparison?


2 answer(s)
Mercury13, 2016-10-24
@p4p

So should this string count as a description or not? Let's assume it should.
1. Split both strings into words.
2. Optionally discard meaningless words and reduce every word to its base form. Admittedly, this is a hard task: without a dictionary, is "the general's daughter" noun + verb or noun + noun? And with a dictionary, is "already" a particle or a noun? (The examples come from Russian, where such ambiguities arise.) In general it is better to emit every possible base form, which gives a non-transitive "=": two words match if at least one of their base forms coincides. You may also want yofication or de-yofication (restoring or stripping the Russian letter ё): the first is harder, the second is prone to false positives.
3. Compute the Levenshtein distance between the two word arrays, treating words rather than letters as the indivisible units.
4. Convert it to a percentage. For example, the difference is d / max{|s1|, |s2|}, so the match percentage is 1 − d / max{|s1|, |s2|}.
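Steps 1, 3 and 4 can be sketched roughly as follows. This is a minimal C++ illustration (one of the two languages the question is tagged with; the structure should port to C# without surprises), not a finished solution. The names `splitWords`, `wordDistance` and `matchPercent` are made up for this example, the tokenizer assumes ASCII text, step 2 (base forms) is skipped entirely, and the score is taken as the match percentage 100 · (1 − d / max{|s1|, |s2|}).

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Step 1: split text into lowercase words. Anything that is not an ASCII
// letter or digit acts as a separator (assumption: ASCII input only).
std::vector<std::string> splitWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            current += static_cast<char>(std::tolower(ch));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Step 3: Levenshtein distance where each word is one indivisible symbol.
// Uses two rolling rows of the classic dynamic-programming table.
std::size_t wordDistance(const std::vector<std::string>& a,
                         const std::vector<std::string>& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t substitution =
                prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, substitution});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Step 4: match percentage, 100 * (1 - d / max(|s1|, |s2|)).
double matchPercent(const std::string& s1, const std::string& s2) {
    std::vector<std::string> a = splitWords(s1);
    std::vector<std::string> b = splitWords(s2);
    std::size_t longest = std::max(a.size(), b.size());
    if (longest == 0) return 100.0;  // two empty inputs count as identical
    double d = static_cast<double>(wordDistance(a, b));
    return 100.0 * (1.0 - d / static_cast<double>(longest));
}
```

For instance, `matchPercent("a b c d", "a b x d")` differs in one word out of four and gives 75. Note that when one input is much longer than the other, the longer one dominates the denominator, so a short phrase compared against a whole text will score low under this formula.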
You can go further: award extra points if the match is not only in the base form but also in the inflected form. Handle synonyms, both from a dictionary and hand-entered: for example, the well-known Milfgard came up with the game "Jackal" and, after studying the logs, manually added the synonym "Coyote". Do you need to handle spelling errors? Including a misspelling that turns one word into another, like "gin/genie"? Both are found in a bottle, but the first is drunk and the second grants wishes. :)
In general, this is a research task, and an important part of it is understanding where to stop and what is sufficient for the higher-level problem you are solving.
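The hand-entered synonym idea mentioned above could be handled as a normalization pass that runs on the word list before any comparison. The sketch below is only an illustration in C++: `SynonymTable` and `canonicalize` are hypothetical names, and the table contents are whatever you decide to enter by hand.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical hand-maintained table: variant word -> canonical word.
using SynonymTable = std::unordered_map<std::string, std::string>;

// Replace every word that has an entry in the table with its canonical
// form; words without an entry are passed through unchanged.
std::vector<std::string> canonicalize(const std::vector<std::string>& words,
                                      const SynonymTable& synonyms) {
    std::vector<std::string> result;
    result.reserve(words.size());
    for (const std::string& word : words) {
        auto it = synonyms.find(word);
        result.push_back(it != synonyms.end() ? it->second : word);
    }
    return result;
}
```

With a table containing the single entry `"coyote" -> "jackal"`, the word list `the coyote game` becomes `the jackal game`, so both spellings compare as equal afterwards.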

Mikhail, 2016-10-24
@dmitrievMV

Build a collection of the words in the text (including declensions?) and count how many of the string's words match.
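One possible reading of this suggestion, sketched in C++ under a few assumptions: the tokenizer handles ASCII only, declensions are not handled at all, and the score is defined as the share of the query's words that occur anywhere in the text. `toWords` and `overlapPercent` are names invented for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Split text into lowercase words; non-alphanumeric ASCII characters
// act as separators (assumption: ASCII input only).
std::vector<std::string> toWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            current += static_cast<char>(std::tolower(ch));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Percentage of query words that also occur somewhere in the text.
// Word order and repetition are ignored.
double overlapPercent(const std::string& text, const std::string& query) {
    std::vector<std::string> textWords = toWords(text);
    std::unordered_set<std::string> vocabulary(textWords.begin(),
                                               textWords.end());
    std::vector<std::string> queryWords = toWords(query);
    if (queryWords.empty()) return 0.0;  // nothing to match
    std::size_t hits = 0;
    for (const std::string& word : queryWords) {
        if (vocabulary.count(word) > 0) ++hits;
    }
    return 100.0 * static_cast<double>(hits) /
           static_cast<double>(queryWords.size());
}
```

On the text from the question, the query "talk to me like a child" scores 100 because all six of its words appear in the text. Since order is ignored, this measure is much more lenient than an edit-distance comparison.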
