X
X
xmoonlight2017-09-26 23:18:18
Algorithms
xmoonlight, 2017-09-26 23:18:18

Hashing a word with a tolerance for typing and/or spelling errors. How to do?

Hello.
For example, the task is to hash the word: " space "
so that for any 2 errors in the spelling of the word (incorrect 1-2 letters and/or 1-2 missing letters) so that the hash does not change and remains constant.
Anyone have any ideas on how to do this? => hash:kgtsbrjnpmdl
Thanks in advance for all your help.

Answer the question

In order to leave comments, you need to log in

3 answer(s)
W
Wataru, 2017-09-27
@wataru

Here is an example of such a hash:

int hash(string s) {
  return 42;
}

You can return another number instead of 42, but always, always the same.
This is because many misspelled words overlap. For example, the strings "aaaa" and "aabb" should give the same hash. But in the same way, the terms "bbbb" and "aabb" should give the same hash. As a result, it turns out that all possible strings should give the same hash.
What is the original task? Why do you need such a hash? Surely something like searching for lines that matched 1-2 errors. In this case, it is necessary to generate by enumeration from the given string all possible ones with 1-2 errors, these strings are already stored somehow (for example, using a standard hash in a hash table).
Or you can compare strings in pairs, counting how many errors are needed to get another from one string. This is standard dynamic programming. Google edit distance.

�
âš¡ Kotobotov âš¡, 2017-09-27
@angrySCV

perhaps the graphs are better suited, then if you have an error, you can try other options for one of the nodes (letters).

V
Vladimir Olohtonov, 2017-09-28
@sgjurano

What you are looking for is called "fuzzy search".
Here is a good article on this topic:
https://m.habrahabr.ru/post/114997/

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question