Java
Andrey Ryabov, 2017-10-20 15:19:48

Text processing in Java?

There is a table of employees that stores each person's full name. What are the ways to find the right employee when the full name is entered with typos? Maybe something similar already exists?


2 answers
longclaps, 2017-10-20
@Vope

Levenshtein distance + Google.
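As a rough illustration of that suggestion, here is a minimal sketch of Levenshtein distance in plain Java, plus a linear scan that returns the stored name closest to a misspelled query. The class name `FuzzyNameSearch` and the method `closest` are made up for this example, and lowercasing before comparison is an assumption, not something the answer specifies.

```java
import java.util.List;
import java.util.Locale;

final class FuzzyNameSearch {

    /** Edit distance (insert, delete, substitute), computed with two DP rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the name with the smallest distance to the query, or null if names is empty. */
    static String closest(String query, List<String> names) {
        String q = query.toLowerCase(Locale.ROOT); // assumption: case does not matter
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String name : names) {
            int d = levenshtein(q, name.toLowerCase(Locale.ROOT));
            if (d < bestDistance) {
                bestDistance = d;
                best = name;
            }
        }
        return best;
    }
}
```

Calling `FuzzyNameSearch.closest("Andrey Ryabow", names)` compares the query against every stored name once, so the cost grows linearly with the size of the table.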

al_gon, 2017-10-20
@al_gon

Everything depends on the number of employees.
If they all fit in memory, try java-string-similarity as a makeshift search engine.
In general, all of these metrics compare one pair of strings at a time, so with 1K employees every lookup costs 1K comparisons.
For speed you need a search index. It does not have to be a full-fledged search engine, but it must implement the principles of an inverted index (example: Inverted_index#Java).
Since you are searching not by exact words but by words that may contain typos, you need a finer-grained unit than a word, namely the N-gram.
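The idea of an inverted index over N-grams could be sketched roughly as follows. This is only an illustration: the class `TrigramIndex`, the choice of N = 3, the space padding, and ranking by the number of shared trigrams are all assumptions made for this example, not the API of java-string-similarity or of any search engine.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class TrigramIndex {
    // trigram -> ids of the names that contain it (the inverted index)
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    /** Splits a lowercased, space-padded string into its set of 3-character grams. */
    static Set<String> trigrams(String s) {
        String padded = "  " + s.toLowerCase(Locale.ROOT) + " ";
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 3 <= padded.length(); i++) {
            grams.add(padded.substring(i, i + 3));
        }
        return grams;
    }

    /** Registers a name under every trigram it contains. */
    void add(String name) {
        int id = names.size();
        names.add(name);
        for (String gram : trigrams(name)) {
            postings.computeIfAbsent(gram, k -> new HashSet<>()).add(id);
        }
    }

    /** Returns up to limit names, ordered by how many trigrams they share with the query. */
    List<String> search(String query, int limit) {
        Map<Integer, Integer> shared = new HashMap<>();
        for (String gram : trigrams(query)) {
            for (int id : postings.getOrDefault(gram, Collections.emptySet())) {
                shared.merge(id, 1, Integer::sum);
            }
        }
        List<Integer> ids = new ArrayList<>(shared.keySet());
        ids.sort((x, y) -> shared.get(y) - shared.get(x)); // most shared trigrams first
        List<String> result = new ArrayList<>();
        for (int i = 0; i < ids.size() && i < limit; i++) {
            result.add(names.get(ids.get(i)));
        }
        return result;
    }
}
```

Only names that share at least one trigram with the query are touched, so a lookup no longer has to compare against every employee. The short candidate list can then be re-ranked with a precise metric such as Levenshtein distance.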
