Text Processing Automation
tatarrr95, 2019-08-14 07:21:52

How to recognize a first name and patronymic in text?

I am making a voice assistant that asks the user:

State your first name and patronymic

It seems easy to process, but clients often add their last name or a whole phrase to the answer, for example:
Ivanov Ivan Ivanovich
My name is Ivanov Ivan Ivanovich
or give their name informally, for example:
Ivanov Vanya

I need to detect that the phrase contains a first name and a patronymic and extract them. I am already converting speech to text, so I need to figure out how to work with the text. I thought of doing it with dialogflow, but as far as I understand, it handles Russian names poorly and does not understand patronymics at all. As for regular expressions, it is hard to find a dictionary of names with their colloquial variants.

6 answer(s)
d-stream, 2019-08-14
@d-stream

Nothing is 100% reliable, especially given the linguistic and cultural nuances: there may be no patronymic, the surname may be compound, the patronymic may be a chain of ancestors' names, and so on.

Developer, 2019-08-14
@samodum

I have such a service, but it is for internal use.
Once I finish the external API, I'll share it. Until then, I can't.
Send test names in the comments here and I'll post the results; it will help me debug.

Adamos, 2019-08-14
@Adamos

For your specific case, you can rely on patterns, such as the fact that nobody says a patronymic, if they have one, anywhere except right after the first name. Among Russians, it can usually be identified by the ending -vich / -vna. And a database of names tells you what can be a first name. And if that didn't work, why not just ask again? It's an interactive dialogue, after all ;)
Outside Russian traditions, the name-patronymic logic does not apply at all.
Richard Matthew Stallman will never admit to you that he is actually Danilych.
And any Croatian team roster will wreck your attempts completely (surnames ending in -ić look just like patronymics) ...
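
A minimal Python sketch of this idea, in case it helps: a word with a -vich / -vna ending counts as a patronymic only if the word right before it is in the name base, and otherwise the assistant asks again. `KNOWN_NAMES`, `extract` and `ask_until_understood` are made-up names for illustration, the name base is a toy one, and the list of replies stands in for the real voice dialogue. Real input would be Cyrillic rather than transliterated.

```python
from itertools import islice

# Hypothetical, tiny name base; a real one would hold thousands of entries.
KNOWN_NAMES = {"ivan", "anna", "pyotr", "maria"}
# The -vich / -vna endings mentioned above (transliterated).
ENDINGS = ("vich", "vna")


def extract(reply):
    """Return (name, patronymic) if the reply fits the pattern, else None."""
    words = reply.replace(",", " ").split()
    # Look at every pair of neighbouring words: (previous word, current word).
    for prev, word in zip(words, words[1:]):
        if word.lower().endswith(ENDINGS) and prev.lower() in KNOWN_NAMES:
            return prev, word
    return None


def ask_until_understood(replies, max_attempts=3):
    """Try successive replies (one per re-ask) until one can be parsed."""
    for reply in islice(replies, max_attempts):
        result = extract(reply)
        if result is not None:
            return result
    return None  # give up, e.g. hand the call over to a human
```

With this toy name base, `extract("Marcelo Brozovich")` returns `None` because "Marcelo" is not a known name, so the surname is not mistaken for a patronymic. That guard is only as good as the name base, of course.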

Kirill Gorelov, 2019-08-14
@Kirill-Gorelov

Alternatively, here is another option.
Collect a list of all the first names and surnames you can find on websites and put them in a database.
Then take your text and look up each word in the database. If a word is found, it is a first name or a surname. You can do the same for patronymics, but that seems rather difficult to me.
One problem is that the text will contain many words, which means many database queries, but that can be solved too.
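
One way to avoid a query per word, sketched in Python under some assumptions: load the lists from the database once into in-memory dictionaries and sets, so each word costs a single hash lookup. The data below is a made-up sample, and mapping informal forms such as "Vanya" to a full name is my own addition prompted by the question, not something the database is known to contain.

```python
# Hypothetical sample data; in practice this would be loaded once from the database.
# Keys are lowercase spoken forms, values are the full first name.
FIRST_NAMES = {"ivan": "Ivan", "vanya": "Ivan", "maria": "Maria", "masha": "Maria"}
SURNAMES = {"ivanov", "petrova"}


def tag_words(text):
    """Label each word as 'name', 'surname' or 'other' using in-memory lookups."""
    tags = []
    for word in text.split():
        key = word.strip(".,!?").lower()
        if key in FIRST_NAMES:
            # Third element is the full form of the (possibly informal) name.
            tags.append((word, "name", FIRST_NAMES[key]))
        elif key in SURNAMES:
            tags.append((word, "surname", None))
        else:
            tags.append((word, "other", None))
    return tags
```

For example, `tag_words("Ivanov Vanya")` labels "Ivanov" as a surname and "Vanya" as a name with the full form "Ivan". Words that are both a first name and a surname are tagged as names here, which is a simplification.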

Andrey Andreev, 2019-08-14
@b0nn1e

Open the wiki and look up the patronymic suffixes.
Split the whole phrase into words.
A word that ends in one of those suffixes is the patronymic.
The word before it is the first name.
A 98% working option)
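
The steps above can be sketched in Python roughly like this. The suffix list is my assumption, written in transliteration to match the examples in the question; real speech-to-text output would be Cyrillic ("ович", "евич", "овна", "евна", "ична"), and short forms like "Ilyich" are not covered. The function name is made up for illustration.

```python
import re

# Assumed list of transliterated patronymic suffixes; not exhaustive.
PATRONYMIC_SUFFIXES = ("ovich", "evich", "ovna", "evna", "ichna")


def split_name_patronymic(phrase):
    """Return (name, patronymic), or None if no patronymic-like word is found."""
    # Split into words made of letters only (works for Latin and Cyrillic).
    words = re.findall(r"[^\W\d_]+", phrase)
    for i, word in enumerate(words):
        # The patronymic needs a preceding word to serve as the first name.
        if i > 0 and word.lower().endswith(PATRONYMIC_SUFFIXES):
            return words[i - 1], word
    return None
```

On the examples from the question, "Ivanov Ivan Ivanovich" and "My name is Ivanov Ivan Ivanovich" both give `("Ivan", "Ivanovich")`, while "Ivanov Vanya" gives `None` because there is no patronymic to anchor on.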

Konstantin Tsvetkov, 2019-08-14
@tsklab

Impossible. Many surnames coincide with given names. For example, in a table of musicians (30,000 rows), matching Performer_1.FirstName = Performer_2.LastName gives (first name, last name, count):

James	James	4200
Martin	Martin	3800
John	John	3136
Paul	Paul	2442
Thomas	Thomas	2340
Scott	Scott	2183
David	David	1918
Michael	Michael	1712
Lee	Lee	1400
George	George	1284
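
To check how ambiguous your own name lists are, something like the following Python sketch could be used. It assumes the counts are the number of rows a self-join on first name = last name would produce, which is my guess at how such a table is obtained; the sample list and the function name are made up for illustration.

```python
from collections import Counter


def ambiguous_names(performers):
    """Map each name used both as a first and a last name to its join row count."""
    first_counts = Counter(first for first, _ in performers)
    last_counts = Counter(last for _, last in performers)
    # A self-join on FirstName = LastName yields one row per matching pair,
    # i.e. (times used as first name) * (times used as last name).
    shared = first_counts.keys() & last_counts.keys()
    return {name: first_counts[name] * last_counts[name] for name in shared}


# Small illustrative sample, not the 30,000-row table from the answer.
sample = [
    ("James", "Brown"),
    ("Etta", "James"),
    ("James", "Taylor"),
    ("Paul", "Simon"),
    ("Les", "Paul"),
]
print(ambiguous_names(sample))
```

Any name that shows up in the result cannot be classified as a first name or a surname by dictionary lookup alone; position in the phrase or a follow-up question is needed.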
