data mining
Ernest Faizullin, 2017-12-03 18:37:01

How to extract all dialogues from the text of a book or from the text of a movie script?

Hi. Could you explain in general terms how to extract all the dialogues from a text in question-answer format? For example, I want to train a chatbot on the book "Young Gentleman's Etiquette". Any programming language will do.
Thanks!
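A minimal sketch of one possible approach, in Python. It assumes that every line of dialogue starts with a dash and that consecutive dash lines belong to the same exchange; neither is guaranteed for a given book or script. The names `extract_replies` and `to_pairs` are made up for this illustration.

```python
import re

# Assumption: a line of dialogue starts with a hyphen, en dash, or em dash.
DASH_LINE = re.compile(r"^\s*[-\u2013\u2014]\s*(.+)$")

def extract_replies(text):
    """Group consecutive dash-led lines; each group is treated as one exchange."""
    groups, current = [], []
    for line in text.splitlines():
        match = DASH_LINE.match(line)
        if match:
            current.append(match.group(1).strip())
        elif line.strip():
            # A non-empty narration line ends the current exchange.
            if current:
                groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups

def to_pairs(groups):
    """Pair each reply with the reply that follows it in the same exchange."""
    return [(g[i], g[i + 1]) for g in groups for i in range(len(g) - 1)]

sample = """He entered the room.
- How are you?
- Fine, thanks.
- Will you stay for dinner?

The door closed.
- Who is there?
"""

pairs = to_pairs(extract_replies(sample))
for question, answer in pairs:
    print(question, "->", answer)
```

Each reply is used both as an answer to the previous line and as a prompt for the next one. A lone reply with no follow-up, like the last line of the sample, produces no pair.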
UPDATE: Let's say we've extracted all the lines that start with a dash. Is there any way to find out who says each phrase? If so, we could extract the dialogues involving a specific character, which would be cool. Alternatively, we could first parse all the main characters from the text and then, for each line of dialogue, take the first character name that occurs before it as the speaker, but this is not accurate.
The main characters could be found roughly like this: their names start with a capital letter and are repeated often in the text (the frequency threshold will depend on the length of the text), but this condition does not look reliable. We would need to filter out the verbs. Perhaps there are libraries that identify the part of speech; I will look for one. Or is it better to just compile the list of main characters by hand? That would probably be the most reliable.
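The heuristic above can be sketched roughly as follows, using only the Python standard library. This is an illustration under several assumptions, not a reliable method: the first word of each sentence is skipped (it is capitalized whether or not it is a name), the pronoun "I" is ignored, and the speaker is looked up only in the narration that follows the speech on the same line (as in "- Hello, - said Anna."). The function names and the `min_count` threshold are hypothetical.

```python
import re
from collections import Counter

IGNORED = {"I"}  # capitalized words that are never names (assumed, extend as needed)

def candidate_names(text, min_count=2):
    """Capitalized words that recur in the text, skipping sentence-initial words."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+\s+|\n+", text):
        words = re.findall(r"[A-Za-z']+", sentence)
        # Skip the first word: it is capitalized whether or not it is a name.
        counts.update(w for w in words[1:] if w[0].isupper() and w not in IGNORED)
    return [w for w, n in counts.most_common() if n >= min_count]

def attribute(line, names):
    """Guess the speaker from the narration that follows the speech on the same line."""
    body = line.strip().lstrip("-\u2013\u2014 ")
    parts = re.split(r"\s[-\u2013\u2014]\s", body, maxsplit=1)
    speech = parts[0].strip().rstrip(" ,")
    tail = parts[1] if len(parts) > 1 else ""
    speaker = next(
        (n for n in names if re.search(rf"\b{re.escape(n)}\b", tail)), None
    )
    return speaker, speech

sample = """Anna looked at Pierre.
- Good evening, - said Anna.
- You are late, - Pierre replied.
- I know.
Later Anna and Pierre left."""

names = candidate_names(sample)
for line in sample.splitlines():
    if line.lstrip().startswith("-"):
        print(attribute(line, names))
```

Lines with no narration tail, such as "- I know.", come back with `None` as the speaker; resolving those would need context from neighbouring lines or a hand-made list of characters, which is likely to be the more reliable option for a single book.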


1 answer(s)
Dimonchik, 2017-12-03
@dimonchik2013

From the credits, then you will train on them from books; but what you will be training is a network, not a bot.
