Books
Space, 2014-03-28 09:46:57

Is there an algorithm for reducing text?

Hello. Is there an algorithm for shortening a text, i.e. one that removes superfluous words (metaphors, repetitions, epithets, for example) while the meaning of the text stays clear?

3 answer(s)
lightcaster, 2014-03-28
@ruslite

This is called automatic text summarization.
There are open-source programs for it, although they are not very good and they target English :).
If you want to write your own, you need to define some criterion, for example entropy or perplexity, and try to minimize it. Alternatively, represent the text as a matrix and look for a low-rank reconstruction of it with minimal loss.
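
To make the "define a criterion and minimize it" idea concrete, here is a minimal sketch in Python. It is not the method of any particular program: the criterion chosen here, the KL divergence between the word distribution of the whole text and that of the summary, is just one assumed example of an entropy-style measure, and the names `split_sentences`, `tokenize`, `kl_divergence` and `summarize` are made up for illustration. Sentences are added greedily, each time picking the one that lowers the divergence the most.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased ASCII words only; a real tool would need a proper tokenizer.
    return re.findall(r"[a-z']+", sentence.lower())


def kl_divergence(doc_counts, summary_counts, smoothing=0.01):
    """KL(document || summary) over the document vocabulary, with additive smoothing."""
    doc_total = sum(doc_counts.values())
    summary_total = sum(summary_counts.values()) + smoothing * len(doc_counts)
    kl = 0.0
    for word, count in doc_counts.items():
        p = count / doc_total
        q = (summary_counts.get(word, 0) + smoothing) / summary_total
        kl += p * math.log(p / q)
    return kl


def summarize(text, max_sentences=2):
    """Greedily pick sentences so the summary's word distribution stays close to the text's."""
    sentences = split_sentences(text)
    doc_counts = Counter(w for s in sentences for w in tokenize(s))
    chosen = []
    summary_counts = Counter()
    while len(chosen) < min(max_sentences, len(sentences)):
        best_index, best_kl = None, float("inf")
        for i, sentence in enumerate(sentences):
            if i in chosen:
                continue
            candidate = summary_counts + Counter(tokenize(sentence))
            kl = kl_divergence(doc_counts, candidate)
            if kl < best_kl:
                best_index, best_kl = i, kl
        chosen.append(best_index)
        summary_counts += Counter(tokenize(sentences[best_index]))
    # Keep the selected sentences in their original order.
    return " ".join(sentences[i] for i in sorted(chosen))
```

Calling `summarize(text, max_sentences=3)` returns up to three of the original sentences, unchanged. Note that this drops whole sentences rather than individual epithets or metaphors, so it only partly answers the question as asked.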

ixon, 2014-03-28
@ixon

I remember a post on Habr where a journalist described an algorithm like this. It had a list of all sorts of words that can be removed without changing the meaning. The algorithm highlighted these words and showed them to him, so that he could remove them where appropriate and shorten the text.
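
A highlighter of that kind could look roughly like the sketch below. The word list in `CANDIDATES` is hypothetical (the list from the post is not known), and `highlight` is a name invented for this example. The function only marks candidates and leaves the decision to a person.

```python
import re

# Hypothetical candidate filler words; the actual list from the post is not known.
CANDIDATES = {"very", "really", "just", "actually", "basically", "quite"}


def highlight(text, open_mark="[", close_mark="]"):
    """Wrap every candidate word in markers so an editor can review it."""
    def mark(match):
        word = match.group(0)
        if word.lower() in CANDIDATES:
            return f"{open_mark}{word}{close_mark}"
        return word

    return re.sub(r"[A-Za-z']+", mark, text)
```

For example, `highlight("It is really quite simple.")` gives `It is [really] [quite] simple.`, and the author decides which of the marked words to cut.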

afiskon, 2014-03-29
@afiskon

The easiest way is to make a large list of regular expressions that replace or remove parts of the text, in the style of s/like //g, s/So //g and so on.
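
The same idea as a small Python sketch, using `re.sub` in place of sed's `s/.../.../g`. The patterns in `FILLER_PATTERNS` are made-up examples, not a tested list, and `shrink` is a hypothetical name. A real list would be much longer and specific to the language.

```python
import re

# Hypothetical (pattern, replacement) rules, applied in order.
FILLER_PATTERNS = [
    (r"\bSo,? ", ""),
    (r"\b(?:basically|actually|really|very) ", ""),
    (r"\bin order to\b", "to"),
]


def shrink(text):
    """Apply every rule to the whole text, like a chain of s/.../.../g commands."""
    for pattern, replacement in FILLER_PATTERNS:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    # Collapse any runs of whitespace left behind by the removals.
    return re.sub(r"\s{2,}", " ", text).strip()
```

With these rules, `shrink("So, we basically need to act in order to win.")` returns `we need to act to win.`. The rules are blind to context: the first one would also turn "and so on" into "and on", so each pattern needs checking against real text.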
