Programming
Ivan, 2017-02-05 05:05:46

How is Reading Mode Implemented in Browsers?

Many modern browsers have a so-called "reading mode" that keeps only the article text and its images on the page and removes all other formatting, advertising, and unrelated site blocks.
How is this feature implemented in software?
What algorithm can extract the article text from a web page, without the unrelated blocks, just as reliably?


1 answer(s)
xmoonlight, 2017-02-05
@iwqn

Parse the DOM tree and identify the block that contains the main content.
A simple approach is to filter by the amount of text in a single block at the current level of the DOM tree, ignoring repeated structures such as lists.
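
To make the idea concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. It is not how any particular browser does it; it just illustrates the text-amount heuristic described above. The tag sets (`SKIP`, `CONTAINERS`, `VOID`) and the helper names (`Node`, `TreeBuilder`, `own_text`, `extract_main_text`) are my own assumptions, chosen for illustration.

```python
from html.parser import HTMLParser

# Assumption: subtrees under these tags are treated as boilerplate or repeated lists.
SKIP = {"script", "style", "nav", "ul", "ol", "header", "footer", "aside"}
# Assumption: these tags are the candidate "blocks" that get scored.
CONTAINERS = {"div", "article", "section", "main", "td", "body"}
# Void elements never get a closing tag, so they are not pushed onto the tree.
VOID = {"br", "img", "hr", "input", "meta", "link", "area", "base",
        "col", "embed", "source", "track", "wbr"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # mix of Node objects and text strings, in document order


class TreeBuilder(HTMLParser):
    """Builds a very small DOM-like tree; tolerant of unclosed tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.cur = self.root

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        node = Node(tag, self.cur)
        self.cur.children.append(node)
        self.cur = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag; ignore stray end tags.
        node = self.cur
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.cur = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.cur.children.append(text)


def own_text(node):
    """Text at this block's level: skips nested containers and SKIP subtrees."""
    parts = []
    for child in node.children:
        if isinstance(child, str):
            parts.append(child)
        elif child.tag not in SKIP and child.tag not in CONTAINERS:
            parts.append(own_text(child))
    return " ".join(p for p in parts if p)


def iter_blocks(node):
    """Yields candidate container nodes, never descending into SKIP subtrees."""
    if node.tag in SKIP:
        return
    if node.tag in CONTAINERS:
        yield node
    for child in node.children:
        if isinstance(child, Node):
            yield from iter_blocks(child)


def extract_main_text(html):
    """Returns the text of the block with the most text at its own level."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    blocks = list(iter_blocks(builder.root))
    if not blocks:
        return ""
    return own_text(max(blocks, key=lambda n: len(own_text(n))))
```

Calling `extract_main_text(html_string)` returns the text of the winning block. Real pages need more than this: weighting by link density, class/id hints such as `content` or `sidebar`, and merging sibling blocks are common refinements, but the core step is still scoring blocks by how much plain text they hold.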
