Why does Jsoup not parse the information I need?
For several days I have been struggling with the fact that Jsoup does not parse the information I need from this article: https://zen.yandex.ru/media/id/5a9d345c1aa80c262cd...
I only need to display the number of views. In the browser's developer console this data is present, but when I parse the entire content of the page, Jsoup does not see it.
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class Parse {
    // Fetches the article page exactly as the server sends it; no scripts are run.
    private static Document getPage() throws IOException {
        String url = "https://zen.yandex.ru/media/id/5aabde78168a9112996a70a8/pishem-pervuiu-stroku-koda-na-javascript-5aabdfb8a815f13d161aaa67";
        return Jsoup.connect(url)
                .maxBodySize(0) // 0 means no limit on the response size
                .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
                .timeout(0) // 0 means wait indefinitely
                .get();
    }

    public static void main(String[] args) throws IOException {
        Document page = getPage();
        // first() returns null when no element matches the selector
        Element views_all = page.select("span[class=article-stat__count]").first();
        System.out.println(views_all);
    }
}
This question is asked about once a week on average, yet looking at the page source is enough to answer it. The element
<div class="article-stat__info article-stat__info_loaded">
and all of its children, including the number of views, are generated by JavaScript after the page has loaded. This data is not in the initial HTML. Jsoup works with the original data received from the server and does not execute JavaScript, so it cannot see article-stat__count.
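The "look at the page source" check can also be done in code before writing any selectors. Below is a minimal sketch that uses only the standard library. The class RawHtmlCheck, the method hasClass, and both HTML strings are hypothetical and made up for illustration; they are not the real Zen markup. The first string stands in for what the server might send, the second for what the browser shows once the scripts have run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RawHtmlCheck {
    // Matches double-quoted class attributes; enough for a quick check, not a real HTML parser.
    private static final Pattern CLASS_ATTR = Pattern.compile("class\\s*=\\s*\"([^\"]*)\"");

    // Returns true if any class attribute in the raw markup lists the given class name.
    static boolean hasClass(String rawHtml, String className) {
        Matcher m = CLASS_ATTR.matcher(rawHtml);
        while (m.find()) {
            for (String token : m.group(1).trim().split("\\s+")) {
                if (token.equals(className)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical markup as sent by the server: the counter is not there yet.
        String fromServer = "<div class=\"article-stat__info\"></div>";
        // Hypothetical markup after the page scripts have filled in the counter.
        String afterScripts = "<div class=\"article-stat__info article-stat__info_loaded\">"
                + "<span class=\"article-stat__count\">1234</span></div>";

        System.out.println(hasClass(fromServer, "article-stat__count"));   // false
        System.out.println(hasClass(afterScripts, "article-stat__count")); // true
    }
}
```

If the same check on the real response body returns false, no Jsoup selector will find the element, because Jsoup only sees that response body.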