Java
Evgeniy Kornyshev, 2019-07-30 22:48:28

How to parse HTML in Java using HtmlUnit or JSOUP?

Hello. I ran into the following problem when parsing sites: the get method in JSOUP and the corresponding mechanism in HtmlUnit return the source code of the page. The text content that I see in the browser is embedded somewhere in that source code, but I don't know how to extract it from there. Is it possible, using Java tools, to get the final HTML page with all of its text content in a readable form? Thanks in advance, I hope I explained this clearly.

1 answer(s)
Sergey c0re, 2019-07-31
@Kornyshev

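I think you need headless Chrome; see "Introduction to Headless Chrome". The text you see in the browser is most likely produced by JavaScript after the page loads, and JSOUP does not execute scripts, so you need a real browser engine to get the final HTML.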
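
For illustration, here is a minimal sketch of driving headless Chrome from plain Java, without extra libraries. It assumes Chrome or Chromium is installed and relies on Chrome's `--headless` and `--dump-dom` command-line flags, which print the DOM of the loaded page to standard output. The class name `HeadlessChromeDump`, its method names, and the 30-second limit are made up for this example, and the path to the browser binary differs per system, so treat this as a starting point rather than a finished solution.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs headless Chrome and captures the DOM it prints.
class HeadlessChromeDump {

    // Builds the command line. The binary name or path is an assumption
    // (for example "google-chrome", "chromium", or a full path to chrome.exe).
    static List<String> buildCommand(String chromeBinary, String url) {
        return List.of(chromeBinary, "--headless", "--disable-gpu", "--dump-dom", url);
    }

    // Starts Chrome, reads everything it writes to stdout, and returns it as text.
    static String renderedHtml(String chromeBinary, String url)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(chromeBinary, url))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // drop browser log noise
                .start();
        String html;
        try (InputStream out = process.getInputStream()) {
            // Blocks until Chrome closes its stdout, i.e. until the dump is complete.
            html = new String(out.readAllBytes(), StandardCharsets.UTF_8);
        }
        // After the output has ended, give the process a moment to exit cleanly.
        if (!process.waitFor(30, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("Chrome did not exit in time");
        }
        return html;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: HeadlessChromeDump <chrome-binary> <url>");
            return;
        }
        System.out.println(renderedHtml(args[0], args[1]));
    }
}
```

The returned string can then be handed to JSOUP with `Jsoup.parse(html)` and queried with selectors as usual. Pages that load their content with a delay may still come back incomplete with this approach; in that case a browser automation tool that can wait for specific elements is probably a better fit.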
