Need help parsing a WordPress site?
There is a site, and I need to parse the photo and title from each post, from the first page to the last. What frameworks will I need? Is it possible to get by with just jsoup? Are there any resources where I can find an approximate algorithm for going through articles and pages?
Hello!
1) Do you need to log in to the site to access the content? If so, read up on how to log in to a site using jsoup.
2) It doesn't matter which CMS you are parsing, WordPress or anything else.
3) jsoup cannot work with dynamic content (for example, AJAX pagination, content loaded on scroll, etc.), because it does not execute JavaScript. If the site has no dynamic content, jsoup alone is usually enough.
4) If there is dynamic content after all, look at Selenium driving a real browser (Firefox, Chrome, etc.).
5) "Are there any resources where you can find an approximate algorithm for going through articles and pages?"
The algorithm is just a loop: do {} while () or while () {}
On each page, collect the information (links) about the existing posts and add it to a List, then move on to the next page until there are none left.
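A minimal sketch of that loop, in case it helps. It uses only the standard library so the traversal itself is easy to see; `PageCrawler`, `Post`, `Page`, and `crawl` are names made up for this example, and the `fetcher` argument stands in for whatever actually downloads and parses one listing page.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper for illustration; none of these names come from jsoup.
class PageCrawler {

    /** One post: its title and the URL of its photo. */
    static final class Post {
        final String title;
        final String imageUrl;

        Post(String title, String imageUrl) {
            this.title = title;
            this.imageUrl = imageUrl;
        }
    }

    /** One listing page: the posts found on it and the URL of the next page (null on the last page). */
    static final class Page {
        final List<Post> posts;
        final String nextUrl;

        Page(List<Post> posts, String nextUrl) {
            this.posts = posts;
            this.nextUrl = nextUrl;
        }
    }

    /**
     * Walks the listing pages starting at startUrl and collects every post.
     * The loop stops when a page has no next link, or when the next link
     * points at a page that was already visited (guards against cycles).
     */
    static List<Post> crawl(String startUrl, Function<String, Page> fetcher) {
        List<Post> collected = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        String url = startUrl;
        while (url != null && visited.add(url)) {
            Page page = fetcher.apply(url);   // download and parse one page
            collected.addAll(page.posts);     // add its posts to the List
            url = page.nextUrl;               // move on to the next page
        }
        return collected;
    }
}
```

With jsoup, the fetcher would typically load the page with `Jsoup.connect(url).get()`, pick out the posts with `doc.select(...)`, read each title with `text()` and each photo with `absUrl("src")`, and find the next-page link with `doc.selectFirst(...)`. The CSS selectors depend entirely on the theme of the site, so you have to look them up in the page source yourself.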