Parsing
Pivacik, 2018-10-21 20:44:55

How do I parse HTML pages?

There is a site that requires logging in before you can view it.
After that, a list of products is available, each at a link like http://.....ru/product/id/71680
The catch is that some of the information is hidden in the browser window. However, if you open the developer panel (I tried this in Chrome), save the page from the Sources tab as 71680.html, and open it locally, all of the information is available.
How can I automate this process?
UPD: I just checked: there is no need to open the developer panel. It is enough to save the document as HTML only rather than as a complete page.


2 answer(s)
Oleg, 2018-10-21
@politon

If the information is visible in the page source, then it is most likely being hidden by JS.
When you parse the page, you get the entire document, including the hidden content.
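To illustrate the point above, here is a minimal sketch using only Python's standard library. The markup in SAMPLE is hypothetical (the real page structure is not known), and it assumes the content is hidden with an inline style. The helper names TextCollector and extract_text are made up for this example. Downloading the page behind the login is not shown; that would need an authenticated session first.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects every text node, whether or not CSS would display it."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep non-empty text that is not JavaScript or CSS source.
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return all text fragments found in the given HTML string."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical markup: the price block is hidden until a script reveals it.
SAMPLE = """
<html><body>
  <h1>Product 71680</h1>
  <div class="price" style="display:none">1500 RUB</div>
  <script>function show() {}</script>
</body></html>
"""

print(extract_text(SAMPLE))
```

The parser never applies CSS, so the text inside the display:none block comes back alongside the visible heading. The same function could be run over a locally saved file such as 71680.html.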

Aleksey Solovyev, 2018-10-21
@alsolovyev

How do you see the hidden information on the site? Presumably some JS code (a method) shows it after a click.
There are two solutions:
1. Understand how the method works and write your own code that calls it
2. Simulate a click
For example, this is how a click works in Selenium:

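from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()  # starts a local Firefox session
driver.get("http://www.google.ca")
element = driver.find_element(By.LINK_TEXT, "Gmail")  # find the link by its visible text
element.click()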
