D
D
DayOffProgrammer2020-08-03 19:22:12
Python
DayOffProgrammer, 2020-08-03 19:22:12

How to write Python code to parse from a specific page?

Perhaps this is not a difficult question, for those who fumble.

Task: It is
necessary to unload a number of texts from several specific pages, a specific site. For example, let's take: https://market.yandex.ru/product--smartfon-samsung...

You need to upload all reviews with the date, link, rating and author to Excel, or even better, directly to a Google spreadsheet, each in separate cell.
Moreover, this should be done not once, but regularly, and so that new ones do not repeat those already unloaded.

An additional complication is that in order to see the list of required texts, you also need to go through authorization on the site, without captcha, just a pair of login and password and then open the desired page from your account.

How realistic is this in python? And in the direction of which modules to look. I know a little about python, I have been doing it in my free time for a little over a year, before that I did not touch programming.

To everyone who answers - thank you very much.

Answer the question

In order to leave comments, you need to log in

3 answer(s)
S
Sergey Gornostaev, 2020-08-03
@DayOffProgrammer

This is realizum in python. Look towards requests, beautiful soup, selenium and the like.
PS In a year, you can become a programmer who knows a lot about python.

D
datka, 2020-08-04
@datka

BeautifulSoup
Selenium
requests
openpyxl
That's basically what you need. Read the documentation. Write the code.
You can also turn to freelancing.

I
IDzone-x, 2020-08-07
@IDzone-x

Well, to get html from a specific page (or several) the "request" library for scraping html "BeautifulSoup". To register and simulate human robots "Selenium". Well, in fact, further only study and practice :).

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question