Automation
SAMzz, 2019-03-01 03:26:24

How can I automatically scrape pages of a site and save the information to a table or database?

Please advise how, and with what tools, I can save information (text, links) from the pages of a site.
What is the problem?
Everything opens fine through the browser (preferably IE 10/11).
Part of the site is behind a password, so when I try various site-saving programs, nothing works: the initial page is downloaded instead of the ones I need.
There are a lot of pages, so saving or copying them manually would take a very long time.
Ideally everything should be neatly organized, say as an Excel table or a tab-delimited CSV file, with each block of information in its own cell/column and each new page on a new row.
The page addresses differ only by a number, so the numbers need to be enumerated and substituted automatically over a from-to range that I can specify. Alternatively, the links could be loaded from a txt file, for example, and the program or script would go through them.
For example: the page loads, the information is saved, +1 to the page number in the address, the next page opens, the information is saved, +1 to the number again, and so on.
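The "+1 to the page number" idea, and the alternative of reading links from a txt file, could be sketched roughly as follows. The URL template, the `{n}` placeholder, and both function names are assumptions made up for illustration; the real address pattern of the site is not given in the question.

```python
from pathlib import Path


def urls_from_range(template, first, last):
    """Return one URL per page number from first to last, inclusive.

    `template` is a hypothetical address with {n} where the number goes,
    e.g. "https://example.com/item?id={n}".
    """
    return [template.format(n=n) for n in range(first, last + 1)]


def urls_from_file(path):
    """Read one URL per line from a text file, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Either function returns a plain list of addresses, so the later download and parsing steps do not need to know whether the list came from a numeric range or from a file.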
Or should it be done in two stages?
First, download the HTML pages to the computer (this has not worked so far because of the password/login).
How can saving such pages to a computer be automated?
And then, how do I parse the HTML code to pull out the necessary data?
Or is it possible to read the data from a page (via its link) "on the fly" straight into an Excel spreadsheet, some kind of database, or a file?
Or can the information simply be read from the screen and saved, without the intermediate steps of saving to HTML and processing it?
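For the parsing stage, a minimal sketch using only the Python standard library might look like this. It pulls the visible text and the link addresses out of one page and writes one row per page to a tab-delimited file that Excel can open. The column layout (`url`, `text`, `links`) and the " | " separator are assumptions; the real pages would likely need more targeted extraction for each block of information.

```python
import csv
from html.parser import HTMLParser


class TextAndLinks(HTMLParser):
    """Collect visible text chunks and href values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self.links = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self._skip:
            self.texts.append(text)


def page_to_row(url, html):
    """Turn one page into a [url, text, links] row."""
    parser = TextAndLinks()
    parser.feed(html)
    parser.close()
    return [url, " | ".join(parser.texts), " | ".join(parser.links)]


def write_rows(rows, out_path):
    """Write a header plus one tab-delimited line per page."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["url", "text", "links"])
        writer.writerows(rows)
```

The same `page_to_row` function works whether the HTML comes from files saved earlier or from pages fetched on the fly, which is why the two approaches differ little in practice.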


2 answer(s)
Stalker_RED, 2019-03-01
@Stalker_RED

These site-saving programs usually have an authorization mechanism.
You can parse on the fly, or you can download everything first and then parse. There is no particular difference.
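If a ready-made downloader cannot get past the login, the authorization step can also be scripted. The sketch below assumes the site uses an ordinary HTML login form with cookie-based sessions, which the thread does not confirm; sites that use JavaScript logins, CSRF tokens, or Windows/HTTP authentication would need a different approach. The function names and the form field names in the usage comment are hypothetical.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar
from pathlib import Path


def make_logged_in_opener(login_url, form_fields):
    """POST the login form once; the opener keeps the session cookies."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    with opener.open(login_url, data=body, timeout=30) as resp:
        resp.read()
    return opener


def download_pages(opener, urls, out_dir):
    """Fetch each URL through the opener and save it as page_NNNNN.html."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(urls, start=1):
        with opener.open(url, timeout=30) as resp:
            data = resp.read()
        target = out / f"page_{i:05d}.html"
        target.write_bytes(data)
        saved.append(target)
    return saved


# Hypothetical usage; the URLs and field names must match the real login form:
#   opener = make_logged_in_opener(
#       "https://example.com/login",
#       {"login": "user", "password": "secret"},
#   )
#   download_pages(opener, ["https://example.com/item?id=1"], "pages")
```

If the saved files still contain the login page, the form field names or the login URL probably do not match what the browser actually sends; the browser's developer tools show the real request.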

SAMzz, 2019-03-01
@SAMzz

The thing is, the result is the same with different downloaders: the authorization page gets downloaded and the pages I need do not.
Something there is blocking it.
Any ideas?
Or maybe go the other way? Is there a way to step through the pages quickly from the browser itself (+1 to the page number each time) and save them to the local disk?
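One way to go through the browser is to drive it from a script, so the login happens in a real browser window and the script only steps through the page numbers. The sketch below takes any object with a `get(url)` method and a `page_source` attribute, which is what a Selenium WebDriver provides. Selenium is a third-party package and needs a matching browser driver installed; the URL template and function name are assumptions for illustration.

```python
from pathlib import Path


def save_pages_with_browser(driver, template, first, last, out_dir):
    """Open numbered pages in an already logged-in browser and save each one."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for n in range(first, last + 1):
        driver.get(template.format(n=n))  # +1 to the page number each pass
        target = out / f"page_{n}.html"
        target.write_text(driver.page_source, encoding="utf-8")
        saved.append(target)
    return saved


# Hypothetical wiring with Selenium (not part of the standard library):
#   from selenium import webdriver
#   driver = webdriver.Ie()  # or webdriver.Chrome() / webdriver.Firefox()
#   driver.get("https://example.com/login")
#   input("Log in in the browser window, then press Enter here...")
#   save_pages_with_browser(driver, "https://example.com/item?id={n}", 1, 50, "pages")
#   driver.quit()
```

Because the pages are loaded by the browser itself, this approach tends to work even when the login cannot easily be reproduced by a plain downloader, at the cost of being slower.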
