PHP
Mark, 2020-06-22 14:25:28

How to download sites with minimal visual damage using PHP?

I need to download sites with minimal loss of appearance (*); full functionality is not very important.

The question is not about particular sites but about arbitrary ones. Clearly, a more or less complete download of any site is hard to achieve, but I would like to at least raise the share of sites that come out well.

Questions:
1. Are there existing solutions that already handle appearance and download quality?
A typical example: if src contains a relative path, prepend the domain.
2. Is there anything I should pay attention to right away when developing such a solution?
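
A minimal sketch of the relative-path case from point 1, assuming the page URL is known and absolute. The function name `resolveUrl` is hypothetical, and the logic is a simplified take on standard URL resolution, not a complete implementation (for example, it does not preserve a trailing slash after a final `..`):

```php
<?php
// Hypothetical helper: resolve a possibly relative URL against the page URL.
function resolveUrl(string $base, string $rel): string
{
    // Absolute URLs and data:/mailto: style URIs already carry a scheme.
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $rel)) {
        return $rel;
    }
    $b = parse_url($base);
    $scheme = $b['scheme'] ?? 'http';
    // Protocol-relative reference such as //cdn.example.net/lib.js
    if (strncmp($rel, '//', 2) === 0) {
        return $scheme . ':' . $rel;
    }
    $root = $scheme . '://' . ($b['host'] ?? '')
          . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = $b['path'] ?? '/';

    // Split off ?query / #fragment so dot-segment removal only touches the path.
    $suffix = '';
    if (preg_match('/[?#]/', $rel, $m, PREG_OFFSET_CAPTURE)) {
        $suffix = substr($rel, $m[0][1]);
        $rel = substr($rel, 0, $m[0][1]);
    }

    if ($rel === '') {
        $path = $basePath;                 // only a query or fragment was given
    } elseif ($rel[0] === '/') {
        $path = $rel;                      // root-relative
    } else {
        // Relative to the "directory" of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $rel;
    }

    // Collapse "." and ".." segments.
    $out = [];
    foreach (explode('/', $path) as $seg) {
        if ($seg === '..') {
            array_pop($out);
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    return $root . '/' . ltrim(implode('/', $out), '/') . $suffix;
}
```

With a page at `https://example.com/blog/post.html`, a `src` of `img/a.png` would resolve to `https://example.com/blog/img/a.png`.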

Important criterion: the solution must be "single file". That is, it must not save additional CSS or JS files to the server. Instead it could, for example, substitute links that point to the donor site, or embed the code directly in the document (example: detect a CSS link => download the code => place it in the document between <style> tags => remove the link).
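
The detect => download => embed => remove chain could be sketched roughly like this with PHP's DOM extension. The name `inlineStylesheets` and the `$fetch` callable are assumptions for illustration: `$fetch` receives the raw `href` and returns the CSS text or `null`, so resolving relative paths and doing the actual HTTP request are left to it.

```php
<?php
// Hypothetical helper: replace <link rel="stylesheet"> with inline <style>.
// $fetch is an assumed callable: fn(string $href): ?string
function inlineStylesheets(string $html, callable $fetch): string
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid; collect parser warnings silently.
    $prev = libxml_use_internal_errors(true);
    // Note: without a charset declaration loadHTML() assumes ISO-8859-1.
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    // Copy the live node list first, because the tree is modified in the loop.
    $links = iterator_to_array($doc->getElementsByTagName('link'));
    foreach ($links as $link) {
        $rels = preg_split('/\s+/', strtolower($link->getAttribute('rel')), -1, PREG_SPLIT_NO_EMPTY);
        if (!in_array('stylesheet', $rels, true)) {
            continue; // icons, preloads and so on are left alone
        }
        $css = $fetch($link->getAttribute('href'));
        if ($css === null) {
            continue; // download failed: keep the original link
        }
        $style = $doc->createElement('style');
        $style->appendChild($doc->createTextNode($css));
        if ($link->hasAttribute('media')) {
            $style->setAttribute('media', $link->getAttribute('media'));
        }
        $link->parentNode->replaceChild($style, $link);
    }
    return $doc->saveHTML();
}
```

Passing the downloader in as a callable keeps the sketch independent of any particular HTTP client. URLs inside the downloaded CSS (`url(...)`, `@import`) would still point to relative paths and need the same treatment, which this sketch does not attempt.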

P.S.
The base tag used to help a lot with this, but using it is no longer an option.

* - that is, loading the files needed to display the HTML page, for example images and styles.
In other words, there should be no big visual difference from the original (roughly 70-80% similar).
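
For images, one way to stay "single file" is to embed them as base64 `data:` URIs. A rough sketch under the same assumptions as any such tool: `inlineImages` is a hypothetical name, and `$fetch` is an assumed callable that takes the raw `src` and returns `['bytes' => ..., 'mime' => ...]` or `null`.

```php
<?php
// Hypothetical helper: embed <img> sources as base64 data: URIs.
// $fetch is an assumed callable: fn(string $src): ?array{bytes: string, mime: string}
function inlineImages(string $html, callable $fetch): string
{
    $doc = new DOMDocument();
    // Collect parser warnings from non-valid markup silently.
    $prev = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        // Skip empty sources and images that are already embedded.
        if ($src === '' || strncasecmp($src, 'data:', 5) === 0) {
            continue;
        }
        $file = $fetch($src);
        if ($file === null) {
            continue; // download failed: leave the original src in place
        }
        $img->setAttribute(
            'src',
            'data:' . $file['mime'] . ';base64,' . base64_encode($file['bytes'])
        );
        // srcset would still point at remote files, so drop it.
        $img->removeAttribute('srcset');
    }
    return $doc->saveHTML();
}
```

Base64 inflates each image by about a third, so for large pages it may be more practical to rewrite `src` to an absolute URL on the donor site instead.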

2 answers
Andrey Gavrilov, 2020-06-22
@thexaver

Wget

Adamos, 2020-06-22
@Adamos

If the site actively uses AJAX or is built entirely on React/Vue/etc., you won't be able to "pump out" its content without emulating a browser.
As for what exactly counts as "loss of appearance", that is a question for fortune tellers.
