Dmitry Sokolov, 2021-01-22 01:06:59
Wget

Copying a site with wget or httrack: how to download scripts from external sites?

Hello!

I want to download a site like:

<!DOCTYPE HTML>
<html lang="ru">
<head>
....
<link rel="stylesheet" href="//google.com/css/style.min.css">
<script src="https://yandex.ru/js/script.min.js"></script>
....
</head>
<body>
....
<a href="/catalog-2/">Another page of this site</a>
....
</body>
</html>


I need it to avoid recursive downloading, so it must not follow this site's own links (it should not download "/catalog-2/"), BUT it must download all the stylesheets, fonts, images, and similar files from external sites (like the google and yandex styles and scripts in the example above).

I have re-read all the documentation and tried every parameter, but nothing works.


1 answer
ge, 2021-01-24
@gedev

Alternatively, you can parse out all the external links on the page (using grep/sed/awk, after downloading the page with curl, for example) and then loop over those links with wget. But at that point it is already turning into a full script.
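A rough sketch of that approach. The page is written to a local file here so the extraction step is self-contained; in practice you would fetch it first, e.g. `curl -s "$URL" > page.html`. The regex and filenames are illustrative assumptions, not a polished scraper:

```shell
# Sample page standing in for the curl download (curl -s "$URL" > page.html).
cat > page.html <<'EOF'
<link rel="stylesheet" href="//google.com/css/style.min.css">
<script src="https://yandex.ru/js/script.min.js"></script>
<a href="/catalog-2/">Another page of this site</a>
EOF

# Pull out href=/src= values, keep only external URLs (http(s):// or
# protocol-relative //host/...), and normalise the latter to https://.
grep -oE '(href|src)="[^"]+"' page.html \
  | sed -E 's/^(href|src)="//; s/"$//' \
  | grep -E '^(https?:)?//' \
  | sed -E 's|^//|https://|' > external-urls.txt

cat external-urls.txt

# Then fetch each asset (commented out to keep the sketch offline):
# while read -r u; do wget -nc "$u"; done < external-urls.txt
```

Note that relative links such as /catalog-2/ are filtered out, which matches the "no recursion into the site itself" requirement. As an aside, GNU Wget's own --page-requisites combined with --span-hosts may cover this case without any scripting, depending on the wget version.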
