Parsing a site (its content) from the web archive. How?
Good day to all!
Actually, the question is right in the title. What is the best way to pull content (or the site itself) from the web archive today?
If anyone has experience with this, please share your tips.
Thanks in advance.
P.S. Maybe there is a Python library for this? That would be even better.
If you want to copy everything, the tool for that is called Wayback Machine Downloader. If you want to parse, that is, take the pages apart, there are plenty of options, for example lxml (it seems to be used inside BeautifulSoup and Scrapy).
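Since the question asks about Python: here is a rough sketch of doing it by hand, not a definitive implementation. It assumes the Wayback Machine's public CDX endpoint for listing captures and the `id_` URL modifier for fetching a page without the archive toolbar. To keep it dependency-free it uses the standard library's `html.parser` instead of lxml for the "take it apart" step; lxml or BeautifulSoup could be swapped in there. All function names (`cdx_query_url`, `parse_cdx`, `snapshot_url`, `extract_links`, `archived_links`) are made up for this example.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine CDX server.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(site, limit=5):
    """Build a CDX query that lists captures of `site` with HTTP status 200."""
    params = {
        "url": site,
        "output": "json",
        "filter": "statuscode:200",
        "limit": limit,
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx(raw_json):
    """Turn the CDX JSON (one header row, then data rows) into a list of dicts."""
    rows = json.loads(raw_json)
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def snapshot_url(capture):
    """URL of one archived copy; 'id_' requests the page as originally captured."""
    return "https://web.archive.org/web/{timestamp}id_/{original}".format(**capture)


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag; stands in for lxml/BeautifulSoup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def fetch(url):
    """Download a URL and decode it leniently as UTF-8."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def archived_links(site, limit=5):
    """For each capture of `site`, return (timestamp, links found on the page)."""
    captures = parse_cdx(fetch(cdx_query_url(site, limit)))
    return [
        (c["timestamp"], extract_links(fetch(snapshot_url(c))))
        for c in captures
    ]
```

Calling `archived_links("example.com")` would hit the network and return one entry per capture. For a whole site rather than a handful of pages, a ready-made downloader is probably less work than this.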