go
Cavez, 2021-12-02 19:53:12

Parsing downloaded sites?

Good evening, everyone! We have a number of saved sites, each stored as a folder with all of its resources: React JS chunks, images, HTML files, and so on. We need to open them somehow and then pull out the data we need. The second part is more or less clear, but I have serious questions about the first.
How can useful information be extracted from data in this form? One idea was to serve each site folder through a local server and then use something like jsoup to get what is needed from there.


1 answer(s)
Dmitry Sviridov, 2021-12-03
@Cavez

To accomplish this, you will need to run a Go server for each site. Set it up so that index.html opens in a browser and its JS files load; then connect to it with Selenium (so that the JS is actually executed) and get the data from there. Most likely, this library will come in handy. By the way, you may not need to serve the files at all and may be able to hand them to Selenium directly; you would have to check.
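
As a rough illustration of the "one server per site" part, here is a minimal sketch using only the Go standard library. It assumes that every saved site is a subfolder of a hypothetical `./sites` directory and has an `index.html` at its top level; the directory name, the base port 8080, and the helper names `findSites` and `siteAddr` are made up for this example and do not come from the question. Each site gets its own port so that absolute paths inside the bundles (for example `/static/js/...`) resolve against that site's root.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"path/filepath"
)

// findSites returns the subfolders of root that contain an index.html.
// Folders without one are skipped.
func findSites(root string) ([]string, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	var sites []string
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		dir := filepath.Join(root, e.Name())
		if _, err := os.Stat(filepath.Join(dir, "index.html")); err == nil {
			sites = append(sites, dir)
		}
	}
	return sites, nil
}

// siteAddr gives the i-th site its own local address (basePort + i).
func siteAddr(basePort, i int) string {
	return fmt.Sprintf("127.0.0.1:%d", basePort+i)
}

func main() {
	// "./sites" is an assumed location for the saved site folders.
	sites, err := findSites("./sites")
	if err != nil {
		log.Fatal(err)
	}
	if len(sites) == 0 {
		log.Fatal("no site folders with an index.html found")
	}
	for i, dir := range sites {
		addr := siteAddr(8080, i)
		log.Printf("serving %s at http://%s/", dir, addr)
		// One static file server per site, each on its own port.
		go func(dir, addr string) {
			log.Fatal(http.ListenAndServe(addr, http.FileServer(http.Dir(dir))))
		}(dir, addr)
	}
	select {} // keep the process alive while the servers run
}
```

With the servers running, a browser driven by Selenium can be pointed at each logged URL in turn, so the page's JS runs before any data is read. If a site relies on client-side routing, the plain file server will return 404 for paths that are not real files, so a fallback to index.html may be needed.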
