go
Ivan, 2021-04-26 12:38:46

Can a site have some kind of protection against scraping?

I need to scrape a site in Go. I'm doing everything the usual way, but for some reason the response from http.Get doesn't contain all of the site's content, in particular the part I need. I write the response to a file to study its structure. In the browser, the content appears as soon as the link is opened, without clicking any buttons, and everything is visible in the browser's inspector. I tried selecting it with selectors, but nothing matches. What could be the cause, and how do I deal with it?

res, err := http.Get(Url)
if err != nil {
	log15.Error("getting response body with error", log15.Ctx{
		"url": Url,
		"err": err,
	})
	return
}
defer res.Body.Close()

// Create output file
outFile, err := os.Create("res.html")
if err != nil {
	log.Fatal(err)
}
defer outFile.Close()

// Copy data from HTTP response to file
_, err = io.Copy(outFile, res.Body)
if err != nil {
	log.Fatal(err)
}


2 answer(s)
vgrabkowot, 2021-04-30
@vgrabkowot

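Instead of http.Get, drive a real browser over the Chrome DevTools Protocol, for example with chromedp: https://github.com/chromedp/chromedp. http.Get only returns the HTML the server sends and does not run the page's JavaScript.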

ttlscr, 2021-05-20
@ttlscr

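Try disabling JS in your browser and opening the link ¯\_(ツ)_/¯ If the content you need disappears, it is rendered by JavaScript, and http.Get will not see it either.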
