go
calculator212, 2021-09-14 16:32:03

Why can't I download a web page with Cyrillic encoding correctly?

The problem is this: I make an ordinary GET request through a proxy (more precisely, through an API service that provides the proxy), receive the page, and save it to a file. When I open the file as Windows-1251, the Russian characters turn into gibberish. Another oddity: when I save the page from the browser it is 60 KB, but when I download it with Go or curl it is 70 KB. What could be the problem?

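package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httputil"
	"net/url"
	"os"
)

func main() {
	rawURL := "url_with_proxy_api"
	u, err := url.Parse(rawURL)
	checkError(err)
	client := &http.Client{}
	request, err := http.NewRequest("GET", u.String(), nil)
	checkError(err)

	// Print the outgoing request line and headers for debugging.
	dump, err := httputil.DumpRequest(request, false)
	checkError(err)
	fmt.Println(string(dump))

	response, err := client.Do(request)
	checkError(err)
	defer response.Body.Close()
	fmt.Println("Read ok")

	if response.StatusCode != http.StatusOK {
		fmt.Println(response.Status)
		os.Exit(2)
	}
	fmt.Println("Response ok")

	// O_TRUNC so that repeated runs overwrite the file instead of appending to it.
	f2, err := os.OpenFile("test2.html", os.O_WRONLY|os.O_TRUNC|os.O_CREATE, 0666)
	checkError(err)
	defer f2.Close()

	// Copy the body to the file byte for byte; no charset conversion happens here.
	_, err = io.Copy(f2, response.Body)
	checkError(err)
}

// checkError exits with status 1 on any error except io.EOF.
func checkError(err error) {
	if err != nil {
		if err == io.EOF {
			return
		}
		fmt.Println("Fatal error ", err.Error())
		os.Exit(1)
	}
}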
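
One way to narrow this down is to check what encoding the downloaded bytes are actually in. This is only a diagnostic sketch resting on an assumption the question does not confirm: that the proxy API returns the page as UTF-8 rather than Windows-1251. That would be consistent with both symptoms, since each Cyrillic letter takes two bytes in UTF-8 and one byte in Windows-1251. The helper names charsetOf and hasMultibyteUTF8 and the sample data are hypothetical; in the real program you would pass response.Header.Get("Content-Type") and the contents of test2.html.

```go
package main

import (
	"fmt"
	"mime"
	"unicode/utf8"
)

// charsetOf returns the charset parameter of a Content-Type header value,
// or "" if the header is malformed or declares no charset.
func charsetOf(contentType string) string {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ""
	}
	return params["charset"]
}

// hasMultibyteUTF8 reports whether body is well-formed UTF-8 that contains
// at least one multi-byte character (so it is not plain ASCII).
func hasMultibyteUTF8(body []byte) bool {
	return utf8.Valid(body) && utf8.RuneCount(body) < len(body)
}

func main() {
	// Hypothetical sample data standing in for the real header and file contents.
	header := "text/html; charset=UTF-8"
	utf8Body := []byte("Привет")                             // UTF-8 bytes
	cp1251Body := []byte{0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2} // the same word in Windows-1251

	fmt.Println(charsetOf(header))                            // UTF-8
	fmt.Println(len(utf8Body), hasMultibyteUTF8(utf8Body))     // 12 true
	fmt.Println(len(cp1251Body), hasMultibyteUTF8(cp1251Body)) // 6 false
}
```

If the saved file passes the UTF-8 check, opening it as UTF-8 instead of Windows-1251 should show readable text. If it does not, the cause lies elsewhere and the Content-Type header returned by the proxy is the next thing to look at.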
