Website scraper for Go?

Michail Wowtschuk, 2017-06-05 19:06:33

go

Could anyone who parses websites with Go share their experience?
In PHP I used cURL + simplexml_load_string for XML parsing
and cURL + preg_match_all for page parsing.
But in Go I don’t know where to start.
I have more or less figured out how to download pages:
url := "https://lenta.ru"
response, err := http.Get(url)
But what should I use for parsing? Some sites require parsing XML, while others are just HTML pages.
I need to collect data from about 50 news sites in parallel every 5 minutes.
Parsing should be fast and light on CPU and RAM,
because I want to run all the processes in parallel.

3 answer(s)

screen_sailor, 2017-06-05
@screen_sailor

This list includes Go libraries for web scraping and data processing:
https://github.com/lorien/awesome-web-scraping/blo...

Andrey Burov, 2017-06-05
@BuriK666

https://golang.org/pkg/regexp/
https://golang.org/pkg/encoding/xml/
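
As a rough illustration of how these two packages might be used, here is a minimal sketch: encoding/xml decodes an RSS-style feed into structs, and regexp pulls the <title> out of an HTML page, much like preg_match_all in PHP. The type and function names (rssFeed, rssItem, parseRSS, extractTitle) and the sample data are hypothetical, and the feed layout is assumed to be plain RSS 2.0 in UTF-8. Regular expressions are fragile on real-world HTML, so this is suitable only for simple, predictable pages.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"regexp"
	"strings"
)

// rssFeed mirrors only the parts of an RSS 2.0 document used here.
type rssFeed struct {
	Items []rssItem `xml:"channel>item"`
}

// rssItem is one news entry in the feed.
type rssItem struct {
	Title string `xml:"title"`
	Link  string `xml:"link"`
}

// parseRSS decodes an RSS document into a slice of items.
func parseRSS(data []byte) ([]rssItem, error) {
	var feed rssFeed
	if err := xml.Unmarshal(data, &feed); err != nil {
		return nil, err
	}
	return feed.Items, nil
}

// titleRe is compiled once at startup. The (?is) flags make it
// case-insensitive and let "." match newlines.
var titleRe = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)

// extractTitle returns the trimmed contents of the first <title>
// element, or "" if the page has none.
func extractTitle(html string) string {
	m := titleRe.FindStringSubmatch(html)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(m[1])
}

func main() {
	// Made-up sample data standing in for a downloaded feed and page.
	feed := []byte(`<rss><channel>
		<item><title>First</title><link>https://example.com/1</link></item>
		<item><title>Second</title><link>https://example.com/2</link></item>
	</channel></rss>`)

	items, err := parseRSS(feed)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, it := range items {
		fmt.Println(it.Title, it.Link)
	}

	page := "<html><head><title>Example news</title></head><body></body></html>"
	fmt.Println(extractTitle(page))
}
```

Compiling the pattern once with regexp.MustCompile and reusing it avoids recompiling on every page, which matters when many pages are processed in parallel.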

Michail Wowtschuk, 2017-06-05
@wowtschuk

Parser benchmarks in different languages:
https://habrahabr.ru/post/163979/comments/
https://github.com/seriyps/html-parsers-benchmark