Which library to choose for web scraping?

PHP

evilmolen, 2016-06-11 13:13:45

Please recommend a library for simple work with websites and subsequent parsing of their pages.
The parsing itself is currently done with XPath or Simple dom parser; what matters is fetching the data, logging in, and otherwise emulating a "real" person.
Required functionality: setting headers, handling cookies (saving them and setting them manually), sending POST requests (for example, for authorization), and so on. In general, flexible configuration is important. I used to work with the "Ultimate Web Scraper Toolkit", but its functionality no longer suits me and I want something more capable.
For now I have settled on Guzzle. So far it does everything I need, although it is not the fastest. PhantomJS, for example, worked faster, but it is not suitable for my case.
Can you recommend something based on your own experience?
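
For reference, the requirements listed above (custom headers, saved and manually set cookies, a POST for authorization) might look roughly like this with Guzzle. This is only a sketch, assuming Guzzle 6 or later installed through Composer; the site `example.com`, the `/login` and `/profile` paths, the form field names, the `lang` cookie and the helper names `makeClient` and `login` are all placeholders, not anything from a real project.

```php
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Cookie\FileCookieJar;
use GuzzleHttp\Cookie\SetCookie;

// Hypothetical helper: builds a client with default headers and a cookie jar
// that is written to $cookieFile when the jar object is destroyed.
function makeClient(string $cookieFile): Client
{
    // Second argument: also persist session cookies (those without an expiry).
    $jar = new FileCookieJar($cookieFile, true);

    // Setting a cookie manually; name, value and domain are placeholders.
    $jar->setCookie(new SetCookie([
        'Name'   => 'lang',
        'Value'  => 'en',
        'Domain' => 'example.com',
    ]));

    return new Client([
        'base_uri' => 'https://example.com', // placeholder site
        'cookies'  => $jar,
        'timeout'  => 10,
        'headers'  => [
            'User-Agent'      => 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36',
            'Accept-Language' => 'en-US,en;q=0.9',
        ],
    ]);
}

// Hypothetical helper: logs in with a form-encoded POST, then fetches a page
// that needs the session cookie. Paths and field names are assumptions.
function login(Client $client, string $user, string $password): string
{
    $client->post('/login', [
        'form_params' => ['login' => $user, 'password' => $password],
    ]);

    // The same jar is reused, so the session cookie is sent automatically.
    return (string) $client->get('/profile')->getBody();
}
```

Usage would be something like `$html = login(makeClient(__DIR__ . '/cookies.json'), 'user', 'secret');`, after which `$html` can be passed to whatever parser is in use.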

2 answers

evilmolen, 2019-04-25
@evilmolen

I will answer my own old question.
Over the years I have gone through a number of libraries and have not found anything better for my purposes than the Symfony DomCrawler component.
For me it is number one in terms of speed, memory consumption and extensibility.
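
A small sketch of what parsing with DomCrawler can look like, assuming `symfony/dom-crawler` is installed through Composer. The helper name `extractLinks` and the sample HTML are made up for illustration; the XPath style matches what the question already uses.

```php
<?php
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical helper: returns the text and href of every link in the HTML.
function extractLinks(string $html): array
{
    $crawler = new Crawler($html);

    // filterXPath() selects nodes; each() maps them to a plain PHP array.
    return $crawler->filterXPath('//a[@href]')->each(
        function (Crawler $node): array {
            return [
                'text' => trim($node->text()),
                'href' => $node->attr('href'),
            ];
        }
    );
}

// Sample input, not taken from any real site.
$html = '<html><body><a href="/a">First</a> <a href="/b">Second</a></body></html>';
print_r(extractLinks($html));
```

The HTML string can come from any HTTP client, for example the body of a Guzzle response cast to a string.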

Muhammad, 2016-06-13
@muhammad_97

https://github.com/imangazaliev/didom