Parsing
Jo Jo, 2019-09-19 08:57:19

Fast parser of server response codes for 1 million sites. Is PHP the right choice?

Good day. I'm going to build a simple but fast parser that collects server response headers and fetches the main pages of roughly 1 million sites. So far the choice has fallen on PHP, since it has the convenient cURL library (for fetching the main pages). What worries me is that a PHP script quickly runs into nginx's limits, so the parsing would have to be done in several iterations, which puts extra load on the database. And PHP is not really designed for such long-running tasks in the first place.
Ideally, the parsing should run every day. Do you think it's worth trying something else, or is PHP the right choice?
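
For context, fetching just the response code and headers of one site with PHP's cURL could look roughly like the sketch below (the URL is a placeholder; in practice it would come from your own list of sites):

<?php
// Minimal sketch: request only the response code and headers via a HEAD request.
function fetch_status(string $url): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // HEAD request: headers only, no body
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => true,  // include headers in the returned string
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_TIMEOUT        => 10,
    ]);
    $headers = curl_exec($ch);
    $code    = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    curl_close($ch);

    return ['code' => $code, 'headers' => $headers];
}

var_dump(fetch_status('https://example.com'));

Doing this sequentially a million times inside one web request is exactly where the timeout problem comes from, which is what the answers below address.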


5 answers
ArgosX, 2019-09-19
@ArgosX

For this case, consider Python, Node.js, or Go.

Vitaly, 2019-09-19
@vshvydky

In my opinion, this task is much better suited to Node.js.

Dmitry Shitskov, 2019-09-19
@Zarom

If speed is the concern, I would choose Golang: such tasks are easy to make asynchronous and to parallelize there.
Node would come second, but with something like µWebSockets.

Randewoo, 2019-09-19
@Randewoo

Google asynchronous cURL in PHP, and if possible run the PHP script from the console. When launched from the console there is no timeout limit, so those restrictions won't get in the way. I did this myself on a very weak VDS.
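
As a rough illustration of what "asynchronous cURL" means here: PHP's curl_multi_* functions let a single CLI script run a batch of requests concurrently. A minimal sketch, assuming the URL list, batch size and timeouts are placeholders you would tune yourself:

<?php
// Sketch: run a batch of requests concurrently with curl_multi from the CLI
// (where max_execution_time defaults to 0). The $urls array is a placeholder.
$urls = ['https://example.com', 'https://example.org'];

$mh = curl_multi_init();
$handles = [];

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_TIMEOUT        => 15,
    ]);
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Drive all transfers until every handle has finished.
do {
    $status = curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for socket activity instead of busy-looping
    }
} while ($running && $status === CURLM_OK);

foreach ($handles as $url => $ch) {
    $code = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    $body = curl_multi_getcontent($ch);
    echo "$url -> $code (" . strlen((string) $body) . " bytes)\n";
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

For a million URLs you would feed them in batches of a few hundred handles at a time rather than adding them all at once.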

Arthur, 2019-09-19
@ar2rsoft

For parsing you can run the script from the console, so where do nginx's limitations come in?
But in general, I join those voting for Go or Python.
