PHP
Orbb, 2017-07-03 20:49:19

How do I build a complete sitemap that contains only pages with a 200 response code?

Greetings,
I have a site on Bitrix with about 100 thousand pages.
Our SEO specialist decided that the sitemap should contain only pages that return a 200 server response; 301, 302, 404 and other codes do not belong there.
I wrote an agent and tested it on an infoblock with a few hundred entries, and everything worked perfectly. But as soon as I ran it on a selection of ~5000 entries, the script struggled. That is logical in principle: at 0.2 s per cURL request to each URL, it comes out to about an hour.
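Since most of that time is spent waiting for each response in turn, one option is to check URLs in parallel batches with PHP's `curl_multi_*` functions. The sketch below is only an illustration under assumptions that are not in the post: the function names (`fetchStatusCodesBatch`, `fetchStatusCodes`, `filterOkUrls`) and the batch size of 20 are hypothetical, it sends HEAD requests (some servers answer HEAD differently from GET), and it does not follow redirects so that 301/302 are reported as they are.

```php
<?php
// Hypothetical helpers; a sketch, not the agent from the post.

/** Check one batch of URLs in parallel; returns [url => status], 0 on failure. */
function fetchStatusCodesBatch(array $urls, $timeoutSeconds = 10)
{
    if ($urls === []) {
        return [];
    }
    $multi = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_NOBODY         => true,   // HEAD request: headers only
            CURLOPT_RETURNTRANSFER => true,   // do not echo anything
            CURLOPT_FOLLOWLOCATION => false,  // keep 301/302 visible
            CURLOPT_TIMEOUT        => $timeoutSeconds,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }
    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait for socket activity
        }
    } while ($running > 0 && $status === CURLM_OK);

    $codes = [];
    foreach ($handles as $url => $ch) {
        $codes[$url] = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
        curl_multi_remove_handle($multi, $ch);
    }
    return $codes; // handles are released when they go out of scope
}

/** Check all URLs, $batchSize at a time, so the site is not flooded. */
function fetchStatusCodes(array $urls, $batchSize = 20)
{
    $codes = [];
    $unique = array_values(array_unique($urls));
    foreach (array_chunk($unique, $batchSize) as $batch) {
        $codes += fetchStatusCodesBatch($batch);
    }
    return $codes;
}

/** Keep only the URLs whose status code is exactly 200. */
function filterOkUrls(array $codes)
{
    $ok = array_filter($codes, function ($code) {
        return $code === 200;
    });
    return array_keys($ok);
}
```

A call such as `filterOkUrls(fetchStatusCodes($urls))` would then give the list for the sitemap. The batch size is worth keeping modest, since the checker is hitting the same server that serves visitors.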
How would you solve a problem like this?
My current idea is to split this functionality into 3 scripts: the first selects the URLs one infoblock at a time, the second runs cURL over that array with a sleep between requests, also one infoblock at a time, and the third then combines everything into one .xml. I don't know how much faster that will be.
Or maybe cURL isn't needed at all...?
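For the third step of that plan (combining the checked URLs into XML), a minimal sketch could use PHP's `XMLWriter`, assuming the xmlwriter extension is available. The function names `buildSitemapXml` and `buildSitemapFiles` and the `sitemap-N.xml` file names are hypothetical. One detail matters at this scale: the sitemap protocol allows at most 50,000 URLs per file, so a site with about 100 thousand pages needs several files rather than a single .xml.

```php
<?php
// Hypothetical helpers; a sketch of the "combine into XML" step only.

const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';
const SITEMAP_MAX_URLS = 50000; // per-file limit from the sitemap protocol

/** Build one <urlset> document from a list of URLs. */
function buildSitemapXml(array $urls)
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->setIndent(true);
    $xml->startDocument('1.0', 'UTF-8');
    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', SITEMAP_NS);
    foreach ($urls as $url) {
        $xml->startElement('url');
        $xml->writeElement('loc', $url); // XMLWriter escapes & and <
        $xml->endElement();
    }
    $xml->endElement();
    $xml->endDocument();
    return $xml->outputMemory();
}

/** Split the URL list into files of at most SITEMAP_MAX_URLS entries. */
function buildSitemapFiles(array $okUrls)
{
    $files = [];
    foreach (array_chunk($okUrls, SITEMAP_MAX_URLS) as $i => $chunk) {
        $files['sitemap-' . ($i + 1) . '.xml'] = buildSitemapXml($chunk);
    }
    return $files; // [file name => XML string]
}
```

The resulting files would still need to be written to disk and listed in a sitemap index file, which is left out here.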


1 answer(s)
Dimonchik, 2017-07-03
@dimonchik2013

In general, this is solved on the site side, through the router.
But if you solve it from the outside, there is no way other than requesting every URL with cURL.
Or leave it as it is and clean up afterwards based on the logs.
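If the last option is read as using the web server's access log to find pages that do not return 200, it could be sketched roughly as below. This assumes the log is in the common/combined format used by default in nginx and Apache; the function name `collectNon200Paths` is hypothetical, and the latest log entry for a path is treated as its current status.

```php
<?php
// Hypothetical helper; assumes common/combined access log lines such as:
// 203.0.113.5 - - [03/Jul/2017:20:49:19 +0300] "GET /a/ HTTP/1.1" 404 162 "-" "UA"

/** Returns [path => last seen non-200 status] for GET/HEAD requests. */
function collectNon200Paths($logLines)
{
    $bad = [];
    foreach ($logLines as $line) {
        // Capture the request path and the 3-digit status code.
        $pattern = '/"(?:GET|HEAD) (\S+) HTTP\/[\d.]+" (\d{3}) /';
        if (preg_match($pattern, $line, $m) !== 1) {
            continue; // not a line in the expected format
        }
        $path = $m[1];
        $status = (int) $m[2];
        if ($status === 200) {
            unset($bad[$path]);     // page answers 200 again
        } else {
            $bad[$path] = $status;  // remember the latest bad status
        }
    }
    return $bad;
}
```

The returned paths could then be dropped from the sitemap on the next regeneration. Pages that nobody has requested never appear in the log, so this only catches problems after the fact.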
