PHP
fr1zzer, 2017-03-04 20:09:06

How do I add more pages to the parser?

I have this code:

<html>
<head></head>
<body>
<?php
function browser($url) {
$url="https://site.com/page?p=1";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)");
$html = curl_exec($ch);
curl_close($ch);
return $html;
}
preg_match_all('~<a class="qa_title_link" href="(.*?)">~is', browser($url), $text);
print implode('<br />', array_slice($text[1], 0, 20)); 
?>
</body></html>

How can I make ?p=1 in
$url="https://site.com/page?p=1";
become ?p=2 and so on, so that every page is parsed and the results from all pages are printed as one combined list?

2 answer(s)
4iloveg, 2017-03-05
@4iloveg

Remove the line $url="https://site.com/page?p=1"; from the function, so that it uses the URL passed in as its argument,
and then call it in a loop:

for($i = 1;$i<=10;$i++){
$results[] = browser("https://site.com/page?p=$i");
}
// then iterate over the array elements with foreach() and extract the links
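
The foreach() step mentioned in the comment might look roughly like the sketch below. To keep it runnable without network access, $results is filled with two made-up HTML strings standing in for the output of browser(); extract_links is a hypothetical helper name, and the regular expression is the one from the question.

```php
<?php
// Hypothetical helper: return the href values of the question links in one page.
function extract_links(string $html): array {
    preg_match_all('~<a class="qa_title_link" href="(.*?)">~is', $html, $m);
    return $m[1];
}

// Stand-in for the $results array that the for loop fills via browser().
$results = [
    '<a class="qa_title_link" href="/q/1">First</a><a class="qa_title_link" href="/q/2">Second</a>',
    '<a class="qa_title_link" href="/q/3">Third</a>',
];

// Walk over the downloaded pages and merge their links into one list.
$links = [];
foreach ($results as $html) {
    $links = array_merge($links, extract_links($html));
}

print implode('<br />', $links);
```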

But it would be better to look into curl_multi, which can download several pages concurrently.
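
As a rough illustration of the curl_multi approach, here is a minimal sketch. The function names page_urls and fetch_all, the 10-second timeout, and the page range are assumptions made for the example, not part of the answer; error handling is left out.

```php
<?php
// Hypothetical helper: build the list of page URLs for p = $first .. $last.
function page_urls(string $base, int $first, int $last): array {
    $urls = [];
    for ($i = $first; $i <= $last; $i++) {
        $urls[] = $base . '?p=' . $i;
    }
    return $urls;
}

// Hypothetical helper: download all URLs concurrently, return HTML keyed like $urls.
function fetch_all(array $urls): array {
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10); // assumed timeout in seconds
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh, 1.0); // wait for activity instead of busy-looping
        }
    } while ($running && $status === CURLM_OK);

    // Collect the bodies; a failed transfer yields an empty string here.
    $pages = [];
    foreach ($handles as $key => $ch) {
        $pages[$key] = (string) curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    return $pages;
}

// Usage (performs real requests, so it is left commented out):
// $results = fetch_all(page_urls('https://site.com/page', 1, 10));
```

The returned array can then be processed with foreach() in the same way as $results in the loop version.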

Pavel, 2017-03-13
@PavelFokeev

<html>
<head></head>
<body>
<?php
function browser($url) {
// $url="https://site.com/page?p=1";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)");
$html = curl_exec($ch);
curl_close($ch);
return $html;
}
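// Links from every page are collected here
$all_pages = array();
for($i = 1;$i<=10;$i++){
  // Fetch page $i and capture the href of each question link
  preg_match_all('~<a class="qa_title_link" href="(.*?)">~is', browser("https://site.com/page?p=$i"), $text);
  $all_pages = array_merge($all_pages, $text[1]);
  // Print the first 20 links of the current page
  print implode('<br />', array_slice($text[1], 0, 20)); 
}
// Print the links from all pages together
print implode("<br />", $all_pages);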
?>
</body></html>

The line
print implode('<br />', array_slice($text[1], 0, 20));
prints the result of parsing one page on each iteration of the loop, and
print implode("<br />", $all_pages);
prints the combined array after all pages have been parsed. Keep only one of the two if you do not want the links printed twice.
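
Both answers hard-code 10 pages. If the number of pages is not known in advance, one option is to stop at the first page that yields no links. The sketch below assumes the site returns a page without any qa_title_link anchors once p goes past the last page, which may not hold for every site. collect_links, $fetch and $maxPages are hypothetical names, and a fake fetcher is used so the example runs without network access; in real use the browser() function above could be passed instead.

```php
<?php
// Hypothetical helper: gather links page by page until a page has none.
// $fetch is any callable that takes a URL and returns the page HTML.
function collect_links(callable $fetch, int $maxPages = 100): array {
    $all = [];
    for ($i = 1; $i <= $maxPages; $i++) {
        $html = (string) $fetch("https://site.com/page?p=$i");
        preg_match_all('~<a class="qa_title_link" href="(.*?)">~is', $html, $m);
        if (empty($m[1])) {
            break; // no links found: assume we are past the last page
        }
        $all = array_merge($all, $m[1]);
    }
    return $all;
}

// Fake fetcher with two made-up pages; any other URL returns an empty string.
$fake = function (string $url): string {
    $pages = [
        'https://site.com/page?p=1' => '<a class="qa_title_link" href="/q/1">',
        'https://site.com/page?p=2' => '<a class="qa_title_link" href="/q/2">',
    ];
    return $pages[$url] ?? '';
};

print implode("<br />", collect_links($fake));
// With the real downloader: print implode("<br />", collect_links('browser'));
```

The $maxPages cap is only a safety limit, so that the loop ends even if the site keeps returning links for every page number.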
