PHP
Vitya Podpriklopolny, 2018-10-13 10:48:47

The encoding breaks when parsing + the regular expression matches only one result. Why?

Good afternoon. I decided to parse Eldorado, specifically the laptop prices. To start with, I want to output all the laptop prices, and later the products themselves (showing only the picture, name, price, and operating system).
There are several problems:
1. The site encoding breaks
Code:

<?php

function get_content($url) {
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  $res = curl_exec($ch);
  curl_close($ch);
  return iconv('UTF-8', 'Windows-1251//TRANSLIT//IGNORE', $res);
}

$file = get_content('https://www.eldorado.ru/cat/4005/');
//echo $file;
$pattern = '#<span class="discountPrice itemPrice">.+?<span class="rub">.+?</span></span>#s';
preg_match($pattern, $file, $matches);
print_r($matches);

?>

I am using cURL. When the content is stored in $file, the encoding is lost. I tried iconv, but then the Russian characters simply disappeared. How do I set the encoding so that the page is displayed as in the original?
2. The regular expression finds only one result
Prices in the HTML are marked up like this:
<span class="discountPrice itemPrice">29&nbsp;990<span class="rub"> .</span></span>

There are about 20 of them on the page. I search with the following expression:
#<span class="discountPrice itemPrice">.+?<span class="rub">.+?</span></span>#s
but for some reason it returns only one result.
3. Parsing on only one page
Suppose my expression works and all 20 prices on the page are successfully parsed and stored in the $matches array. How do I then parse the entire category, across all of its pages? Do I have to run the parser on each page in turn? Thanks a lot in advance!


1 answer(s)
Dimonchik, 2018-10-13
@dimonchik2013

1) What is the point of this conversion? UTF-8 -> Windows-1251: is it an ultra-cheap VPS with no room for extra bytes? Do everything in UTF-8, including the database.
2) Why not use preg_match_all? php.net/manual/en/function.preg-match-all.php
3) Think of it as a browser would do it: download every page, parse it, select what you need.
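
A minimal sketch of the three points above, not a definitive implementation. It assumes the page is served as UTF-8 (the iconv call in the question already treats the response that way), so the body is returned unconverted. The helper names extract_prices and collect_prices, the capture group added to the pattern, and the '?page=N' URL scheme are all hypothetical; the site's real pagination parameter has to be checked in a browser.

```php
<?php

// Fetch a page and return the body as-is, with no iconv step.
// Assumption: the site serves UTF-8; confirm via the Content-Type header or <meta charset>.
function get_content(string $url): string {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $res = curl_exec($ch);
    // The handle is released when $ch goes out of scope.
    return $res === false ? '' : $res;
}

// Collect every price on one page. preg_match_all gathers all matches,
// whereas preg_match stops after the first one.
function extract_prices(string $html): array {
    // Same pattern as in the question, plus a capture group around the number.
    $pattern = '#<span class="discountPrice itemPrice">(.+?)<span class="rub">.+?</span></span>#s';
    preg_match_all($pattern, $html, $matches);
    // $matches[1] holds the captured text of each match, e.g. "29&nbsp;990".
    return array_map(
        fn($p) => (int) preg_replace('/\D+/', '', html_entity_decode($p, ENT_QUOTES, 'UTF-8')),
        $matches[1]
    );
}

// Walk the category pages until one yields no prices.
// $fetch is any callable mapping a URL to HTML; '?page=N' is a hypothetical scheme.
function collect_prices(string $baseUrl, callable $fetch, int $maxPages = 50): array {
    $all = [];
    for ($page = 1; $page <= $maxPages; $page++) {
        $prices = extract_prices($fetch($baseUrl . '?page=' . $page));
        if ($prices === []) {
            break; // an empty page is taken as the end of the category
        }
        $all = array_merge($all, $prices);
    }
    return $all;
}
```

It could be called as print_r(collect_prices('https://www.eldorado.ru/cat/4005/', 'get_content')); when the result is echoed into a browser, sending header('Content-Type: text/html; charset=utf-8') first keeps the Russian text readable. Passing the fetcher as a callable also makes it possible to try the parsing on saved HTML without hitting the site.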
