PHP
ademar26, 2016-08-05 13:06:48

Why doesn't parsing work correctly?

I have a piece of code in which this loop over an array

foreach (($table_header->children) as $num => $th_row)
does not work correctly, and I need help finding the error. Instead of the article number, the product name ends up being assigned through $numer['model'].
private function get_price($subcat){
        $sql_file = fopen('base.sql', 'a+');

        $path_to_images = __DIR__ . _DS_ ."catalog". _DS_ ;
        $manufacturer_id = 25;
        // Split the subcategories into separate product groups
        // $table_of_goods: the overall product table
        // $table_header, $table_body: the table header and body, respectively
        $table_of_goods = new simple_html_dom();

        $html_sub = file_get_html($this->link . $subcat);
       // $html_sub = str_get_html($html_sub);
        //var_dump(get_class($html_sub));
        $description = new simple_html_dom();
        // Get the description shared by all products (excluding the specifications)
        $description = $html_sub->find('dl.tabs')[0]->find('dd.selected')[0]->find('div.tab-content')[0];

        //$description = $html_sub->find('dl.tabs dd.selected div.tab-content')[0];
        $description = htmlspecialchars($description->plaintext);



        //echo "<HR>";

        $table_of_goods = $html_sub->find('div#modal_table table.mod_t')[0]->children;

        $table_header = $table_of_goods[0]->find('tr')[0];
        $table_body = $table_of_goods[1]->children;
        $numer = array();
        // Extract the indices of the columns we need
        foreach (($table_header->children) as $num => $th_row) {
            //  echo "FIRST foreach<br>";
            //var_dump($th_row->plaintext);
            //var_dump($num);
            switch(trim($th_row->plaintext)){
                case 'Наименование':
               // case 'Наименование светильника':
                    $numer["name"] = $num;
                    break;
                case 'Фото':
                case 'Изображение':
                    $numer['image'] = $num;
                    break;
                case 'Артикул':
                    $numer['model'] = $num;
                    break;
            }

        }
        if (!(isset($numer['image']))){
            $numer['image'] = '-1';
            $image_path = $html_sub->find('div.product_info div.image div img')[0]->src;
        }
        //echo "<hr>";
        $rowspan = 1;
        // Extract the image and the information from the data cells

        foreach ($table_body as $table_line){
            // echo "second foreach<br>";
            // Walk through all table rows, pulling out third-level subcategories and products with their properties and images
            //var_dump($table_line->plaintext);
            $arr_of_tl = $table_line->children;
            if($rowspan == 1){
                $row_bool = 0;
            }
            else {
                $row_bool = 1;
                $rowspan--;
            }
            foreach ($arr_of_tl as $num => $value){
                // Match the cell to its header
                switch($num + $row_bool){

                    case $numer['image']: {
                        // Get the image cell and check its rowspan. Store the rowspan in a separate variable and,
                        // if it is not 1, add 1 to $num so that the cell order does not get shifted when reading the data.
                        if (isset($value->rowspan)) {
                            $rowspan = (int) $value->rowspan;
                            //var_dump($rowspan);
                        }
                        if (isset($value->find('img')[0])){
                            $image_path = $value->find('img')[0]->src;
                        }

                        //var_dump($image_path);
                        // $image = file_get_contents($image_path);
                        // file_put_contents($image["tmp_name"], $path_to_images);
                        //$image_addr = $path_to_images . $image;
                    }
                        break;
                    case $numer['model']: $article = $value->plaintext;
                        break;
                    case $numer['name']: $name = $value->plaintext;
                        break;

                }



            }
            if (isset($article) && $article !== '' && $article !== '  ') {
                $arr = explode("/", $image_path);
                $imagen = end(str_replace(" ","",$arr));
                //file_put_contents(__DIR__ . _DS_ . "catalog" . _DS_ . $imagen, $this->link . $image_path);
                file_put_contents('catalog\\' . $imagen, file_get_contents('http://tdme.ru' . $image_path));

                //INSERT INTO base (name, image, description, article) VALUES ('6', '1st Street', 'Los Angeles', 'Harry Monroe')
                //$parsed_string = 'UPDATE tdm_upload SET `name` = \'' . $name . '\' `image` = \'catalog/'.   $article .   '\' `description` = \''.   $description .   '\' WHERE `article` = \'' . $article . "'; \n";
                $parsed_string = 'INSERT INTO base (image, description, article) VALUES (\'catalog\\'.$imagen.'\', \''. $description .'\', \''.$article."'); \n";
                fwrite($sql_file, iconv("WINDOWS-1251", "UTF-8", "$parsed_string"));
            }

2 answer(s)
A person from Kazakhstan, 2016-08-05
@LenovoId

Can't you just use a ready-made library?
Here is a content parser: https://code.google.com/archive/p/phpquery/
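
To give a rough idea of what leaning on an existing parser looks like, here is a minimal sketch. It uses PHP's built-in DOMDocument and DOMXPath instead of phpQuery (a deliberate substitution, so the snippet runs without extra dependencies). The function name map_header_columns and the sample HTML are hypothetical; only the header labels come from the question.

```php
<?php
// Hypothetical helper: maps known header labels to their column index.
function map_header_columns(string $html, array $labels): array
{
    $dom = new DOMDocument();
    // The XML encoding hint makes DOMDocument read the input as UTF-8;
    // @ silences warnings caused by imperfect real-world markup.
    @$dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    $xpath = new DOMXPath($dom);

    $columns = [];
    $index = 0;
    // Cells of the first table row in document order.
    foreach ($xpath->query('(//table//tr)[1]/*[self::th or self::td]') as $cell) {
        $text = trim($cell->textContent);
        if (isset($labels[$text])) {
            $columns[$labels[$text]] = $index;
        }
        $index++;
    }
    return $columns;
}

// Sample markup is an assumption, not the real page.
$html = '<table class="mod_t"><thead><tr>'
      . '<th>Фото</th><th>Артикул</th><th>Наименование</th>'
      . '</tr></thead></table>';
$labels = [
    'Наименование' => 'name',
    'Фото'         => 'image',
    'Изображение'  => 'image',
    'Артикул'      => 'model',
];
$columns = map_header_columns($html, $labels);
print_r($columns);
```

Keeping the label-to-key table in one array means that a header which fails to match simply leaves its key absent, which is easy to detect with isset() before the rows are read.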

Dmitry, 2016-08-06
@another_dream

Check right after this loop:

foreach (($table_header->children) as $num => $th_row) {
            //  echo "FIRST foreach<br>";
            //var_dump($th_row->plaintext);
            //var_dump($num);
            switch(trim($th_row->plaintext)){
                case 'Наименование':
               // case 'Наименование светильника':
                    $numer["name"] = $num;
                    break;
                case 'Фото':
                case 'Изображение':
                    $numer['image'] = $num;
                    break;
                case 'Артикул':
                    $numer['model'] = $num;
                    break;
            }

        }

Dump $numer at this point and see whether every column was matched correctly.
Then move on to the loop foreach ($arr_of_tl as $num => $value) {...}
Inside it, check the data as well, for example $num and $value.
In general, you shouldn't reinvent constructs like these; look at ready-made packages and use them. Installing Composer and hooking up its autoloader takes a couple of minutes.
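
One thing such a dump may reveal, offered as a guess rather than a diagnosis: if a header label never matches (a different page encoding or a stray non-breaking space would be enough), its key in $numer stays unset. PHP's switch compares loosely, an unset key evaluates to null, and null loosely equals 0, so the cell with index 0 falls into the first case whose key is missing. The sketch below reproduces that effect; find_case and the sample values are hypothetical stand-ins for the switch in the question.

```php
<?php
// Hypothetical stand-in for the switch in the question:
// returns the name of the first case that matches $cellIndex.
function find_case(int $cellIndex, array $numer): string
{
    // Missing keys become null, as an undefined index would.
    $image = $numer['image'] ?? null;
    $model = $numer['model'] ?? null;
    $name  = $numer['name'] ?? null;

    switch ($cellIndex) {   // switch uses loose (==) comparison
        case $image: return 'image';
        case $model: return 'model';
        case $name:  return 'name';
        default:     return 'none';
    }
}

// Assumed situation: 'Артикул' was never matched, so 'model' is missing,
// and the name column is the first one in the table.
$numer = ['image' => '-1', 'name' => 0];

var_dump(find_case(0, $numer));          // loose match against null
var_dump(array_search(0, $numer, true)); // strict lookup by value
```

With these sample values the loose switch sends cell 0 to the model case even though it is the name column, while the strict array_search finds the name key. Checking isset($numer['model']) before the row loop would surface the problem earlier.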
