A
A
arsenaljek2019-02-07 06:35:14
PHP
arsenaljek, 2019-02-07 06:35:14

Writing large xml to MYSQL?

Help with a problem.
There are xml directories. 14 pieces. Their total volume is 1.4GB. I ship items from there. The structure is plain YML.
The script downloads them to the site, then iterates over each file in a foreach loop.
And since the script also has conditions that if the current price that is being parsed differs from the price for yesterday, then a recording is in progress, otherwise we do not record. That script execution takes an unrealistically long time. I'm not even sure that he manages to go through everything at all, because the limitations of the script are limited to 10 hours.

$dir = "xml/";
$files = scandir($dir);
$files = array_diff(scandir($dir), array('..', '.')); 
 foreach ($files as $key => $value) { 
   $z->open('xml/'.$value);
  $doc = new DOMDocument;
  while ($z->read() && $z->name !== 'offer');
  while ($z->name === 'offer') {
    $node = simplexml_import_dom($doc->importNode($z->expand(), true));
    $id = mysqli_real_escape_string($link6,$node['id']);
    $url = mysqli_real_escape_string($link6,$node->url);
    $price = $node->price;
    $sql = 'INSERT INTO products (`id`,`url`,`date`) VALUES ("'.$id.'","'.htmlspecialchars(urldecode($url)).'","'.$today.'")';
  $result = mysqli_query($link6,$sql);
  $new_price = (int)$price;

  $sql = 'SELECT * FROM prices WHERE id = '.$id.'  ORDER BY date DESC ';
  $result = mysqli_query($link6,$sql);
  $row = mysqli_fetch_array($result);
  if ($row['id'] == $id AND $row['date'] == $today_new OR $row['price_product'] == $price) {
    // echo 'все совпало</br>';
    $articles1[] = $row['id'];
  } else {
    $articles2[] = $row['id'];
    // echo 'несовпало</br>';
    if ($row['price_product'] == $new_price) {
      // echo 'не пишем дата сегоднящная уже записана';
    } else {
      $sql = 'INSERT INTO prices (`id`, `price_product`, `date`) VALUES ("'.$id.'","'.(int)$price.'","'.$today.'")';
      $result = mysqli_query($link6,$sql);
    }
  }
  $z->next('offer');
  }
 }

Can you suggest how to optimize. Increasing the speed of writing to the database? And is it right to do what I do?

Answer the question

In order to leave comments, you need to log in

1 answer(s)
R
Rsa97, 2019-02-07
@arsenaljek

Buzz. Making a selection in a loop for each row, and even generating a query each time, is probably the worst possible option.
Add a `price` column to the `products` table. You hang triggers on this table for inserting and changing a row, which, when adding or changing a price, enter it into the `prices` table. You do not work directly with the `prices` table from this script.
In a cycle, you form and accumulate data sets for insertion. As soon as a certain amount accumulates, insert them in one request. Use ON DUPLICATE KEY UPDATE to overwrite changed values. At the end of the cycle, unload the remaining sets.

More or less like this
$data = [];
while ($z->name === 'offer') {
  ...
  $data[] = '("'.$id.'","'.htmlspecialchars(urldecode($url)).'","'.$today.'",'.(int)$price.')';
  if (count($data) > 99) {
    $sql = 'INSERT INTO `products` (`id`,`url`,`date`,`price`) VALUES ' 
         . implode(',', $data)
         . 'ON DUPLICATE KEY UPDATE `url` = VALUES(`url`), `price` = VALUES(`price`)';
    mysqli_query($link6,$sql);
    $data = [];
  }
  ...
}
if (count($data) > 0) {
  $sql = 'INSERT INTO `products` (`id`,`url`,`date`,`price`) VALUES ' 
       . implode(',', $data)
       . 'ON DUPLICATE KEY UPDATE `url` = VALUES(`url`), `price` = VALUES(`price`)';
  mysqli_query($link6,$sql);
}

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question