How can I quickly parse more than 1000 images from a website?
I need to parse more than 1000 images from a site. At the moment I'm using Simple HTML DOM, which can't parse the site at all. How can this be done, if not with Simple HTML DOM, then perhaps with some other parser?
I recommend the phpQuery library. It doesn't have the glitches that Simple HTML DOM has (I tried both, and I preferred phpQuery).
Links to lessons:
habrahabr.ru/post/69149
i-novice.net/parsim-sajty-s-phpquery
I recently parsed images with this library and it did a very good job. To save a specific image, first use the library to find the image links. I searched the page and put all the links I found into an array. Example code:
// Assumes phpQuery is already loaded, e.g. require 'phpQuery/phpQuery.php'; (the path depends on your setup)
$model_page_html = file_get_contents($page); // Fetch the whole page
$model_page = phpQuery::newDocument($model_page_html); // Build a phpQuery document from the page
$images_link = $model_page->find('img'); // Find all img tags
$images = []; // Initialise the array so the second loop is safe when nothing is found
foreach ($images_link as $image_link) {
    $images[] = pq($image_link)->attr('src'); // Collect each image link in the array
}
foreach ($images as $image) {
    $image_name = basename($image); // File name and extension of the image
    if (!file_exists('img/' . $image_name)) { // Skip images that were already saved
        // file_get_contents($image) downloads the image by its link,
        // file_put_contents stores it in the target folder
        file_put_contents('img/' . $image_name, file_get_contents($image));
    }
}
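The snippet above passes each src value straight to file_get_contents, which only works when the page uses absolute image URLs. Here is a minimal sketch of resolving a src against the page URL first. The helper name resolve_image_url is hypothetical (it is not part of phpQuery), and it covers only the common cases; it does not normalise "../" segments or handle data: URIs.

```php
<?php
// Hypothetical helper: turn an <img src> value into an absolute URL,
// using the URL of the page it was found on.
function resolve_image_url(string $pageUrl, string $src): string
{
    $src = trim($src);

    // Already absolute: http://... or https://...
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }

    $parts  = parse_url($pageUrl);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    if (isset($parts['port'])) {
        $host .= ':' . $parts['port'];
    }

    // Protocol-relative: //cdn.example.com/a.jpg
    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src;
    }

    // Root-relative: /images/a.jpg
    if (strpos($src, '/') === 0) {
        return $scheme . '://' . $host . $src;
    }

    // Path-relative: images/a.jpg, resolved against the page's directory
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, (int) strrpos($path, '/') + 1);

    return $scheme . '://' . $host . $dir . $src;
}
```

In the loop above, this would be called as resolve_image_url($page, pq($image_link)->attr('src')) before the link is added to the array.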
Simple HTML DOM is an excellent library and very easy to use; it works much like jQuery or CSS selectors.
Below is code that prints the image links from a product page on the website of the distributor Merlion, followed by its output:
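<?php
include 'simple_html_dom.php'; // Simple HTML DOM library; adjust the path to your copy
$simple = file_get_html('http://merlion.com/catalog/product/966656');
// Each thumbnail link in the gallery points to an image file
foreach ($simple->find('div.ad-thumbs .ad-thumb-list li a') as $el) {
    echo $el->href . '<br>';
}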
http://img.merlion.ru/items/966656_v01_m.jpg
http://img.merlion.ru/items/966656_v02_m.jpg
http://img.merlion.ru/items/966656_v03_m.jpg
http://img.merlion.ru/items/966656_v04_m.jpg
http://img.merlion.ru/items/966656_v05_m.jpg
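The code above only prints the links. As a rough sketch, they could be saved with the same file_get_contents / file_put_contents pattern as in the first answer. The function name save_images and its $fetch parameter are hypothetical; $fetch exists only so the function can be tried without network access.

```php
<?php
// Hypothetical helper: save each URL into $dir, skipping duplicates and
// files that already exist. Returns the number of files written.
function save_images(array $urls, string $dir, ?callable $fetch = null): int
{
    // Default downloader is file_get_contents, as in the first answer
    $fetch = $fetch ?? 'file_get_contents';

    if (!is_dir($dir)) {
        mkdir($dir, 0777, true);
    }

    $saved = 0;
    foreach (array_unique($urls) as $url) {
        // Take the file name from the URL path, ignoring any query string
        $name = basename((string) parse_url($url, PHP_URL_PATH));
        if ($name === '') {
            continue; // no usable file name in this URL
        }

        $target = rtrim($dir, '/') . '/' . $name;
        if (file_exists($target)) {
            continue; // already downloaded
        }

        $data = $fetch($url);
        if ($data === false) {
            continue; // download failed; move on to the next image
        }

        file_put_contents($target, $data);
        $saved++;
    }

    return $saved;
}
```

With the loop above, the links would be collected into an array ($links[] = $el->href;) instead of echoed, and then passed on as save_images($links, 'img').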