Python
maximkalga, 2015-02-06 11:42:57

Scrapy - how to collect data from all products on the page one by one?

I need to scrape the price and name of each product on a catalog page, and then move on to the next pagination page.
So far, all prices end up in a single list in one item, and so do the names. If I use TakeFirst(), only one product gets scraped from each page.
How do I iterate over the products on a page? And how do I set up the transition to the next page once all products have been processed?
def parse(self, response):
    sel = HtmlXPathSelector(response)
    item = PromItem()
    item['name'] = sel.select('//a[@class="b-product-gallery__product-name-link"]/span/text()').extract()
    item['price'] = sel.select('//div[@itemprop="price"]/span[2]/text()').extract()
    item['url'] = sel.select('//a[@class="b-product-gallery__product-name-link"]/@href').extract()
    return item


1 answer(s)
egorsmkv, 2015-02-06
@egorsmkv

First, select all the product containers on the page with an XPath query.
For example, here the query would look like this: //div[@class='bOneTile inline'].
Then iterate over them in a for loop and, for each element, select the price and name with relative XPath queries:

# name
.//a[@class='jsUpdateLink bOneTile_link']
# price
.//span[@class='eOzonPrice_main']

To go to the next page, find its link on the current page, request it, and parse the product data again with the same callback.
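The steps above can be sketched end to end. This is a minimal illustration using only the standard-library xml.etree, with made-up class names and HTML; in a real Scrapy spider you would do the same thing with response.xpath('//div[...]') for the containers, relative './/' queries inside the loop, and a Request yielded for the next-page link.

```python
import xml.etree.ElementTree as ET

# Made-up catalog markup, just to demonstrate the iteration pattern.
HTML = """<root>
  <div class="product">
    <a class="name" href="/p/1"><span>Widget A</span></a>
    <span class="price">10.50</span>
  </div>
  <div class="product">
    <a class="name" href="/p/2"><span>Widget B</span></a>
    <span class="price">7.25</span>
  </div>
  <a class="next" href="/catalog?page=2">next</a>
</root>"""

def parse_catalog(html):
    tree = ET.fromstring(html)
    items = []
    # Step 1: select every product container first.
    for product in tree.findall(".//div[@class='product']"):
        # Step 2: run *relative* queries inside each container so the
        # name and price stay paired per product (in Scrapy: './/...').
        name = product.find("a[@class='name']/span").text
        price = product.find("span[@class='price']").text
        url = product.find("a[@class='name']").get("href")
        items.append({"name": name, "price": price, "url": url})
    # Step 3: pick up the next-page link; in Scrapy you would yield
    # a Request for it with this same parse method as the callback.
    next_el = tree.find(".//a[@class='next']")
    next_page = next_el.get("href") if next_el is not None else None
    return items, next_page

items, next_page = parse_catalog(HTML)
print(items)
print(next_page)
```

The key point is that each name/price pair comes from one container element, so they cannot fall out of sync the way two independent page-wide queries do.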
