Python
VitaliySm, 2015-02-10 21:57:39

How to redirect a spider to an external resource?

I have a spider that parses a page with information, and that page contains a URL to an external resource. I need the spider to follow that URL and extract some information from the external page.
Here's what I have now:
def parse_start_url(self, response):
    url = response.xpath("....xpath....").extract()
    if shop_url:
        yield Request(url + 'Additional address part', callback=self.parse)

def parse(self, response):
    sel = HtmlXPathSelector(response)
    l = TestLoader(TestItem(), sel)
    l.add_xpath('test', "......xpath... .")
    return l.load_item()


1 answer(s)
VitaliySm, 2015-02-11

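Solved it like this:
def parse_start_url(self, response):
    # extract() returns a list of matches, so the first element is used below
    url = response.xpath("__xpath___").extract()
    # "%url_part" stands in for the real format string (presumably one containing %s)
    yield Request("%url_part" % url[0], callback=self.parse_url)

def parse_url(self, response):
    # Separate callback for the external page; a CrawlSpider uses parse
    # internally, so parse itself should not be overridden
    item = TestItem()
    item['telephone'] = response.xpath('__xpath____').extract()
    return item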
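
One thing the snippet above still leaves open is what happens when the XPath matches nothing or returns a relative link. Below is a minimal sketch of a helper that covers those cases, using only the standard library. The name build_follow_url, the suffix parameter, and the example URLs are hypothetical and are not part of Scrapy or of the original spider.

```python
from urllib.parse import urljoin


def build_follow_url(page_url, matches, suffix=""):
    """Return an absolute URL built from the first XPath match, or None.

    page_url: URL of the page that was parsed (response.url in a spider).
    matches:  list of strings, as returned by .extract().
    suffix:   hypothetical extra address part appended to the extracted link.
    """
    # .extract() returns a list even for a single match, so guard against
    # an empty result before indexing.
    if not matches:
        return None
    href = matches[0].strip()
    if not href:
        return None
    # urljoin keeps absolute links as they are and resolves relative ones
    # against the page that was parsed.
    return urljoin(page_url, href + suffix)


# Absolute external link with an extra address part appended.
print(build_follow_url("http://example.com/shops/1",
                       ["http://shop.example.org/"], "contacts"))
# Relative link resolved against the parsed page.
print(build_follow_url("http://example.com/shops/1", ["/about"]))
# No match: nothing to follow.
print(build_follow_url("http://example.com/shops/1", []))
```

In the spider, the result would be passed to Request(..., callback=self.parse_url) only when it is not None, so a page without the link yields no request instead of raising an IndexError.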
