How to redirect a spider to an external resource?
I have a spider that parses a page with information, and that page contains a URL pointing to an external resource. I need the spider to follow that URL and extract some information from the external page.
Here's what I have now:
def parse_start_url(self, response):
    url = response.xpath("....xpath....").extract()
    if shop_url:
        yield Request(url + 'Additional address part', callback=self.parse)

def parse(self, response):
    sel = HtmlXPathSelector(response)
    l = TestLoader(TestItem(), sel)
    l.add_xpath('test', "......xpath... .")
    return l.load_item()
Solved it like this:
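def parse_start_url(self, response):
    # extract() returns a list of matches, so the first one is taken below
    url = response.xpath("__xpath___").extract()
    # "%url_part" is a placeholder for the real URL format string
    yield Request("%url_part" % url[0], callback=self.parse_url)

def parse_url(self, response):
    # Runs on the external page once the request above completes
    item = TestItem()
    item['telephone'] = response.xpath('__xpath____').extract()
    return item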
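
As far as I can tell, the first version failed for two reasons: `shop_url` is never defined, and `extract()` returns a list, so `url + '...'` tries to add a string to a list and raises a TypeError. Taking `url[0]` fixes the second problem, but it raises an IndexError when the XPath matches nothing. Below is a minimal sketch of that step in isolation, using only the standard library so it runs without Scrapy. The helper name `build_follow_url`, the example.com addresses, and the `/contacts` suffix are made up for illustration. Inside a spider, `response.urljoin()` should play a similar role to `urljoin` here.

```python
from urllib.parse import urljoin


def build_follow_url(extracted, base_url, suffix=""):
    """Build an absolute URL from an XPath extraction result.

    `extracted` stands in for the list returned by `.extract()`.
    Returns None when the list is empty, so the caller can skip the request.
    """
    if not extracted:  # the XPath matched nothing
        return None
    href = extracted[0].strip()  # use the first match, not the whole list
    # urljoin resolves relative links against the page they were found on
    return urljoin(base_url, href) + suffix


# Hypothetical values, for illustration only
print(build_follow_url(["/shop/42"], "http://example.com/list", "/contacts"))
print(build_follow_url([], "http://example.com/list"))
```

In the spider you would yield the Request only when the helper returns something other than None.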