Where should I insert a Python timer to make it work?
I think I found a solution for getting the desired HTML page. Previously, Scrapy was getting empty HTML.
That is, now, before collecting data from the page, Scrapy will wait 5 seconds (during that time the JS code will have time to request the required HTML), and only then will the necessary information be collected.
Please help me figure out how to insert the timer.
# -*- coding: utf-8 -*-
import scrapy
from threading import Timer


class ExampleSpider(scrapy.Spider):
    name = 'bc'
    start_urls = [
        'https://www.greatcircus.ru/',
    ]

    def parse(self, response):
        for ticket in response.css('.col-xs-12 schedule-main-tickets-container'):
            event_name = ticket.css('.schedule-main-tickets-show-title::text').extract(),
            place = ticket.css('.schedule-main-tickets-location::text').extract(),
            url = ticket.css('.text-center a::text').extract(),
            yield {
                'event_name': event_name,
                'place': place,
                'url': url,
            }

    # The timer is started at class level and is given the plain parse function,
    # which is what triggers the TypeError below.
    t = Timer(5.0, parse)
    t.start()
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\programs for work\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "C:\programs for work\lib\threading.py", line 1166, in run
    self.function(*self.args, **self.kwargs)
TypeError: parse() missing 2 required positional arguments: 'self' and 'response'
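For context, the TypeError happens because Timer(5.0, parse) hands the timer the plain function object, so when it fires it calls parse() with no arguments and neither self nor response is supplied. A minimal standalone sketch of how threading.Timer passes arguments (the Demo class and the fake response string are made up for illustration; even with the correct arguments this would not make Scrapy execute JS):

from threading import Timer


class Demo:
    def parse(self, response):
        print("parse called with:", response)


d = Demo()
# Timer calls the given callable with the given args after the delay.
# Passing the bound method d.parse supplies self automatically,
# and args=... supplies the response argument.
t = Timer(5.0, d.parse, args=("fake response",))
t.start()
t.join()  # block until the timer thread has run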
You correctly wrote in the comments that it is useless to wait here: Scrapy will not execute the JS.
As for the code: why do you need a Timer from threading? It seems time.sleep(5) would be enough for your task. But it still won't help if your HTML is modified by JS after the page loads; for that you need Selenium.
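A minimal sketch of the Selenium route, assuming the selenium package and a ChromeDriver binary are installed; the CSS selectors are copied from the question, and the headless option and the fixed 5-second wait are assumptions, not a verified recipe for this site:

# -*- coding: utf-8 -*-
import time

import scrapy
from selenium import webdriver


class ExampleSpider(scrapy.Spider):
    name = 'bc'
    start_urls = [
        'https://www.greatcircus.ru/',
    ]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')
        self.driver = webdriver.Chrome(options=options)

    def parse(self, response):
        # Load the page in a real browser so the JS can build the HTML,
        # then hand the rendered source back to Scrapy's selectors.
        self.driver.get(response.url)
        time.sleep(5)  # crude wait; an explicit WebDriverWait would be more robust
        rendered = scrapy.Selector(text=self.driver.page_source)
        # Selector copied from the question; it may need adjusting to the real markup.
        for ticket in rendered.css('.col-xs-12 schedule-main-tickets-container'):
            yield {
                'event_name': ticket.css('.schedule-main-tickets-show-title::text').extract(),
                'place': ticket.css('.schedule-main-tickets-location::text').extract(),
                'url': ticket.css('.text-center a::text').extract(),
            }

    def closed(self, reason):
        # Shut down the browser when the spider finishes.
        self.driver.quit()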