How to make scrapy crawl a page every hour?
Hello ladies and gentlemen.
There was a need to write a Telegram bot that parses freelance orders for Python.
I took two libraries, scrapy and telebot; they exchange data through a JSON file.
The idea is this:
- the user sends the start command, and the bot replies that it will now send information on orders;
- the bot then reads the list of data from the JSON file, which in turn was written by scrapy.
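For context, here is a minimal sketch of the bot side, assuming pyTelegramBotAPI (telebot) and the items.json file written by the spider below; the bot token and message text are placeholders, not taken from the original post:

    import json
    import telebot

    BOT_TOKEN = "YOUR_TOKEN_HERE"  # placeholder, replace with the real token
    bot = telebot.TeleBot(BOT_TOKEN)

    @bot.message_handler(commands=['start'])
    def send_orders(message):
        bot.send_message(message.chat.id, "Now I will send you information on orders.")
        # read the file that the scrapy spider writes (adjust the path to where the bot runs)
        with open("items.json", 'r') as f_obj:
            orders = json.load(f_obj)
        bot.send_message(message.chat.id,
                         "{}\n{}\n{}".format(orders["title"], orders["text"], orders["price"]))

    bot.polling()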
The question is: how do I get scrapy to crawl the page every hour?
I know that it is possible to use time.sleep(3600), but it needs to run as an infinite loop.
I tried:
while True:
    time.sleep(2 * 60)
    MainSpidner()
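Calling MainSpidner() like this only creates the spider object; it never starts a crawl, and a finished Scrapy process cannot simply be restarted inside the same loop. A minimal sketch of one workaround (not from the original post) is to launch the crawl as a separate process on each iteration, assuming the spider lives in a regular Scrapy project so that "scrapy crawl FLforPython" works from the project directory:

    import subprocess
    import time

    while True:
        subprocess.run(["scrapy", "crawl", "FLforPython"])  # run one full crawl to completion
        time.sleep(3600)  # wait an hour before the next run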
The spider itself:

import scrapy
import json
import time


class MainSpidner(scrapy.Spider):
    '''Spider that scrapes Python orders from freelance.ru'''
    name = 'FLforPython'

    def start_requests(self):
        URLS = ["https://freelance.ru/projects/?cat=4&spec=446"]
        for URL in URLS:
            yield scrapy.Request(url=URL, callback=self.parse)

    def parse(self, response):
        titleLists = []  # list of order titles
        textList = []    # list of order descriptions
        priceList = []   # list of order prices
        for item in response.css("div.p_title h2 a.ptitle span::text").getall():
            titleLists.append(item)
        titleLists = list(dict.fromkeys(titleLists))  # drop duplicate titles, keep order
        for item in response.css("a.descr p span::text").getall():
            textList.append(item)
        for item in response.css("a.descr p span b::text").getall():
            priceList.append(item)
        # getall() returns a (possibly empty) list, never None, so check for content
        if titleLists and textList and priceList:
            orders = {
                "title": "",
                "text": "",
                "price": ""
            }
            for title in titleLists[:1]:
                orders["title"] = title
            for text in textList[:1]:
                orders["text"] = text
            for price in priceList[:1]:
                orders["price"] = price
            # save the information to JSON
            with open("../../items.json", 'w') as f_obj:
                json.dump(orders, f_obj)
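Another option that stays inside a single Python process is Scrapy's CrawlerRunner combined with Twisted's LoopingCall. This is only a sketch: it assumes the script is started from the Scrapy project so that get_project_settings() picks up the settings, and the import path for MainSpidner is hypothetical and has to match where the spider above actually lives.

    from twisted.internet import reactor, task
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings
    from FLparser.spiders.flforpython import MainSpidner  # hypothetical module path, adjust to your project

    configure_logging()
    runner = CrawlerRunner(get_project_settings())

    def crawl():
        # schedule a new crawl inside the already-running Twisted reactor
        runner.crawl(MainSpidner)

    task.LoopingCall(crawl).start(3600)  # run now, then once an hour
    reactor.run()

The advantage over the plain while True loop is that the reactor keeps running between crawls, so the spider can be scheduled again without restarting the process.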