How to make an automatic parser?
There is this code:
import telebot
import config
from time import sleep
from bs4 import BeautifulSoup
import requests

bot = telebot.TeleBot(config.token)

@bot.message_handler(commands = ['start'])
def start(message):
    html = requests.get("https://www.rbc.ru/short_news")
    soup = BeautifulSoup(html.text, 'lxml')
    title = soup.find('span', class_ = 'item__title-wrap')
    href = soup.find('div', class_ = 'item__wrap l-col-center')
    while html.status_code == 200:
        for t in title.find_all('span', class_ = 'item__title rm-cm-item-text')[:1]:
            answer_title = t.text.strip()
            print(answer_title)
        for h in href.find_all('a', class_ = 'item__link')[:1]:
            answer_href = h.get('href')
            print(answer_href)
        bot.send_message(message.chat.id, f'{answer_title}\n\n{answer_href}')
        sleep(5)

if __name__ == '__main__':
    bot.polling(none_stop = True)
1. How do I parse not the latest news item but an arbitrary one (for example, the penultimate one)?
2. How do I check for new items, so that the program notices when a new one is published and parses it immediately?
I also noticed that the timer keeps parsing the same news item. That is, while the program is running it parses one item, and after each interval it parses that same item again, even if newer items have appeared on the site, until I restart the program.
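One possible approach to both questions, as a rough sketch rather than a definitive solution. The repeated item most likely comes from the page being requested once, before the `while` loop, so every pass re-reads the same HTML. The sketch below fetches inside the loop, picks an item by index, and remembers which links were already sent. The helper names (`fetch_items`, `pick_item`, `unseen_items`, `watch`) are hypothetical, and the assumption that each `a.item__link` wraps its `span.item__title` and that the page lists newest items first is inferred from the selectors in the question, not verified against the site.

```python
import time


def fetch_items(url="https://www.rbc.ru/short_news"):
    """Download the page and return (title, link) pairs in page order."""
    # Third-party imports live here so the helpers below stay usable offline.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "lxml")
    items = []
    # Assumption: each news entry is an <a class="item__link"> wrapping a
    # <span class="item__title ...">, as the selectors in the question suggest.
    for link in soup.find_all("a", class_="item__link"):
        title = link.find("span", class_="item__title")
        if title is not None and link.get("href"):
            items.append((title.get_text(strip=True), link["href"]))
    return items


def pick_item(items, index):
    """Question 1: return the item at `index` (0 = newest, 1 = the one before it)."""
    return items[index] if 0 <= index < len(items) else None


def unseen_items(items, seen_links):
    """Question 2: return items whose link is not yet in `seen_links`,
    oldest first, and record their links as seen."""
    fresh = [item for item in items if item[1] not in seen_links]
    seen_links.update(link for _, link in fresh)
    return list(reversed(fresh))


def watch(send, fetch=fetch_items, interval=60, rounds=None):
    """Re-fetch the page every `interval` seconds and pass new items to `send`."""
    seen_links = set()
    first_pass = True
    done = 0
    while rounds is None or done < rounds:
        items = fetch()  # fetched inside the loop, so each pass sees fresh data
        fresh = unseen_items(items, seen_links)
        if first_pass:
            fresh = fresh[-1:]  # on startup, send only the newest item
            first_pass = False
        for title, link in fresh:
            send(f"{title}\n\n{link}")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Inside the `start` handler this could be called as `watch(lambda text: bot.send_message(message.chat.id, text))`. Note that an endless loop blocks the handler, so running it in a separate `threading.Thread` is probably worth considering. A polling interval much shorter than a minute is unlikely to help and may get the bot rate-limited by the site.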