Why is the while loop not working?

Python

Markus-Zeyfert, 2020-10-29 11:09:48

I need the program to monitor the latest listing on the site. I wrote the following code (experienced folks, please forgive the mess; this is only my second day learning programming).
My idea is that the refrash() function should keep running until the data stored in the new variable is no longer equal to the item variable.
But I don't seem to be doing it right...

import requests
from bs4 import BeautifulSoup

URL = 'https://www.avito.ru/novosibirsk/kvartiry/prodam-ASgBAgICAUSSA8YQ?cd=1&f=ASgBAQICAUSSA8YQAUCQvg0Ulq41&proprofile=1&s=104'
HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36', 'accept': '*/*'}
HOST = 'https://www.avito.ru'

item = ''
new = ''

def get_html(url, params=None):
    r = requests.get(url, headers=HEADERS, params=params)
    return r

def get_content(html):
    soup = BeautifulSoup(html, 'html.parser')
    global item
    item = soup.find('div', class_='item__line')
def parse():
    html = get_html(URL)
    if html.status_code == 200:
        get_content(html.text)
    else:
        print('Error')

parse()

def refrash():
    def get_html(url, params=None):
        r = requests.get(url, headers=HEADERS, params=params)
        return r
    def get_content(html):
        soup = BeautifulSoup(html, 'html.parser')
        global new
        new = soup.find('div', class_='item__line')
    def parse():
        html = get_html(URL)
        if html.status_code == 200:
            get_content(html.text)
        else:
            print('Error')
    parse()

while new== item:
    refrash()

5 answer(s)

MorganDusty, 2020-10-29
@MorganDusty

refrash, you say? :)
PS: in the code, change "refrash" to "refresh".

soremix, 2020-10-29
@SoreMix

It's not clear why you duplicated the functions and nested them inside another function, but oh well.
The code behaves exactly as written: the loop runs only while new == item. The top-level parse() call ran first and wrote the fetched content to the variable item, while the variable new kept its old value, the empty string. As a result, item != new, the loop condition was false on the very first check, and your refrash() was never executed.
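
The sequence described here can be reproduced without any network access. This is only a minimal sketch: the string assigned to item is a made-up stand-in for whatever soup.find() returns, and fake_parse and loop_runs are hypothetical names used purely for illustration.

```python
# Stand-ins for the question's globals; both start as empty strings.
item = ''
new = ''


def fake_parse():
    """Hypothetical replacement for parse(): stores 'page content' in item."""
    global item
    item = '<div class="item__line">first listing</div>'  # made-up content


fake_parse()  # mirrors the top-level parse() call that runs before the loop

loop_runs = 0
while new == item:  # '' compared with a non-empty string is False right away
    loop_runs += 1
    break  # guard so the sketch can never spin forever

print(loop_runs)
```

The counter stays at zero because the condition is already false before the first pass, which is the same reason refrash() is never reached in the question's code.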

Elvis, 2020-10-29
@Dr_Elvis

Doesn't work correctly.
Even BEFORE the code reaches the while loop, item is already non-empty.

Kra1ven, 2020-10-29
@Kra1ven

Because by the time the code gets to the loop, new is no longer equal to item. Also, why do you need the repeated code? Remove the functions that come before "refrash".

Nick, 2020-10-29
@c00re

You don't need to re-declare the get_html, get_content and parse functions inside refrash:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.avito.ru/novosibirsk/kvartiry/prodam-ASgBAgICAUSSA8YQ?cd=1&f=ASgBAQICAUSSA8YQAUCQvg0Ulq41&proprofile=1&s=104'
HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36', 'accept': '*/*'}
HOST = 'https://www.avito.ru'

item = ''
new = ''

def get_html(url, params=None):
    r = requests.get(url, headers=HEADERS, params=params)
    return r

def get_content(html):
    soup = BeautifulSoup(html, 'html.parser')
    item = soup.find('div', class_='item__line')
    return item
def parse():
    html = get_html(URL)
    if html.status_code == 200:
        item = get_content(html.text)
        return item
    else:
        print('Error')

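def refrash():
    # Fetch the page again and return the current first listing.
    return parse()

item = parse()

# Keep polling while the first listing is unchanged. The condition already
# makes one request per pass, so the body does not need to call refrash() again.
while refrash() == item:
    pass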

I also changed a few things here. refrash essentially does the same thing as parse. You can call parse once and then compare its result with what refrash returns (although you can do without refrash altogether):

import requests
from bs4 import BeautifulSoup

URL = 'https://www.avito.ru/novosibirsk/kvartiry/prodam-ASgBAgICAUSSA8YQ?cd=1&f=ASgBAQICAUSSA8YQAUCQvg0Ulq41&proprofile=1&s=104'
HEADERS = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36', 'accept': '*/*'}
HOST = 'https://www.avito.ru'

item = ''
new = ''

def get_html(url, params=None):
    r = requests.get(url, headers=HEADERS, params=params)
    return r

def get_content(html):
    soup = BeautifulSoup(html, 'html.parser')
    item = soup.find('div', class_='item__line')
    return item
def parse():
    html = get_html(URL)
    if html.status_code == 200:
        item = get_content(html.text)
        return item
    else:
        print('Error')

item = parse()

# The condition already fetches the page on every pass, so the body can stay empty.
while parse() == item:
    pass
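
The loops in this answer send requests back to back with no pause, and because parse() returns None on a non-200 response, a failed request ends the loop as if the listing had changed. Below is a hedged sketch of one way to handle both. wait_for_change, its fetch, interval and sleep parameters, and the 60-second default are all assumptions added for illustration, not part of the original answer. The demonstration uses canned values instead of real requests so it runs offline.

```python
import time


def wait_for_change(fetch, interval=60.0, sleep=time.sleep):
    """Block until fetch() returns a value different from the first valid one.

    fetch is any zero-argument callable, for example parse() from the answer.
    A None result (failed request or missing element) is treated as
    "no data" and skipped, not as a change.
    """
    baseline = fetch()
    while baseline is None:  # wait for a valid starting point
        sleep(interval)
        baseline = fetch()

    while True:
        sleep(interval)  # pause between requests
        current = fetch()
        if current is not None and current != baseline:
            return current


# Offline demonstration: canned results stand in for successive page fetches,
# and the sleep function is replaced with a no-op so nothing actually waits.
fake_results = iter(['A', 'A', None, 'B'])
changed = wait_for_change(lambda: next(fake_results), sleep=lambda seconds: None)
print(changed)
```

With the functions from the answer, the call would presumably look like wait_for_change(parse), with an interval chosen to suit the site.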