Django
Sergey Nizhny Novgorod, 2016-03-08 13:11:01

Why does Python complain about a variable?

Hello everyone.
I am writing a parser in Django.
The logic:
I pass the necessary values through a form, and they are picked up by the parse_one function. It scrapes the elements I need, stores the link to the next element in a global variable, and returns the data. The link to the next element is then picked up by the parse_two function, and so on in a loop.
If I put everything into one file and run that file, everything works. If I put it into a view as shown below, running it produces this error:
UnboundLocalError at /parse
local variable 'agent_client' referenced before assignment
Request Method: POST
Request URL: 127.0.0.1:8000/parse
Django Version: 1.9.2
Exception Type: UnboundLocalError
Exception Value: local variable 'agent_client1' referenced before assignment
Exception Location: E:\jivofolder\marketing\views.py in parse_two, line 90
\marketing\views.py in parse_two
"agent_client" : agent_client,
That is, the system complains about news.append in parse_two.


# Constants
LOGIN_URL = "Login URL"
URL = "Site URL"

# Parser for the first page
def parse_one(USERNAME, PASSWORD, dialogue_url):

    session_requests = requests.session()

    # Create payload
    payload = {
        "email": USERNAME,
        "password": PASSWORD
    }

    # Perform login
    result = session_requests.post(LOGIN_URL, data = payload, headers = dict(referer = LOGIN_URL))

    # Scrape journal_url
    result = session_requests.get(URL, headers = dict(referer = URL))
    soup = BeautifulSoup(result.content)
    g_data = soup.find_all('a', class_ = 'icon note')[0].get('href')
    journal_url = "Element URL" + g_data

    # Scrape journal
    result2 = session_requests.get(dialogue_url, headers = dict(referer = dialogue_url))
    soup2 = BeautifulSoup(result2.content)
    try:
        agent_client = soup2.find_all('h2')[0].text
    except:
        pass
    try:
        information = soup2.find_all('div', class_ = "content block")[0].text
    except:
        pass
    try:
        dialogue_log = soup2.find_all('table', class_ = "table zeropadding transparent")[0].text
    except:
        pass
    news = []
    news.append({
            "agent_client" : agent_client,
            "information" : information,
            "dialogue_log" : dialogue_log,
        })
    next_dialogue_clear = soup2.find_all('a', id ="next")[0].get('href')
    global next_dialogue
    next_dialogue = "Next element URL" + next_dialogue_clear
    return news

# Parser for the following pages
def parse_two(USERNAME, PASSWORD):

    session_requests = requests.session()

    # Create payload
    payload = {
        "email": USERNAME,
        "password": PASSWORD
    }


    result2 = session_requests.get(next_dialogue, headers = dict(referer = next_dialogue))
    soup2 = BeautifulSoup(result2.content)
    try:
        agent_client = soup2.find_all('h2')[0].text
    except:
        pass
    try:
        information = soup2.find_all('div', class_ = "content block")[0].text
    except:
        pass
    try:
        dialogue_log = soup2.find_all('table', class_ = "table zeropadding transparent")[0].text
    except:
        pass
    news = []
    news.append({
            "agent_client" : agent_client,
            "information" : information,
            # "dialogue_log" : dialogue_log,
        })
    next_dialogue_clear = soup2.find_all('a', id ="next")[0].get('href')
    next_dialogue = "Next element URL" + next_dialogue_clear
    global next_dialogue
    return news

######################################

# Function that Django calls

def parse(request):
    done = csrf(request)
    if request.POST:
        USERNAME = request.POST.get('logins', '')
        PASSWORD = request.POST.get('password', '')
        dialogue_url = request.POST.get('links', '')
        total_pages = int(request.POST.get('numbers', ''))
        news = []
        news.extend(parse_one(USERNAME, PASSWORD, dialogue_url))
        for page in range(2, total_pages + 1): # this is where the problem is; if it is commented out, parse_one works fine.
            news.extend(parse_two(USERNAME, PASSWORD))
        contex = {
                    "news" : news,
                }
        done.update(contex)
        return render(request, 'marketing/parser.html', done)

P.S. Please don't judge the copy-paste too harshly: for now I need to get the functionality working, and I will clean it up afterwards.


3 answer(s)
nirvimel, 2016-03-08
@nirvimel

try:
    some_variable = some_function()
except:
    pass

Never do that! First, except should catch a specific exception, not everything indiscriminately. Second, if some_variable is first assigned inside the try block (it had no value before), then instead of pass the except branch must set some_variable to some default value. For example:
try:
    some_variable = some_function()
except some_exception:
    some_variable = ""  # if some_function() should return a string
    # some_variable = 0  # if some_function() should return an integer
    # some_variable = []  # if some_function() should return a list
    # some_variable = None  # if some_function() may return None
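
Applied to the code in the question, the failure can be reproduced without any network access. This is only a minimal sketch: first_h2_text is a hypothetical stand-in for soup2.find_all('h2')[0].text, and an empty list plays the role of a page with no h2 element. The lookup raises IndexError, the bare except swallows it, and the name is never bound:

```python
def first_h2_text(headings):
    # Hypothetical stand-in for soup2.find_all('h2')[0].text
    return headings[0]


def parse_broken(headings):
    try:
        agent_client = first_h2_text(headings)
    except:  # swallows the IndexError and leaves agent_client unbound
        pass
    return {"agent_client": agent_client}


def parse_fixed(headings):
    try:
        agent_client = first_h2_text(headings)
    except IndexError:  # only the error expected when no <h2> is found
        agent_client = ""  # default value, so the name is always bound
    return {"agent_client": agent_client}


# An empty list simulates a page without any <h2> element.
try:
    parse_broken([])
except UnboundLocalError as exc:
    broken_error = str(exc)

fixed_result = parse_fixed([])
```

One plausible reason the first page works while later ones fail: parse_two opens a new session and builds the payload but never posts it to LOGIN_URL, so the page it fetches may simply not contain the expected elements.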

sim3x, 2016-03-08
@sim3x

Parsing takes an unpredictable amount of time, so it should not sit in the view.
Do it as described in https://docs.djangoproject.com/en/1.9/howto/custom...
and run it from the command line or from inside Django.
The error is most likely a typo in the agent_client variable name.
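
Whichever entry point is chosen, it helps to first pull the crawl loop out of the view into a plain function. A rough sketch, not the asker's code: crawl and fetch_page are hypothetical names, and each call returns the next link instead of storing it in the global next_dialogue. Fake pages stand in for real HTTP requests here:

```python
def crawl(first_url, total_pages, fetch_page):
    """Collect up to total_pages records, following each page's next link.

    fetch_page is a hypothetical callable: url -> (record, next_url or None).
    """
    news = []
    url = first_url
    for _ in range(total_pages):
        if url is None:  # no further pages to follow
            break
        record, url = fetch_page(url)
        news.append(record)
    return news


# Fake pages standing in for real HTTP responses, for illustration only.
FAKE_SITE = {
    "/d/1": ({"agent_client": "A"}, "/d/2"),
    "/d/2": ({"agent_client": "B"}, "/d/3"),
    "/d/3": ({"agent_client": "C"}, None),
}


def fake_fetch(url):
    return FAKE_SITE[url]


result = crawl("/d/1", 2, fake_fetch)
```

The command-line entry point, a background job, or the view itself could then call crawl with a real fetch function built on a single logged-in requests session.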

Bulat Kurbangaliev, 2016-03-08
@ilov3

I agree with sim3x about the response time, but it's better to use a queue, for example, https://github.com/ui/django-rq
