How to handle AttributeError: 'NoneType' object has no attribute 'text'?


Python

Bjornie, 2016-11-28 14:36:45

While parsing, the error "AttributeError: 'NoneType' object has no attribute 'text'" is thrown from time to time, because the parser cannot find the desired element by its selector and therefore cannot read its text.
I get the data like this:
fax = soup.find(id="ctl0_left_fax")
Then I add it to the array (this is where the script fails): 'fax': fax.text.strip(),
I tried adding a check:

# Check: if the page TITLE is empty, there is nothing to parse. I want to skip further parsing
if (len(soup.title.text.strip()) == 15) or (soup.title.text.strip() == 'testtesttest -'):
    exit
# If the page TITLE is <16 (so there is some content), look at the text of the selectors
else:
    list = {
        'cap': cap.text.strip(),
        'fax': fax.text.strip(),
        'email': email.text.strip(),
    }

But for some reason the check is ignored (although I tested it separately and it works).
I should add that I implemented a similar algorithm in PHP and everything worked.
Moreover, when adding to the array, I checked the returned data type right inside the array literal with a ternary operator, but this does not work in Python.

2 answer(s)

Roman Kitaev, 2016-11-28
@deliro

BeautifulSoup is designed so that find() returns None when it does not find an element, rather than raising an exception or returning a dummy element, so every element you look up needs to be checked. That is:

fax = soup.find(id="ctl0_left_fax")
if fax:
    another_element = fax.find(class_='some_class')
    if another_element:
        another_one = another_element.find(class_='some_another_class')
        if another_one:
            do_something()

Or a hacky trick that avoids the nested branching:

fax = soup.find(id="ctl0_left_fax")
another_element = fax and fax.find(class_='some_class')
another_one = another_element and another_element.find(class_='some_another_class')
if another_one:
    do_something()
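
For the dictionary from the question, the same check can be written inline with Python's conditional expression (its counterpart to the PHP ternary operator). Below is a minimal sketch, not a definitive implementation: text_or_default is a hypothetical helper, not part of BeautifulSoup, and the SimpleNamespace object is only a stand-in for a tag returned by soup.find(...), so that the snippet runs without any third-party package.

```python
from types import SimpleNamespace


def text_or_default(element, default=""):
    """Return the stripped text of element, or default when element is None."""
    # Hypothetical helper: "X if cond else Y" is Python's conditional expression.
    return element.text.strip() if element is not None else default


# Stand-ins for the results of soup.find(...): one found tag, one missing tag.
fax = SimpleNamespace(text="  +1 555 0100 ")
email = None

row = {
    "fax": text_or_default(fax),
    "email": text_or_default(email),
}
print(row)
```

In a real script the arguments would be the values returned by soup.find(...), so a missing element yields the default instead of raising AttributeError.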

There is only one piece of advice here: do not use BeautifulSoup.

kukharev_a, 2020-03-15
@kukharev_a

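Handle exceptions:

try:
    fax_text = fax.text.strip()
except AttributeError:
    # fax is None because the element was not found
    fax_text = None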
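
A slightly fuller sketch of the exception-based approach, under the assumption that a missing element should simply produce None. safe_text is a hypothetical name, and SimpleNamespace again stands in for a tag returned by soup.find(...). Catching only AttributeError, rather than using a bare except, keeps unrelated errors visible.

```python
from types import SimpleNamespace


def safe_text(element):
    """Return element.text stripped, or None if the element was not found."""
    try:
        return element.text.strip()
    except AttributeError:
        # element is None (nothing matched the selector), so it has no .text
        return None


# Stand-ins for the results of soup.find(...): one found tag, one missing tag.
found = SimpleNamespace(text=" 123-45-67 ")
missing = None

print(safe_text(found))
print(safe_text(missing))
```

The trade-off is that an AttributeError caused by a typo in the attribute name would be swallowed as well, which is one reason the explicit None check may be preferable.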