Why is BeautifulSoup returning an empty list?

Python

Yevgeni, 2018-05-21 21:35:30

When I ran the same steps procedurally, outside a class, the links were extracted without any problems.
As soon as I moved the code into a class, it stopped working:
find_all returns an empty list.
Before that I had parsed 200 pages without any trouble, and then this happened out of the blue :(
P.S. soup.prettify() prints the page's HTML, so the problem seems to be in the find_all call.

import requests
from bs4 import BeautifulSoup
import json
from datetime import datetime, timedelta
from textwrap import shorten


class SSLV:
    def __init__(self):
        self.root_url = r'https://ss.lv'
        self.vacancies_url = r'https://www.ss.com/ru/work/are-required/filter/page1.html'

    def make_request(self, url):
        r = requests.get(url)
        html = r.content
        return html

    def make_soup(self, html_code):
        soup = BeautifulSoup(html_code, 'html.parser')
        return soup

    def calculate_pagination(self, url):
        pass

    def find_links(self):
        # total_pages = self.calculate_pagination(self.vacancies_url)
        # for p in range(total_pages):
        #     pass
        html = self.make_request(self.vacancies_url)
        soup = self.make_soup(html)
        # print(soup.prettify())
        links = soup.find_all('a', class_='am')
        print(links)


ss = SSLV()
ss.find_links()
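
One way to check whether the problem really is in find_all is to list which classes the <a> tags in the fetched HTML actually carry. The sketch below is only an illustration: the helper name anchor_class_counts and the sample markup are made up, and in the real script the HTML would come from make_request(). If 'am' is missing from the counts, the served page simply has no such links and find_all is behaving correctly.

```python
from collections import Counter

from bs4 import BeautifulSoup


def anchor_class_counts(html):
    """Count how often each CSS class appears on <a> tags in the given HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    counts = Counter()
    for a in soup.find_all('a'):
        # 'class' is a multi-valued attribute, so bs4 returns it as a list
        counts.update(a.get('class', []))
    return counts


# Hypothetical sample markup, standing in for r.content from the real page
sample = (
    '<a class="am" href="/msg/1.html">Job 1</a>'
    '<a class="a1" href="/x">Other</a>'
    '<a href="/y">No class</a>'
)
counts = anchor_class_counts(sample)
print(counts)
```

Running the same helper on the HTML returned for vacancies_url would show at a glance whether class 'am' is present there at all.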

2 answers

aderes, 2018-05-22
@de-iure

There is no find_all() method... where does it come from?
Or should it be findall() here?
This works:

import re
# Non-greedy pattern, so each <a>...</a> element is matched separately
links = re.findall(r'<a\b[^>]*>.*?</a>', str(soup), flags=re.DOTALL)
print(links)

lug32, 2018-05-31
@lug32

Pass the class as an attribute dictionary: soup.find_all('a', {'class': 'am'})
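
For reference, find_all() is a documented Beautiful Soup 4 method (findAll is the older spelling), and the keyword form class_='am' and the dictionary form {'class': 'am'} select the same tags. The sketch below shows this on a small, made-up HTML snippet, not on the real ss.com page. Since the two forms are equivalent, switching between them is not expected to fix an empty result on its own.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only
html = (
    '<div>'
    '<a class="am" href="/msg/1.html">Job 1</a>'
    '<a class="am top" href="/msg/2.html">Job 2</a>'
    '<a href="/about">About</a>'
    '</div>'
)
soup = BeautifulSoup(html, 'html.parser')

# Keyword form: class_ is used because "class" is a reserved word in Python
by_keyword = soup.find_all('a', class_='am')
# Dictionary form: the second positional argument is the attrs filter
by_attrs = soup.find_all('a', {'class': 'am'})

# Both forms match any tag whose class list contains 'am'
hrefs = [a['href'] for a in by_keyword]
print(hrefs)
```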