How to parse a table using BeautifulSoup?

san_m_m, 2020-11-13 11:24:10

Python

I need to extract the table data from this site (https://mintrans.novreg.ru/perm/list.html).

I wrote this code:

import requests
from bs4 import BeautifulSoup  


links = []
for i in range(4):
    r = requests.get(f'https://mintrans.novreg.ru/perm/list%7Bpage-{i}%7D.html')
    soup = BeautifulSoup(r.text, 'html.parser')
    data = soup.find('table')
    links.append(data)

But it gives me 4 identical tables, even though the data does change when I switch page numbers manually on the site.
How can I solve this?

1 answer(s)

soremix, 2020-11-13
@san_m_m

Your requests are missing the cookies that tell the site how many results to show per page (10 or more). Without those cookies the site does not paginate at all, so every page URL returns the same table.
I did it like this, and everything works:

import requests
from bs4 import BeautifulSoup

s = requests.Session()

# Browser-like User-Agent, plus the cookies that control results per page
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36'}
cookies = {'abc': 'all', 'count': '10'}

# Open the first page once so the session keeps any cookies the server sets
s.get('https://mintrans.novreg.ru/perm/list.html', headers=headers, cookies=cookies)

links = []
for i in range(1, 4):
    # %7B and %7D are URL-encoded braces: list{page-N}.html
    r = s.get(f'https://mintrans.novreg.ru/perm/list%7Bpage-{i}%7D.html', headers=headers, cookies=cookies)
    soup = BeautifulSoup(r.text, 'html.parser')
    data = soup.find('table')  # first <table> element on the page
    links.append(data)
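
The loop above only collects the raw `<table>` elements. If you need the cell values, something like the sketch below might help. The function name `table_to_rows` and the sample HTML are made up for illustration. The real table on the site may be laid out differently (nested tables, merged cells), and this sketch does not handle that.

```python
from bs4 import BeautifulSoup


def table_to_rows(table):
    """Return the table as a list of rows, each row a list of cell strings."""
    rows = []
    for tr in table.find_all('tr'):
        # Header cells (<th>) and data cells (<td>) are treated alike
        cells = [cell.get_text(strip=True) for cell in tr.find_all(['th', 'td'])]
        if cells:  # skip rows that contain no cells
            rows.append(cells)
    return rows


# Made-up sample markup standing in for one downloaded page
sample_html = """
<table>
  <tr><th>No.</th><th>Carrier</th></tr>
  <tr><td>1</td><td> Example LLC </td></tr>
  <tr><td>2</td><td>Sample Trans</td></tr>
</table>
"""

sample_table = BeautifulSoup(sample_html, 'html.parser').find('table')
rows = table_to_rows(sample_table)
print(rows)
```

In the loop, `links.append(data)` could then become `links.extend(table_to_rows(data))`. Check `data` for `None` first, because `soup.find('table')` returns `None` when a page has no table.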