How to parse titles in Python?

Evgeny Ivanov, 2017-03-23 13:44:02

Python

import urllib.request

from bs4 import BeautifulSoup


def get_html(url):
    # Download the page and return its raw bytes.
    response = urllib.request.urlopen(url)
    return response.read()


def parse(html):
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', cellspacing="1", class_='ipbtable')
    # Skip the first six rows, then drop the first two cells of each row.
    for row in table.find_all('tr')[6:]:
        cols = row.find_all('td')[2:]
        print(cols)


def main():
    parse(get_html('site'))


if __name__ == '__main__':
    main()

1 answer(s)

gill-sama, 2017-03-23
@gill-sama

Why do you need to search for the div separately?

for row in table.find_all('tr')[6:]:
    cols = row.find_all('td')[2:]
    if cols:
        print(cols[0].text)

If this works, it pulls out the text of the first remaining cell. If that text is not what you need, you can go deeper into the DOM tree by calling find / find_all on the cell.
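Going deeper into the cell could look roughly like the sketch below. The real page markup is not shown in the question, so SAMPLE_HTML, the nested div/a layout, and the helper name extract_titles are all assumptions made for illustration; adjust the tag names to whatever the actual cells contain.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page structure is not shown in the question.
SAMPLE_HTML = """
<table class="ipbtable" cellspacing="1">
  <tr><td>1</td><td>icon</td><td><div><a href="/topic/1">First title</a></div></td></tr>
  <tr><td>2</td><td>icon</td><td><div><a href="/topic/2">Second title</a></div></td></tr>
  <tr><td>3</td><td>icon</td></tr>
</table>
"""


def extract_titles(html, skip_rows=0):
    """Return the title text from the third cell of each table row."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', class_='ipbtable')
    if table is None:
        # No matching table on the page, so there is nothing to parse.
        return []
    titles = []
    for row in table.find_all('tr')[skip_rows:]:
        cols = row.find_all('td')[2:]
        if not cols:
            # Rows with fewer than three cells have no title cell.
            continue
        # Assumption: the title sits in a link inside the cell.
        # Fall back to the whole cell text when there is no link.
        link = cols[0].find('a')
        node = link if link is not None else cols[0]
        titles.append(node.get_text(strip=True))
    return titles


print(extract_titles(SAMPLE_HTML))
```

With the question's page you would pass skip_rows=6 to match the original [6:] slice. Returning a list instead of printing inside the loop also makes the result easier to check or reuse.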