How to make the parser output only text and links without HTML markup?
Apologies in advance for the shitty code, I'm just getting started :)
import requests
import bs4
import lxml
url = '*page_link*'
r = requests.get(url=url)
soup = bs4.BeautifulSoup(r.text, 'lxml')
quotes = soup.find_all('url', class_='*class_name*')
href = soup.find_all('a', class_ = '*class_name*')
print(quotes, href)
import requests
import bs4
import lxml
url = 'https://qna.habr.com'
r = requests.get(url=url)
soup = bs4.BeautifulSoup(r.text, 'lxml')
# quotes = soup.find_all('url', class_='*class_name*')
href = soup.find_all('a', class_ = 'question__title-link')
# print(quotes, href)
for x in href:
    link = x.get('href')  # Get the link address
    text = x.text.strip()  # Get the link text and strip extra spaces and line breaks
    print(text + ' - ' + link)
How to run ffmpeg on GPU in golang? - https://qna.habr.com/q/1033160
A test lab for learning DevOps based on Linux servers. Where to start learning? - https://qna.habr.com/q/1033364
...
Preloading images in wordpress? - https://qna.habr.com/q/1033300
Can't register a Steam account through my own domain. What should I do? - https://qna.habr.com/q/1033248
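If some of the matched tags have no href, or the href is relative, the loop above can fail or print partial addresses. Here is a rough sketch of one way to guard against that. It is not part of the original answer: the extract_links helper and the SAMPLE_HTML markup are made up for illustration, and it uses the built-in 'html.parser' instead of 'lxml' so it runs without a network request or extra parser.

```python
from urllib.parse import urljoin

import bs4

# Hypothetical sample markup standing in for a downloaded page (assumption).
SAMPLE_HTML = """
<ul>
  <li><a class="question__title-link" href="/q/1">  First question?  </a></li>
  <li><a class="question__title-link" href="https://example.com/q/2">Second <b>question</b>?</a></li>
  <li><a class="question__title-link">No link here</a></li>
</ul>
"""


def extract_links(html, base_url, class_name):
    """Return (text, absolute_url) pairs for matching <a> tags that have an href."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    results = []
    for tag in soup.find_all('a', class_=class_name):
        href = tag.get('href')
        if href is None:
            # Skip anchors without an address instead of failing on None.
            continue
        # get_text() drops nested markup; split/join collapses extra whitespace.
        text = ' '.join(tag.get_text().split())
        # urljoin turns relative addresses into absolute ones, leaves absolute ones alone.
        results.append((text, urljoin(base_url, href)))
    return results


if __name__ == '__main__':
    for text, link in extract_links(SAMPLE_HTML, 'https://qna.habr.com', 'question__title-link'):
        print(f'{text} - {link}')
```

To use it on a real page, pass r.text from requests.get in place of SAMPLE_HTML and the page address as base_url.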