Python bs4: how do I select an element that has no class?
Hello. I need to write a small parser in Python.
I ran into a problem extracting the text I need: it sits in a div element that has neither an ID nor a class, so it is hard to get at without resorting to hacks.
Does anyone know a more or less universal solution for such cases?
Here's one of the links
I'm talking about the div that contains the lyrics.
Dirty work calls for desperate measures.
soup.find("div", string="Usage of azlyrics.com content by any third-party lyrics provider is prohibited by our licensing agreement. Sorry about that.")
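A note on the line above: `find` with a tag name and `string=` only matches a tag whose single child is exactly that string. If the sentence is actually an HTML comment sitting inside the lyrics div next to other text (an assumption about the page, not verified here), one alternative is to locate the comment node and take its parent. A minimal sketch; `SAMPLE_HTML`, `find_div_by_comment` and `visible_text` are made-up names for illustration, and the sample markup only imitates the layout described in the question:

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical markup imitating the page: the target div has no id or class,
# but it starts with a recognizable HTML comment (assumption).
SAMPLE_HTML = """
<div class="header">Site header</div>
<div>
<!-- Usage of azlyrics.com content by any third-party lyrics provider is prohibited by our licensing agreement. Sorry about that. -->
First line<br>
Second line
</div>
<div class="footer">Site footer</div>
"""

MARKER = "Usage of azlyrics.com content"


def find_div_by_comment(html, marker=MARKER):
    """Return the div that contains a comment with `marker`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Look for a Comment node whose text contains the marker.
    comment = soup.find(string=lambda s: isinstance(s, Comment) and marker in s)
    if comment is None:
        return None
    return comment.find_parent("div")


def visible_text(div):
    """Join the non-comment, non-empty strings of the div, one per line."""
    parts = []
    for s in div.find_all(string=True):
        if isinstance(s, Comment):
            continue  # skip the marker comment itself
        text = s.strip()
        if text:
            parts.append(text)
    return "\n".join(parts)


if __name__ == "__main__":
    div = find_div_by_comment(SAMPLE_HTML)
    if div is not None:
        print(visible_text(div))
```

This ties the lookup to something stable inside the block rather than to its position on the page. If the comment is removed or reworded, `find_div_by_comment` returns None instead of silently returning the wrong div.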
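import requests
from bs4 import BeautifulSoup


def get_html(url):
    # Download the page and return its HTML as text.
    r = requests.get(url)
    return r.text


def get_data(html):
    soup = BeautifulSoup(html, 'lxml')
    divs = soup.find_all('div')
    # The lyrics div has no id or class, so it is picked by position.
    # Index 21 is tied to the current page layout and breaks if that changes.
    return divs[21].text


def main():
    url = 'https://www.azlyrics.com/lyrics/imaginedragons/roots.html'
    print(get_data(get_html(url)))


if __name__ == '__main__':
    main()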
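As for a more general approach than a fixed index: `find_all` accepts a function, so one option is to filter for divs that carry neither a class nor an id and then pick the most likely candidate, for example the one with the most text. This is only a sketch under that heuristic, not a guaranteed fit for any particular site; `SAMPLE_HTML`, `is_bare_div` and `longest_bare_div_text` are hypothetical names, and the markup is invented for the example:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two "bare" divs among ones that have an id or a class.
SAMPLE_HTML = """
<div id="nav">Home</div>
<div class="ad">Buy now</div>
<div>Short note</div>
<div>Line one of the long text<br>Line two of the long text</div>
"""


def is_bare_div(tag):
    """True for a div that has neither a class nor an id attribute."""
    return (
        tag.name == "div"
        and not tag.has_attr("class")
        and not tag.has_attr("id")
    )


def longest_bare_div_text(html):
    """Return the text of the bare div with the most text, or None."""
    soup = BeautifulSoup(html, "html.parser")
    candidates = soup.find_all(is_bare_div)
    if not candidates:
        return None
    # Heuristic (assumption): the wanted block is the longest one.
    best = max(candidates, key=lambda d: len(d.get_text(strip=True)))
    return best.get_text("\n", strip=True)


if __name__ == "__main__":
    print(longest_bare_div_text(SAMPLE_HTML))
```

One caveat: a bare wrapper div that encloses the whole page would win the length comparison, so on real pages it may be worth narrowing the search to a known container first and running the filter inside it.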