Python
@nixbox, 2019-01-29 09:50:45

Beautiful Soup: how to find an HTML element containing a certain string if the tag is not known in advance?

Hello.
I need to find the element that contains a given string.
Currently the code looks like this:

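from urllib.request import urlopen
from bs4 import BeautifulSoup

html_page = urlopen(page_link)  # page_link: URL of the page to parse
soup = BeautifulSoup(html_page, "html.parser")

# Prints every tag whose text contains the string, ancestors included
for tag in soup.find_all():
    if 'string' in tag.text:
        print(tag.text)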

The problem is that this approach returns every tag whose text contains the string, including all ancestor tags from body downward.
How can I get only the tag that directly contains the string, without its parent elements?
The reason for finding the element is to be able to navigate to its child and parent elements afterwards.
Can you suggest other approaches, perhaps without Beautiful Soup?
Thank you.

2 answer(s)
nixbox, 2019-01-30

As Anton's answer shows, Beautiful Soup matches on the string itself.
I did not find a way to search for a string without specifying a tag and then work with the matched element.
To solve my problem, I wrote separate search functions for each tag that could occur.
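
For comparison with the per-tag functions described above, here is a minimal sketch of one possible alternative: find_all() also accepts a function as a filter, so a single predicate can be applied to every tag regardless of its name. The sample markup and the helper name directly_contains are made up for illustration; treat this as a sketch under those assumptions, not a drop-in solution.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical sample markup, for illustration only
html = """
<html><body>
  <div id="outer">
    <p class="note">some string here</p>
    <span>nothing relevant</span>
  </div>
</body></html>
"""

def directly_contains(needle):
    """Build a find_all() filter matching tags with `needle` in their own text nodes."""
    def predicate(tag):
        # Look only at the tag's immediate children, not at nested descendants
        return any(
            isinstance(child, NavigableString) and needle in child
            for child in tag.children
        )
    return predicate

soup = BeautifulSoup(html, "html.parser")
found = soup.find_all(directly_contains("string"))
for tag in found:
    # Each result is a regular Tag, so parent/child navigation works as usual
    print(tag.name, "parent:", tag.parent.name)
```

Because only immediate text children are inspected, ancestors such as body and div are skipped even though their combined text contains the string.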

Anton Fedoryan, 2019-01-29
@AnnTHony

import re

# Tags with the given name whose string matches the pattern
soup.find_all('tag_name', string=re.compile('search string'))
# or, using the older name for the same argument
soup.find_all('tag_name', text=re.compile('search string'))
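
When the tag name is not known, the same string argument can apparently be used on its own: find_all(string=...) returns the matching text nodes, and each node's .parent is the tag that directly contains it. The following is a minimal sketch assuming that behaviour; the sample markup and the helper name find_direct_containers are hypothetical.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only
sample = """
<html><body>
  <ul id="menu">
    <li><a href="/a">first string</a></li>
    <li><a href="/b">other</a></li>
  </ul>
</body></html>
"""

def find_direct_containers(soup, needle):
    """Return the tags whose own text nodes contain `needle`, in document order."""
    pattern = re.compile(re.escape(needle))
    seen = set()
    containers = []
    for text_node in soup.find_all(string=pattern):
        parent = text_node.parent
        # Deduplicate by identity: one tag may hold several matching text nodes
        if parent is not None and id(parent) not in seen:
            seen.add(id(parent))
            containers.append(parent)
    return containers

soup = BeautifulSoup(sample, "html.parser")
matches = find_direct_containers(soup, "string")
for tag in matches:
    # From here one can move up (tag.parent) or down (tag.children)
    print(tag.name, tag.get("href"), tag.parent.name)
```

Note that string searches also see comments and the contents of script or style elements, so extra filtering may be needed on real pages.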
