How do I find an entire link in a string?
I need to find a particular link in a string and cut it out. Currently I find links like this:
import re

pattern = r'<a rel="(.+?)">'
# findall returns only the captured group (the rel value), not the whole tag
s = re.findall(pattern, item.content)
for string in s:
    ...
One option (not the only possible solution) is to split the string with re.split:
import re

pattern = r'(<a rel=")(.+?)(">)'
# Because the pattern has capturing groups, re.split keeps the matched pieces in the result
splitted = re.split(pattern, html_str)
# splitted == ['<html>...', '<a rel="', 'http://site.com/image1.jpg', '">', '<div>...', '<a rel="', 'http://site.com/image2.jpg', '">', ...]
# Every fourth element, starting at index 2, is a URL
urls = splitted[2::4]
# urls == ['http://site.com/image1.jpg', 'http://site.com/image2.jpg', ...]
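To cut the links out, remove each group of three elements such as
['<a rel="', 'http://site.com/image2.jpg', '">']
from the splitted list, or replace them with something (for example, "link name"), and then join the list back together:
cleaned_html_str = ''.join(splitted)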
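The removal step can be sketched roughly as follows. This is only an illustration: `cut_links`, its `replacement` parameter, and the `sample` string are hypothetical names invented for the example, and it assumes every link in the input has exactly the `<a rel="...">` form shown above.

```python
import re

# Same pattern as in the answer: three groups, so re.split keeps the tag pieces.
pattern = r'(<a rel=")(.+?)(">)'

def cut_links(html_str, replacement=''):
    """Hypothetical helper: return (urls, string with the links cut out)."""
    splitted = re.split(pattern, html_str)
    # Index 2, 6, 10, ... holds the captured URLs.
    urls = splitted[2::4]
    # Index 0, 4, 8, ... holds the text between links; joining only those
    # drops each ['<a rel="', url, '">'] triple, or swaps it for `replacement`.
    cleaned_html_str = replacement.join(splitted[0::4])
    return urls, cleaned_html_str

# Made-up input for illustration only.
sample = ('<p>one</p><a rel="http://site.com/image1.jpg">'
          '<p>two</p><a rel="http://site.com/image2.jpg"><p>three</p>')
urls, cleaned = cut_links(sample)
print(urls)
print(cleaned)
```

Passing `replacement='link name'` would put that text where each link used to be instead of removing it outright. For HTML that is less regular than this, a real HTML parser is likely a safer choice than a regular expression.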