How to find sentences containing a URL on a web page?
The task: find all sentences in the HTML markup that contain at least one URL-like substring.
A URL can have the form aaa.bbb....(/dir/page/?asdf); the expression \S*?\.([a-z.])+(/.*?\s)? is suitable for matching them.
Whether a match is an actual link or just link-like text does not matter, and sentences may contain tags, etc.
I want to understand whether the following algorithm can be implemented with regular expressions alone (without additional code in the host language):
I find a URL, for example with the pattern above, then search backward to the nearest "dot + whitespace" combination and forward to the next such combination, and take everything between those two positions as the result.
PS. I'm using Python, but any compatible engine will do.
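One way to sketch this with a single pattern is to avoid searching backward at all: describe a sentence as a run of characters that never crosses a "dot + whitespace" boundary and require a URL-like token somewhere inside that run. The example below is only a rough sketch under assumptions that are not in the question: the URL pattern is a simplified stand-in for the one above (dot-separated labels ending in at least two letters, plus an optional path), end of input also counts as a sentence boundary, and the names BOUNDARY, INSIDE, URL and sentences_with_urls are made up for illustration.

```python
import re

# Sentence boundary as described in the question: a dot followed by whitespace.
# Treating end of input as a boundary too is an assumption of this sketch.
BOUNDARY = r"\.(?=\s|$)"

# Any single character that does not start a sentence boundary.
INSIDE = rf"(?:(?!{BOUNDARY}).)"

# Simplified URL-like token (an assumption, not the asker's exact pattern):
# labels joined by dots, then an optional path that stops before a boundary dot.
URL = rf"[\w-]+(?:\.[a-z]{{2,}})+(?:/(?:(?!{BOUNDARY})\S)*)?"

# Leading whitespace is skipped; the captured group is the sentence itself:
# text before the URL, the URL, text after it, and the closing dot if present.
SENTENCE_WITH_URL = re.compile(
    rf"\s*({INSIDE}*?{URL}{INSIDE}*(?:{BOUNDARY})?)",
    re.IGNORECASE | re.DOTALL,
)


def sentences_with_urls(text):
    """Return every sentence of `text` that contains a URL-like substring."""
    return SENTENCE_WITH_URL.findall(text)


sample = (
    'Hello world. See <a href="http://example.com/a">example.com/a</a> '
    "for details. Nothing here. Docs at aaa.bbb.cc/dir/page/?asdf. Bye."
)
for sentence in sentences_with_urls(sample):
    print(sentence)
```

Dots inside a URL are never followed by whitespace, so they do not end the sentence, while a sentence without a URL-like token simply fails to match and is skipped. The pattern relies only on lookahead, so it should work in most engines that support it. It will also pick up things like file names (report.txt), much as the original expression would.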