How do I replace links in text with C#, excluding tag attributes?
I need to wrap every link in the text in an <a href=""></a> tag. Links that are already tag attributes must be left unchanged (for example: <a href="http:// toster. ru/">Toster</a> or <img src="http:// mysite. com/photo. jpg" />). (I inserted spaces into the URLs because Toster would otherwise auto-link them.)
Sample text:
The principle of perception http://google.ru impartially creates www.ya.ru
palliative intellect, [email protected] conditionally. The concept
<a href="http://mail.ru">mentally</a> isolates
<img src="http://bing.com/images/01.jpg" /> the law of the external world.
Expected result:
The principle of perception <a href="https://google.ru/">https://google.ru</a>
impartially creates <a href="http://www.ya.ru/">www.ya.ru</a>
palliative intellect, [email protected] conditionally. The concept
<a href="http://mail.ru">mentally</a> isolates
<img src="http://bing.com/images/01.jpg" /> the law of the external world.
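Current code:
using System;
using System.Text.RegularExpressions;

// The class name is illustrative; an extension method has to live in a static class.
public static class HtmlLinkExtensions
{
    // Three alternatives: a URL wrapped in parentheses, a URL wrapped in a
    // repeated marker character (= ~ | _ #), and a bare URL.
    private static readonly Regex regExHttpLinks = new Regex(@"(?<=\()\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\))|(?<=(?<wrap>[=~|_#]))\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\k<wrap>)|\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string ParseHtml(this string source)
    {
        if (string.IsNullOrEmpty(source))
            return source;

        // Mask periods between digits with a placeholder; the same placeholder
        // is written into the generated links and restored at the end.
        var periodReplacement = "[]";
        source = Regex.Replace(source, @"(?<=\d)\.(?=\d)", periodReplacement);

        var linkMatches = regExHttpLinks.Matches(source);
        foreach (Match match in linkMatches)
        {
            var m = match.ToString();
            // Prepend a scheme to links that have none (e.g. "www.ya.ru").
            string s = (m.Contains("://")) ? m : "http://" + m;
            // String.Replace rewrites every occurrence of the matched text,
            // wherever it appears in the source.
            source = source.Replace(m,
                String.Format("<a href=\"{0}\" title=\"{0}\">{1}</a>",
                    s.Replace(".", periodReplacement).ToLower(),
                    m.Replace(".", periodReplacement)));
        }

        source = source.Replace(periodReplacement, ".");
        return source;
    }
}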
Good afternoon.
The problem is the third alternative of the regular expression: it does not check the context in which it matches at all. If you add a negative lookbehind for the tag attributes in front of it, you should get what you need.
Try it like this:
(?<=\()\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\))|(?<=(?<wrap>[=~|_#]))\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\k<wrap>)|(?<!((a\shref=\")|(img src=")))\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]
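To show how that lookbehind could be wired into the code, here is a minimal sketch. It is my assumption of a usage, not part of the original answer. For brevity it keeps only the last alternative of the pattern, and the names LinkifySketch, BareLink and Linkify are made up for the example. It also swaps String.Replace for Regex.Replace with a MatchEvaluator, so that each match is rewritten at its own position and an identical URL inside an attribute is not touched.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkifySketch
{
    // Last alternative of the pattern above, with the negative lookbehind in front.
    // Quotes are doubled because this is a C# verbatim string.
    private static readonly Regex BareLink = new Regex(
        @"(?<!((a\shref="")|(img src="")))\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string Linkify(string source)
    {
        if (string.IsNullOrEmpty(source))
            return source;

        // The evaluator runs once per match and only replaces that match,
        // unlike String.Replace, which rewrites every occurrence of the text.
        return BareLink.Replace(source, match =>
        {
            string text = match.Value;
            // Prepend a scheme to links that have none (e.g. "www.ya.ru").
            string href = text.Contains("://") ? text : "http://" + text;
            return string.Format("<a href=\"{0}\">{1}</a>", href, text);
        });
    }

    public static void Main()
    {
        string input = "see http://google.ru and <a href=\"http://mail.ru\">mail</a>";
        // Prints: see <a href="http://google.ru">http://google.ru</a> and <a href="http://mail.ru">mail</a>
        Console.WriteLine(Linkify(input));
    }
}
```

As far as I can tell, the lookbehind only covers the exact prefixes a href=" and img src=". An attribute value such as href="http://www.ya.ru/" can still be matched starting at www., and URLs used as the text of an existing <a> element are not protected either. Both cases would need extra checks or a real HTML parser.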