How to parse HTML using HttpClient?
Hello.
I need to parse HTML data from a web page. To do this, I use the following code (C#, .NET):
string pathToHtml = "link";
WebClient client = new WebClient();
var data = client.DownloadData(pathToHtml);
var html = Encoding.UTF8.GetString(data);
// Create the local variable "doc".
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// Load the HTML code into the local variable "doc".
doc.LoadHtml(html);
var x = doc.DocumentNode.SelectNodes("XPath expression").Elements("tr").ToList();
As far as I know, WebClient is not supported in .NET Core (source: stackoverflow.com), so for .NET Core I need to use HttpClient, or at least it is the recommended option.
There are many ways to do this, but I will suggest a universal one. It is a crutch, but it is short and needs no third-party libraries:
using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

namespace ConsoleApplication3
{
    public static class Program
    {
        // A single HttpClient instance is reused for all requests.
        private static readonly HttpClient client = new HttpClient();

        // async Main requires C# 7.1 or later.
        private static async Task Main()
        {
            await ShowTags("https://www.yandex.ru/", "a");
            Console.ReadKey();
        }

        // The default tag to search for is "a", i.e. we look for <a></a> tags.
        private static async Task ShowTags(string my_url, string tag = "a")
        {
            // Download the page; null means the download failed.
            string data = await GetHtmlPageText(my_url);
            if (data != null)
            {
                string pattern = string.Format(@"<{0}\b[^>]*>(?<tegData>.+?)</{0}>", Regex.Escape(tag.Trim()));
                // <{0}\b[^>]*> - opening tag (\b keeps "a" from matching "abbr")
                // </{0}> - closing tag
                // (?<tegData>.+?) - tag content, captured into the tegData group
                // Singleline lets "." match line breaks inside the tag content.
                Regex regex = new Regex(pattern,
                    RegexOptions.ExplicitCapture | RegexOptions.Singleline | RegexOptions.IgnoreCase);
                MatchCollection matches = regex.Matches(data);
                foreach (Match match in matches)
                {
                    Console.WriteLine(match.Value);
                    Console.WriteLine("Content:");
                    Console.WriteLine(match.Groups["tegData"].Value);
                    Console.WriteLine("---------------------------");
                }
            }
            else
            {
                Console.WriteLine("Error loading the page: " + my_url);
            }
        }

        private static async Task<string> GetHtmlPageText(string url)
        {
            try
            {
                // Send the request with HttpClient.
                using (HttpResponseMessage response = await client.GetAsync(url))
                {
                    response.EnsureSuccessStatusCode();
                    // Read the response body as a string.
                    return await response.Content.ReadAsStringAsync();
                }
            }
            catch (Exception ex) when (ex is HttpRequestException || ex is TaskCanceledException)
            {
                // Network error, non-success status code, or timeout.
                return null;
            }
        }
    }
}

Example output for one of the links:

<a href="http://mail.yandex.ru"onclick="c(this,17,1080)">Войти в почту</a>
Content:
Войти в почту
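
The regex approach above can also be split into two independent pieces, which makes it easier to try without network access. The following is only a minimal sketch under a few assumptions: the names TagExtractor, DownloadAsync and ExtractTagContents are hypothetical, the page is small enough to hold in memory, and the pattern is still a crutch that does not handle nested tags of the same name.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class TagExtractor
{
    // One shared HttpClient for the lifetime of the application.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the page body as a string.
    // Throws (for example HttpRequestException) if the request fails.
    public static Task<string> DownloadAsync(string url)
    {
        return Client.GetStringAsync(url);
    }

    // Returns the inner content of every <tag ...>...</tag> pair found in html.
    public static List<string> ExtractTagContents(string html, string tag)
    {
        string name = Regex.Escape(tag.Trim());
        // \b keeps "a" from matching "abbr"; Singleline lets "." span line breaks.
        string pattern = "<" + name + @"\b[^>]*>(?<content>.*?)</" + name + @"\s*>";
        var regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);

        var result = new List<string>();
        foreach (Match match in regex.Matches(html))
        {
            result.Add(match.Groups["content"].Value);
        }
        return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        // In-memory sample, so the extraction can be checked without a network call.
        string html = "<p><a href=\"/mail\">Mail</a> <abbr>HTML</abbr> <A>two\nlines</A></p>";
        foreach (string content in TagExtractor.ExtractTagContents(html, "a"))
        {
            Console.WriteLine(content.Replace("\n", " "));
        }
    }
}
```

For the sample string this prints "Mail" and "two lines", and the abbr element is skipped. If HtmlAgilityPack is preferred, as in the question, the string returned by DownloadAsync can be passed to doc.LoadHtml(html) in place of the WebClient download.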