Is it possible to download an article from Wikipedia using C#?
Hello everyone. I have a music site written in ASP.NET, and I need the artist's page to display information about the artist from Wikipedia. The question is how to write a function that pulls the text from a Wikipedia page. Thanks in advance!
https://www.mediawiki.org/wiki/API:Main_page/ru
https://www.google.ru/search?q=c%20sharp%20http%20...
Definitely possible. There are two options:
- Pure JS on the client: the browser does everything itself. (Search for how to send a GET request to another site, then receive and parse the response, keeping only the part you need, if there is one.)
- ASP.NET: the work is done on the server and the result is sent straight to the client. (Consider whether the server really needs to do this.)
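The server-side option could be sketched roughly as follows. This is only an illustration, not a tested integration: it assumes the MediaWiki Action API on en.wikipedia.org with the TextExtracts parameters (prop=extracts, exintro, explaintext), a runtime where System.Text.Json is available and C# 8 or later, and the names WikipediaApi, BuildExtractUrl, ParseExtract and GetIntroAsync as well as the User-Agent string are made up for the example.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper: none of these names come from the thread.
public static class WikipediaApi
{
    private static readonly HttpClient Http = new HttpClient();

    // Builds the query URL for the plain-text intro of one article.
    // formatversion=2 makes "pages" a JSON array instead of an object keyed by page id.
    public static string BuildExtractUrl(string title) =>
        "https://en.wikipedia.org/w/api.php?action=query&prop=extracts" +
        "&exintro=1&explaintext=1&redirects=1&format=json&formatversion=2" +
        "&titles=" + Uri.EscapeDataString(title);

    // Returns the first "extract" found in the response, or "" if there is none
    // (for example when the page is missing).
    public static string ParseExtract(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        if (!doc.RootElement.TryGetProperty("query", out JsonElement query) ||
            !query.TryGetProperty("pages", out JsonElement pages) ||
            pages.ValueKind != JsonValueKind.Array)
        {
            return "";
        }
        foreach (JsonElement page in pages.EnumerateArray())
        {
            if (page.TryGetProperty("extract", out JsonElement extract))
            {
                return extract.GetString() ?? "";
            }
        }
        return "";
    }

    // Downloads and parses the intro text on the server.
    public static async Task<string> GetIntroAsync(string title)
    {
        using var request = new HttpRequestMessage(HttpMethod.Get, BuildExtractUrl(title));
        // Placeholder User-Agent; replace with something that identifies your site.
        request.Headers.TryAddWithoutValidation("User-Agent", "MusicSiteDemo/0.1 (contact@example.com)");
        using HttpResponseMessage response = await Http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ParseExtract(await response.Content.ReadAsStringAsync());
    }
}
```

From a controller or page handler this would be called as `string intro = await WikipediaApi.GetIntroAsync("Gorillaz");`. Caching the result on the server is probably worthwhile, so that not every page view triggers a request to Wikipedia.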
I could not figure out the API, but I solved the problem by parsing the page instead.
I used HtmlAgilityPack to extract the article text from the page. Here is the code I tested the parsing with, in case it is useful to someone:
using System;
using System.Text;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Place this method inside any class.
public static void GetArticle()
{
    string url = "https://en.wikipedia.org/wiki/Gorillaz";
    var web = new HtmlWeb
    {
        AutoDetectEncoding = false,
        OverrideEncoding = Encoding.UTF8,
    };
    HtmlDocument doc = web.Load(url); // Download the whole HTML page

    // From the element with the class 'mw-content-ltr', take all the text
    // inside its <p> tags (contains() also matches if the div has extra classes).
    HtmlNodeCollection paragraphs = doc.DocumentNode.SelectNodes(
        "//div[contains(@class, 'mw-content-ltr')]/p");

    var outputText = new StringBuilder();
    // Check that matching nodes were found
    if (paragraphs != null)
    {
        foreach (HtmlNode paragraph in paragraphs)
        {
            // Append each paragraph; a plain assignment would keep only the last one
            outputText.AppendLine(paragraph.InnerText);
        }
    }
    Console.WriteLine(outputText.ToString());
}
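One thing to watch with this kind of parsing: the InnerText of a paragraph can still contain HTML entities (such as &amp; or &#91;) and footnote markers like [1]. A small cleanup step might look like the sketch below. The names ArticleText and Clean are hypothetical, and the pattern only covers numeric markers, so treat it as a starting point rather than a complete solution.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for tidying text taken from HtmlNode.InnerText.
public static class ArticleText
{
    // Matches numeric footnote markers such as [1] or [23].
    private static readonly Regex CitationMarker =
        new Regex(@"\[\d+\]", RegexOptions.Compiled);

    public static string Clean(string innerText)
    {
        // Decode entities first, so that &#91;1&#93; becomes [1] ...
        string decoded = WebUtility.HtmlDecode(innerText);
        // ... then drop the footnote markers and surrounding whitespace.
        return CitationMarker.Replace(decoded, "").Trim();
    }
}
```

Inside the loop, the paragraph text would then be passed through `ArticleText.Clean(...)` before it is appended to the output.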