How do I quickly parse links on a portal?
I need to find broken links on a portal.
I download an HTML page, find all the links on it, and add them to the download queue. Along the way, I flag the broken or invalid ones.
try
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(link.ToString());
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/15.0";
    // The Accept property takes only the header value, without the "Accept: " prefix.
    request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    request.AllowAutoRedirect = true;
    request.Timeout = 10000; // milliseconds
    // The using blocks release the response and reader even if reading throws.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader respStream = new StreamReader(response.GetResponseStream()))
    {
        html = respStream.ReadToEnd();
    }
}
catch (Exception ex)
{
    // Report the broken link together with the page it was found on.
    System.Console.WriteLine("-------------\n" +
        "Bad link: " + link + "\n" +
        "From: " + link.Parent +
        "\n" + ex.Message);
    link.ErrorComments = ex.Message;
    link.Parent.AddSon(link);
    continue;
}
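The question describes finding all the links on a downloaded page and adding them to a queue, but that step is not shown in the snippet. The following is only a minimal sketch of how it might look. `LinkExtractor`, `ExtractLinks`, and `EnqueueNew` are hypothetical names that do not appear in the original code, and the regular expression is a deliberate simplification; a real HTML parser would be more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class LinkExtractor
{
    // Simplified pattern: captures the quoted href value of an <a> tag.
    private static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns absolute http/https links found in html, resolved against pageUri.
    public static List<Uri> ExtractLinks(string html, Uri pageUri)
    {
        var result = new List<Uri>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            Uri absolute;
            if (!Uri.TryCreate(pageUri, m.Groups[1].Value.Trim(), out absolute))
                continue; // malformed href, skip it
            if (absolute.Scheme != Uri.UriSchemeHttp &&
                absolute.Scheme != Uri.UriSchemeHttps)
                continue; // skip mailto:, javascript:, and similar schemes
            // Drop the #fragment so the same page is not queued twice.
            result.Add(new Uri(absolute.GetLeftPart(UriPartial.Query)));
        }
        return result;
    }

    // Adds only links that have not been seen before to the download queue.
    public static void EnqueueNew(IEnumerable<Uri> links, Queue<Uri> queue, HashSet<Uri> seen)
    {
        foreach (Uri link in links)
        {
            if (seen.Add(link))
                queue.Enqueue(link);
        }
    }
}
```

After `html` has been read for a page, `ExtractLinks(html, pageUri)` would yield the candidates, and `EnqueueNew` would keep the queue free of duplicates, so each URL is requested at most once.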