Why can't I fetch the page I need from the Internet in C#?
Hello everybody!
I'm new to C# and am trying to parse some web pages. I'm writing the parser in Visual Studio 2015. I ran into this problem: when I request the page I need through a proxy, I get the following text instead of the page:
package ru.sbogomolov.template;
public class servletBase extends HttpServlet
error: java.lang.NullPointerException
This happens only when I request one particular class of pages; all other pages of the site load fine. Could this be protection against automated page reading, and if so, how can it be bypassed? I have checked that the page I request does exist on the site.
The code I use:
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://id.npfte.ru/identDeclDrug/main?act=submit&countryE=countryE_RU&codeCD=codeCD_%D0%A4%D0%9C01&CodeSybbol=CodeSybbol_%D0%94&shortNum=54904&DateTypeSearch=beginDate&day=day_26&mount=mount_11&year=year_2014");
WebProxy myproxy = new WebProxy(textBox1.Text, Convert.ToInt32(textBox2.Text));
req.Proxy = myproxy;
req.Timeout = 50000; // set the timeout (wait up to 50 seconds for a response)
try
{
System.Diagnostics.Stopwatch swatch = new System.Diagnostics.Stopwatch(); // create the stopwatch
swatch.Start(); // start timing
// get the response; the input stream is taken from it below
HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
swatch.Stop(); // stop timing
MessageBoxButton buttOk = MessageBoxButton.OK;
MessageBox.Show("Elapsed time, s = " + swatch.ElapsedMilliseconds / 1000.0, "Notification", buttOk);
StreamReader istrm = new StreamReader(resp.GetResponseStream(), Encoding.GetEncoding(1251));
//resp.Close();
for (int i = 1; ; i++)
{
ch = istrm.Read();
if (ch == -1) break; // end of stream
if (ch == 10) // line feed
{
textBox.Text = textBox.Text + textBuffer + Environment.NewLine;
textBuffer = "";
continue; // do not carry the '\n' over into the next line
}
textBuffer = textBuffer + (char)ch;
}
textBox.Text = textBox.Text + textBuffer;
MessageBox.Show("Done!", "Notification", buttOk);
// close the response; this also closes the input stream istrm
resp.Close();
}
catch (System.Net.WebException ex)
{
MessageBoxButton buttOk = MessageBoxButton.OK; // failed to connect to the proxy! :(
MessageBox.Show(ex.Message + " (" + textBox1.Text + ":" + textBox2.Text + ")", "Error", buttOk);
}
}
> Could this be protection against automatic reading of pages? And how can this be bypassed?
Maybe. You can get around it by using a sniffer (for example, Fiddler) to inspect the requests the browser sends and then reproducing them exactly in C# (all headers, cookies, and so on).
You can also experiment with headers and cookies in Fiddler itself (the Composer tab).
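For the cookie part, a minimal sketch of what "imitating the browser" could look like with HttpWebRequest. It assumes the site hands out a session cookie on the form page and expects it back on the search request, which is only a guess. The class name CookieSketch, the helper CreateRequest, and the shortened query string are made up for illustration.

```csharp
using System;
using System.Net;

public static class CookieSketch
{
    // Creates a request bound to a shared cookie jar. Without a CookieContainer,
    // HttpWebRequest does not keep the cookies the server sets.
    public static HttpWebRequest CreateRequest(string url, CookieContainer jar)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.CookieContainer = jar;
        return req;
    }

    public static void Main()
    {
        var jar = new CookieContainer();

        // Assumption: the form page lives at this address and sets the session cookie.
        var formPage = CreateRequest("http://id.npfte.ru/identDeclDrug/main", jar);

        // Query string shortened for illustration; use the full one from the question.
        var search = CreateRequest("http://id.npfte.ru/identDeclDrug/main?act=submit", jar);

        // In real use: call formPage.GetResponse() first (and close it),
        // then search.GetResponse(); the jar carries the cookies across.
        Console.WriteLine("Shared jar: " +
            ReferenceEquals(formPage.CookieContainer, search.CookieContainer));
    }
}
```

If the proxy from the question is needed, set req.Proxy on both requests the same way as in the original code.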
Perhaps the request is missing some headers, so the server refuses to build the page. Try adding headers like these:
Accept:*/*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4
User-Agent:… WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
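A rough sketch of how such headers could be set on HttpWebRequest. The class name BrowserLikeRequest and the userAgent parameter are hypothetical; copy the full User-Agent string from your own browser (Fiddler shows it). Instead of sending Accept-Encoding by hand, the sketch uses AutomaticDecompression, which asks for gzip/deflate and unpacks the body for you; sdch is left out on the assumption that it is not needed.

```csharp
using System;
using System.Net;

public static class BrowserLikeRequest
{
    // Builds a request that carries browser-like headers.
    // userAgent: the complete User-Agent string copied from a real browser.
    public static HttpWebRequest Create(string url, string userAgent)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);

        // Accept and User-Agent are set through dedicated properties.
        req.Accept = "*/*";
        req.UserAgent = userAgent;

        // Accept-Language can be set directly in the header collection.
        req.Headers[HttpRequestHeader.AcceptLanguage] =
            "ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4";

        // Requests gzip/deflate and transparently decompresses the response,
        // so the StreamReader still sees plain text.
        req.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;

        return req;
    }

    public static void Main()
    {
        // Placeholder value; replace with the string your browser sends.
        var req = Create("http://id.npfte.ru/identDeclDrug/main",
                         "PASTE-YOUR-BROWSER-USER-AGENT-HERE");
        Console.WriteLine(req.Headers);
    }
}
```

If you set Accept-Encoding manually without enabling decompression, the response stream may arrive compressed and look like garbage when read as Windows-1251 text.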
It looks like the error is on the server itself. Post a link to the page that is giving the error.
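To check whether the server itself is failing, it may help to look at the HTTP status code next to the body. Below is a small sketch under that assumption. ResponseDiagnostics and Fetch are made-up names, and the code only reports what the server sent; it does not fix anything.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class ResponseDiagnostics
{
    // Returns "<status code> <description>" followed by the body,
    // even when the server answers with a 4xx/5xx status.
    public static string Fetch(HttpWebRequest req, Encoding encoding)
    {
        HttpWebResponse resp;
        try
        {
            resp = (HttpWebResponse)req.GetResponse();
        }
        catch (WebException ex) when (ex.Response is HttpWebResponse)
        {
            // For HTTP error statuses GetResponse throws, but the
            // server's reply is still available on the exception.
            resp = (HttpWebResponse)ex.Response;
        }

        using (resp)
        using (var reader = new StreamReader(resp.GetResponseStream(), encoding))
        {
            return (int)resp.StatusCode + " " + resp.StatusDescription
                + Environment.NewLine + reader.ReadToEnd();
        }
    }
}
```

Called as ResponseDiagnostics.Fetch(req, Encoding.GetEncoding(1251)) with the request from the question, a 500 status together with the NullPointerException text would point to a server-side failure rather than to deliberate blocking.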