How do I fix an encoding bug when loading a page through WebBrowser in C#?
Good afternoon, comrades. I ran into the following glitch. I am parsing JSON from a server: I point a WebBrowser control at the page, read the data into a stream, cut out just the JSON part (about 9 MB) and work with it from there. The problem is the capital Russian letter "Р" (Er): it comes out broken, as "�".
Perhaps this happens with other capital letters as well.
When I save the same JSON manually through the Opera browser, it is saved correctly. I read somewhere that this is related to a quirk of the UTF-8 encoding. Is there a way to fix this without replacing the WebBrowser control? I know I could write a workaround that patches the data after receiving it, but I would really rather not.
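To illustrate the kind of damage that produces "�", here is a minimal sketch. It assumes (this is not confirmed for the case above) that somewhere between the server and the StreamReader the second byte of the letter gets altered, for example 0xA0 being turned into a plain space 0x20. The class name Utf8Demo and the helper Hex are made up for the example.

```csharp
using System;
using System.Text;

static class Utf8Demo
{
    // Formats bytes as space-separated hex, e.g. "D0 A0".
    public static string Hex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", " ");
    }

    public static void Main()
    {
        // "Р" (U+0420, Cyrillic capital Er) takes two bytes in UTF-8.
        byte[] good = Encoding.UTF8.GetBytes("\u0420");
        Console.WriteLine(Hex(good)); // D0 A0

        // Assumption: something upstream rewrote the second byte (0xA0) as 0x20.
        byte[] damaged = { 0xD0, 0x20 };
        string decoded = Encoding.UTF8.GetString(damaged);

        // The orphaned lead byte 0xD0 can no longer be decoded,
        // so the decoder substitutes U+FFFD, the replacement character.
        Console.WriteLine(decoded[0] == '\uFFFD'); // True
    }
}
```

If the bytes arriving in web.DocumentStream already look like the damaged pair, then the encoding passed to StreamReader is not the cause, and changing it will not help.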
This is how I start the thread:
Thread tr = new Thread(GetDoc);
tr.SetApartmentState(ApartmentState.STA);
tr.Start();
Thread.Sleep(20000);
tr.Abort();
// Wait for the thread to terminate
tr.Join();
static void GetDoc()
{
web = new WebBrowser();
web.DocumentCompleted += web_DocumentCompleted;
web.Navigate("site URL here");
Application.Run();
}
// Download the JSON and save it to the file source.json
static void web_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
File.Delete("source.json");
FileInfo MyFile = new FileInfo("source.json");
FileStream fs = MyFile.Create();
fs.Close();
FileStream fileStream = new FileStream("source.json", FileMode.Open);
StreamWriter streamWriter = new StreamWriter(fileStream);
streamWriter.BaseStream.Seek(fileStream.Length, SeekOrigin.End); // write to the end of the file
Encoding encoding = Encoding.GetEncoding("utf-8");
//Encoding encoding = Encoding.GetEncoding(web.Document.Encoding);
string temp = null;
Stream stream = web.DocumentStream;
StreamReader sr = new StreamReader(stream, encoding);
temp = sr.ReadToEnd();
stream.Close();
//string temp = web.DocumentText;
// Trim the surrounding HTML markup
Regex regex1 = new Regex("<BODY><PRE>");
Regex regex2 = new Regex("</PRE></BODY></HTML>");
Match m1 = regex1.Match(temp);
Match m2 = regex2.Match(temp);
temp = temp.Substring(m1.Index + 11, m2.Index - 11 - m1.Index);
streamWriter.Write(temp);
streamWriter.Close();
fileStream.Close();
downJSONok = true;
Thread.CurrentThread.Abort();
//MessageBox.Show("Download complete. Press OK to continue.", "Done", MessageBoxButtons.OK, MessageBoxIcon.Asterisk);
//Environment.Exit(0);
}
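The trimming step in the handler relies on two regexes and the hard-coded offset 11 (the length of "<BODY><PRE>"). Below is a minimal sketch of the same idea using plain index lookups. PreExtractor and ExtractPreContent are hypothetical names, and unlike the original it matches <PRE> on its own and ignores case; it assumes the tag carries no attributes.

```csharp
using System;

static class PreExtractor
{
    // Hypothetical helper: returns the text between the first <PRE>
    // and the last </PRE>, or null when either marker is missing.
    public static string ExtractPreContent(string html)
    {
        const string open = "<PRE>";
        const string close = "</PRE>";

        int start = html.IndexOf(open, StringComparison.OrdinalIgnoreCase);
        if (start < 0) return null;
        start += open.Length; // skip past the opening tag itself

        int end = html.LastIndexOf(close, StringComparison.OrdinalIgnoreCase);
        if (end < start) return null;

        return html.Substring(start, end - start);
    }

    public static void Main()
    {
        string page = "<HTML><BODY><PRE>{\"name\":\"\u0420\"}</PRE></BODY></HTML>";
        Console.WriteLine(ExtractPreContent(page));
    }
}
```

Returning null instead of throwing lets the caller tell "the page was not the expected JSON wrapper" apart from a successful download.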