Why do I get the wrong encoding when parsing HTML?
I'm parsing the page https://classinform.ru/fkko-2017.html.
In the browser everything displays correctly, and copying the text by hand also works fine. But when I call UrlFetchApp.fetch(), the Cyrillic text turns into �, even though the encoding is supposedly UTF-8.
Request parameters:
var options = {
  "method": "get",
  "headers": {}
};
Strictly speaking, you should always specify the encoding when fetching content. It just so happens that everyone is used to UTF-8 being the default.
Specify the encoding of your content when reading the response text:
const data = UrlFetchApp.fetch('https://classinform.ru/fkko-2017.html');
console.log(data.getContentText('windows-1251'));
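To see why the � characters appear, here is a small sketch that runs in Node.js or a browser rather than in Apps Script. It assumes the global TextDecoder is available. The helper decodeCp1251Basic is hypothetical, written only for this illustration, and covers just ASCII plus the basic Russian letters. The same bytes are read once as UTF-8 and once as windows-1251.

```javascript
// Bytes of the word "Привет" as stored in windows-1251 (one byte per letter).
var bytes = new Uint8Array([0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2]);

// Reading these bytes as UTF-8 fails: they do not form valid UTF-8 sequences,
// so the decoder substitutes U+FFFD (the replacement character shown as a
// question mark in a diamond).
var asUtf8 = new TextDecoder('utf-8').decode(bytes);

// Hypothetical minimal decoder, for illustration only. It handles ASCII and
// the windows-1251 range 0xC0-0xFF, which maps linearly to U+0410-U+044F.
// Other bytes (for example the letter "ё") are NOT handled correctly.
function decodeCp1251Basic(input) {
  var out = '';
  for (var i = 0; i < input.length; i++) {
    var b = input[i];
    out += String.fromCharCode(b >= 0xC0 ? b + 0x350 : b);
  }
  return out;
}

var asCp1251 = decodeCp1251Basic(bytes);

console.log(asUtf8);   // replacement characters instead of letters
console.log(asCp1251); // the original word
```

In Apps Script itself none of this is needed, since getContentText('windows-1251') performs the decoding.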
The page is encoded in cp1251 (windows-1251). This encoding is declared in a meta tag on the page:
<meta http-equiv="content-type" content="text/html; charset=windows-1251">
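If the encoding is not known in advance, one possible approach is to read it from the Content-Type header or from the meta tag before decoding. This is a rough sketch, not a full implementation. The names detectCharset and fetchDecoded are hypothetical, the regular expression is deliberately simplistic, and the header key 'Content-Type' is an assumption about how the server spells it.

```javascript
// Hypothetical helper: extract a charset name from a Content-Type header
// value, or failing that from the first part of the HTML (where a
// <meta ... charset=...> tag normally lives). Falls back to 'utf-8'.
function detectCharset(contentTypeHeader, html) {
  var pattern = /charset\s*=\s*["']?([\w-]+)/i;
  var fromHeader = pattern.exec(contentTypeHeader || '');
  if (fromHeader) return fromHeader[1].toLowerCase();
  var fromMeta = pattern.exec((html || '').slice(0, 2048));
  if (fromMeta) return fromMeta[1].toLowerCase();
  return 'utf-8';
}

// Apps Script usage sketch. UrlFetchApp exists only inside Apps Script,
// so this function is defined here but not called.
function fetchDecoded(url) {
  var response = UrlFetchApp.fetch(url);
  // The meta tag is plain ASCII, so a first pass with the default decoding
  // is enough to find it even if the Cyrillic text comes out garbled.
  var rough = response.getContentText();
  var charset = detectCharset(response.getHeaders()['Content-Type'], rough);
  return response.getContentText(charset);
}
```

For the page in question, detectCharset should pick up windows-1251 from the meta tag when the header does not name a charset.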