PHP
Andrew, 2019-02-20 11:26:58

How do I fix the text encoding in PHP?

Greetings!
Can you please tell me how to fix the encoding of text I receive?
I fetch the page with

$content = file_get_contents('http://vk.com/foaf.php?id=1');

and I need the string inside <foaf:name></foaf:name>.
The code itself looks like this:
<?php
$content = file_get_contents('http://vk.com/foaf.php?id=1');
preg_match_all('#<foaf:name>(.+?)</foaf:name>#is', $content, $arr);
print_r($arr[1]);
?>

or like this:
<?php
$content = file_get_contents('http://vk.com/foaf.php?id=529113');
$pos = strpos($content, '<foaf:name>');
$content = substr($content, $pos);
$pos = strpos($content, '</foaf:name>');
$content = substr($content, 0, $pos);
$content = str_replace('text that needs to be cut out', '', $content);
//$content = iconv("utf-8","windows-1251",$content); // change encoding
print $content;
?>


But the output shows question marks instead of Cyrillic. Latin characters display correctly.
I can't work out the cause. It is clearly an encoding problem, but I have not been able to resolve it.
What I tried:
Changed the encoding from utf-8 to windows-1251. Added this to the source file:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
Set it in the code:
<?php header("Content-Type: text/html; charset=windows-1251"); ?>
Changed the encoding of the file itself. Tried converting the text in the code with PHP's own functions:
$content1 = iconv("utf-8","windows-1251",$content); // change encoding

Created an .htaccess file containing AddDefaultCharset windows-1251 and php_value default_charset windows-1251.
But the result is the same: the unfortunate characters ������ ������...

2 answer(s)
Anton, 2019-02-20
@Tekcry

$content = iconv("windows-1251", "utf-8", $content); // change encoding

string iconv ( string in_charset, string out_charset, string str )
Converts the character encoding of the string str from the initial encoding of in_charset to the final encoding of out_charset. Returns the string in the new encoding, or FALSE on error.

1st parameter: the encoding to convert FROM
2nd parameter: the encoding to convert TO
The response is in windows-1251, but you are treating it as UTF-8 and converting it to windows-1251. In other words, you mixed up the source and target encodings.
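
To make the direction of the conversion concrete, here is a minimal sketch. It assumes, as this answer does, that the response really is windows-1251. The hard-coded $content is a hypothetical stand-in for the HTTP response so the snippet runs without network access.

```php
<?php
// Hypothetical sample standing in for the fetched page: the bytes
// "\xCF\xE0\xE2\xE5\xEB" spell "Павел" in windows-1251.
$content = "<foaf:name>\xCF\xE0\xE2\xE5\xEB</foaf:name>";

// Assumption: the source is windows-1251. Convert FROM it TO UTF-8.
$utf8 = iconv('windows-1251', 'UTF-8', $content);
if ($utf8 === false) {
    // iconv returns false on error, so compare strictly.
    throw new RuntimeException('iconv could not convert the input');
}

// Extract the name from the converted text.
$name = null;
if (preg_match('#<foaf:name>(.+?)</foaf:name>#is', $utf8, $m) === 1) {
    $name = $m[1];
}

echo $name, PHP_EOL;
```

The page that prints $name then has to be served as UTF-8 as well, for example with header('Content-Type: text/html; charset=utf-8'), otherwise the browser will misread the converted bytes.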

Lander, 2019-02-20
@usdglander

preg_match_all('#<foaf:name>(.+?)</foaf:name>#isu', $content, $arr);

Or, if you go by positions:
$pos = mb_strpos($content, '<foaf:name>');
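
One caveat worth checking: the u modifier and the mb_* functions expect the text to be valid UTF-8 already. The sketch below assumes the response is windows-1251 (an assumption, not something confirmed here) and that the mbstring extension is installed. $raw is a hypothetical stand-in for the fetched page.

```php
<?php
// Hypothetical windows-1251 sample: the bytes spell "Павел".
$raw = "<foaf:name>\xCF\xE0\xE2\xE5\xEB</foaf:name>";

// With the u modifier, PCRE rejects a subject that is not valid UTF-8:
// preg_match returns false and preg_last_error() reports the reason.
$result  = preg_match('#<foaf:name>(.+?)</foaf:name>#isu', $raw, $m);
$badUtf8 = ($result === false && preg_last_error() === PREG_BAD_UTF8_ERROR);

// After converting to UTF-8 the same pattern matches.
$utf8 = iconv('windows-1251', 'UTF-8', $raw);
preg_match('#<foaf:name>(.+?)</foaf:name>#isu', $utf8, $m);
$viaRegex = $m[1];

// mb_strpos returns character offsets, so pair it with mb_strlen and
// mb_substr rather than the byte-based strlen and substr.
$open  = '<foaf:name>';
$close = '</foaf:name>';
$viaPositions = null;
$start = mb_strpos($utf8, $open, 0, 'UTF-8');
if ($start !== false) {
    $start += mb_strlen($open, 'UTF-8');
    $end = mb_strpos($utf8, $close, $start, 'UTF-8');
    if ($end !== false) {
        $viaPositions = mb_substr($utf8, $start, $end - $start, 'UTF-8');
    }
}

echo $viaRegex, PHP_EOL, $viaPositions, PHP_EOL;
```

Both approaches yield the same string once the input is UTF-8. On the raw windows-1251 bytes the u pattern fails outright instead of returning garbled text.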
