douzhantao2857 2017-04-26 13:02
浏览 165

页面卷曲与俄语的

I m getting encoding problem when doing curl using php of this page that is in russian language https://web.archive.org/web/20060403041216/http://inostranets.ru:80/

Here below the code that I m using :

$url="https://web.archive.org/web/20060403041216/http://inostranets.ru:80/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);         
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch,CURLOPT_USERAGENT,'waybackmachinedownloader');
$html = curl_exec($ch);

As result I m getting caracters similar to this: "ÂÍÅ ÊÎÍÊÓÐÅÍÖÈÈ – ÑÊÀÇÎ×ÍÛÉ ÑÈÍÃÀÏÓÐ Òóðîïåðàòîð «ÄÅλ ïðèãëàøàåò Âàñ ïîñåòèòü"

Please check image below

enter image description here

  • 写回答

2条回答 默认 最新

  • dsfovbm931034814 2017-04-26 13:20
    关注

    The page you're trying to parse is windows-1251 encoded. To tell the browser you're outputting windows-1251, you can use:

    header('Content-Type: text/html; charset=windows-1251'); ,

    i.e.:

    $url="https://web.archive.org/web/20060403041216/http://inostranets.ru:80/";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch,CURLOPT_USERAGENT,'waybackmachinedownloader');
    $html = curl_exec($ch);
    
    header('Content-Type: text/html; charset=windows-1251');
    print $html;
    

    Update:

    To save the $html to a file use:

    file_put_contents("curl_russian.html", $html);
    

    Note:

    When you open the html file, make sure you select Text Encoding to Cyrillic Windows on your browser.


    enter image description here

    评论

报告相同问题?

悬赏问题

  • ¥15 AT89C51控制8位八段数码管显示时钟。
  • ¥15 真我手机蓝牙传输进度消息被关闭了,怎么打开?(关键词-消息通知)
  • ¥15 下图接收小电路,谁知道原理
  • ¥15 装 pytorch 的时候出了好多问题,遇到这种情况怎么处理?
  • ¥20 IOS游览器某宝手机网页版自动立即购买JavaScript脚本
  • ¥15 手机接入宽带网线,如何释放宽带全部速度
  • ¥30 关于#r语言#的问题:如何对R语言中mfgarch包中构建的garch-midas模型进行样本内长期波动率预测和样本外长期波动率预测
  • ¥15 ETLCloud 处理json多层级问题
  • ¥15 matlab中使用gurobi时报错
  • ¥15 这个主板怎么能扩出一两个sata口