douji0073 2011-02-11 08:30
110 views
Accepted

curl_exec and UTF-8?

Hey guys, a German weather website provides a weather widget for website owners. The widget handles German umlauts like äöü fine. However, it is badly designed, so I'm using cURL and XPath to query the information it provides. The widget is a set of tables and divs with inline styles, and I use XPath to get just the values inside the table's td cells.

Everything works fine except for German umlauts like äöü. My website uses UTF-8 encoding, so those umlauts should display correctly (and they do on the rest of the page). Even when I embed the weather widget normally on my website, it displays the umlauts correctly.

However, as soon as I use cURL to get the values inside the table, the umlauts break and are converted into weird characters.

<?php
$url = 'http://www.weatherxyz.com/hptool/wordpress_v1.php?cid=43Xv1a0&l=de';

$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HEADER, false);

$str = curl_exec($curl);

$dom = new DOMDocument;
$dom->loadHTML($str);
$xpath = new DOMXPath($dom);

$tds = $xpath->query('//div/table/tr/td');
foreach ($tds as $key => $cell) {
    echo $cell->textContent;
}
?>

Do you guys have any idea how I can make this work?

2 answers

  • douyi6818 2011-02-11 09:09

    Looks like you're not alone in griping about DOMDocument not understanding different encodings. The author of that post provides SmartDOMDocument to work around some of DOMDocument's poor encoding handling.
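
For illustration, here is a minimal sketch of one common workaround. It is not the SmartDOMDocument code, only an approximation of the same idea: `DOMDocument::loadHTML()` tends to misread UTF-8 input when the markup carries no charset declaration, so the sketch turns every non-ASCII character into a numeric entity before parsing. The helper name `extractCellTexts`, its `$sourceEncoding` parameter, and the inline sample markup are assumptions made for this example. The widget's real encoding is not stated in the question, and the `mbstring` extension is assumed to be available.

```php
<?php
// Hypothetical helper (not from the original post): parse HTML whose encoding
// is known and return the text of each table cell as UTF-8 strings.
function extractCellTexts(string $html, string $sourceEncoding = 'UTF-8'): array
{
    // Normalise to UTF-8 first if the source uses another encoding.
    if (strcasecmp($sourceEncoding, 'UTF-8') !== 0) {
        $html = mb_convert_encoding($html, 'UTF-8', $sourceEncoding);
    }

    // Replace every non-ASCII character with a numeric entity (e.g. &#246;)
    // so that loadHTML() never has to guess the encoding.
    $html = mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Same XPath expression as in the question.
    $xpath = new DOMXPath($dom);
    $texts = [];
    foreach ($xpath->query('//div/table/tr/td') as $cell) {
        $texts[] = $cell->textContent;
    }
    return $texts;
}

// Usage with an inline sample instead of the real widget URL.
$sample = '<div><table><tr><td>Köln</td><td>Münster</td></tr></table></div>';
echo implode(', ', extractCellTexts($sample)), "\n";
```

In the question's script, the string returned by `curl_exec($curl)` would take the place of `$sample`. If the widget turns out to be served as ISO-8859-1 rather than UTF-8, passing `'ISO-8859-1'` as the second argument should cover that case.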

    Selected by the asker as the best answer.
