dowbwrr3590709 2012-10-17 20:52

Page not converted to XML format

I am grabbing a page and then converting it into XML. The function I'm using is below:

public function getXML($url){
   $ch = curl_init();
   //curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
   //curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
   curl_setopt($ch, CURLOPT_URL,$url);
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
   $response = curl_exec($ch);      
   $xml = simplexml_load_string($response);
   return $xml;
}

print_r($curl->getXML("http://www.amazon.co.uk/gp/offer-listing/0292783760/ref=tmm_pap_new_olp_sr?ie=UTF8&condition=used"));

I have tried different URLs and nothing is returned. The page itself loads fine, so the problem seems to be the line $xml = simplexml_load_string($response);

What could be wrong with this code?


1 answer

  • dtr87341 2012-10-17 21:11

    I don't understand exactly what you're up to, but it looks like you're trying to scrape the Amazon web page. If I pull up that URL in my browser, it isn't declared as XHTML in the headers or in the document itself, so I suspect it's plain HTML. I don't think SimpleXML can handle that, since simplexml_load_string() expects well-formed XML.
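
    To see why the call fails, here is a minimal sketch that feeds non-well-formed markup to simplexml_load_string() and prints what libxml complains about. The $html string is a made-up sample, not the actual Amazon response, so the exact messages for the real page will differ.

    ```php
    <?php
    // Collect libxml errors internally so they can be inspected afterwards.
    libxml_use_internal_errors(true);

    // Made-up sample: ordinary HTML with unclosed <br> and <p> tags (not valid XML).
    $html = '<html><body><p>First offer<br><p>Second offer</body></html>';

    $xml = simplexml_load_string($html);

    if ($xml === false) {
        // Each entry is a LibXMLError object with line and message properties.
        foreach (libxml_get_errors() as $error) {
            echo 'line ', $error->line, ': ', trim($error->message), PHP_EOL;
        }
        libxml_clear_errors();
    }
    ```

    If $xml is false here, print_r() has nothing to show, which would match the "nothing is returned" symptom.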

    (Does cURL do the conversion to XML for you? I don't think so, but I'm not a master of all things cURL. If it does, there might be an incompatibility between cURL's output and what SimpleXML, which is fairly limited, will accept.)

    You might try working with DOMDocument instead, although my PHP could be a bit out of date and there may be better utilities these days.

    A quick googling brought up this tutorial:

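    <?php
      // $html is assumed to hold the page source, e.g. the cURL response
      libxml_use_internal_errors(true); // keep warnings about invalid markup quiet
      $doc = new DOMDocument();
      $doc->strictErrorChecking = FALSE;
      $doc->loadHTML($html); // the HTML parser tolerates markup that is not well-formed
      $xml = simplexml_import_dom($doc);
    ?>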
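
    Putting that together, a rough sketch of a reusable helper might look like the following. The function name htmlToSimpleXml and the sample markup are my own placeholders, not something from the question or the tutorial; in the question's getXML() you would pass in the cURL response instead of the sample string.

    ```php
    <?php
    // Hypothetical helper (name is an assumption): parse possibly malformed
    // HTML with DOMDocument, then hand the tree to SimpleXML.
    function htmlToSimpleXml(string $html): ?SimpleXMLElement
    {
        if (trim($html) === '') {
            return null; // loadHTML() rejects an empty string
        }

        // Collect parse warnings internally instead of emitting them.
        $previous = libxml_use_internal_errors(true);

        $doc = new DOMDocument();
        $loaded = $doc->loadHTML($html);

        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        if (!$loaded) {
            return null;
        }

        $xml = simplexml_import_dom($doc);
        return $xml instanceof SimpleXMLElement ? $xml : null;
    }

    // Made-up sample with unclosed <p> tags, which simplexml_load_string() would reject.
    $sample = '<html><body><p>First offer<p>Second offer</body></html>';
    $xml = htmlToSimpleXml($sample);

    if ($xml !== null) {
        foreach ($xml->xpath('//p') as $p) {
            echo trim((string) $p), PHP_EOL;
        }
    }
    ```

    Returning null on failure keeps the caller's check simple; whether that or an exception is the better fit depends on how the rest of the class handles errors.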
    

    I don't think this is a complete answer, but it was a bit much for a comment; so take it with a grain of salt and a healthy serving of doubt. I hope it inspires some ideas.
