duananyu9231 2012-12-11 15:11

How do I remove whitespace from an XPath node after evaluating it?

Sorry if my earlier question was unclear. Here is a proper explanation.

I am trying to scrape data from a web page, for example: http://www.koolkart.com/nokia-lumia-800-p8251

I want the price of the device listed in the table next to "saholic". It is "Rs. 17759/-", and the XPath to it is "//*[@id="price-chart"]/tbody/tr[1]/td[3]"

When I run the XPath in Chrome, it returns the price with surrounding whitespace:

"

                                Rs. 17759/-
                                "

When I run it in PHP:

<?php
// XPath copied from Chrome's developer tools
$xpath = '//*[@id="price-chart"]/tbody/tr[1]/td[3]';
$html = new DOMDocument();
// @ suppresses the warnings libxml raises on malformed HTML
@$html->loadHTMLFile('http://www.koolkart.com/nokia-lumia-800-p8251');
$xml = simplexml_import_dom($html);
if (!$xml) {
    echo 'Error while parsing the document';
    exit;
}
echo $xml;
$source_image = $xml->xpath($xpath);
print_r($source_image);
?>

it gives an error!

So, is there a way to remove the surrounding whitespace, or some other way to get that price?


1 answer

  • douluan8828 2012-12-11 16:09

    I will assume you want the price then.

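    You don't get the same results in the browser and in your script because the tbody element is added by your browser when it builds the DOM. It isn't part of the page source, so the document PHP parses has no tbody and the expression matches nothing.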

    You can use an expression like this:

    '//*[@id="price-chart"]/tr[1]/td[3]'
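To make the difference concrete, here is a minimal sketch of the tbody-free expression, with the whitespace trimmed as the question asked. It is only an illustration: the inline `$markup` string is a hypothetical stand-in for the real page (whose markup may differ), and it uses `DOMXPath` rather than the SimpleXML conversion from the question.

```php
<?php
// Hypothetical stand-in for the page: a price table with no <tbody> in the
// source, and the same kind of padding around the price as in the question.
$markup = <<<HTML
<html><body>
<table id="price-chart">
  <tr>
    <td>1</td>
    <td>saholic</td>
    <td>
                                Rs. 17759/-
                                </td>
  </tr>
</table>
</body></html>
HTML;

libxml_use_internal_errors(true);   // collect parser warnings instead of using @
$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The browser-style expression finds nothing: there is no tbody in the source.
$withTbody = $xpath->query('//*[@id="price-chart"]/tbody/tr[1]/td[3]');

// The same path without the tbody step matches the cell.
$cells = $xpath->query('//*[@id="price-chart"]/tr[1]/td[3]');

$price = null;
if ($cells !== false && $cells->length > 0) {
    // trim() strips the leading and trailing whitespace around the text.
    $price = trim($cells->item(0)->textContent);
}

// Alternative: let XPath do the trimming with normalize-space().
$priceViaXPath = $xpath->evaluate(
    'normalize-space(//*[@id="price-chart"]/tr[1]/td[3])'
);

echo 'Matches with tbody: ', $withTbody->length, PHP_EOL;
echo $price, PHP_EOL;
echo $priceViaXPath, PHP_EOL;
```

Either `trim()` on the node's text or `normalize-space()` inside the expression should give the bare price string. `normalize-space()` also collapses runs of whitespace inside the text, which may or may not be what you want for other cells.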
    
    This answer was accepted by the asker as the best answer.
