dongye1912 2014-07-15 00:43

Getting option contents using PHP DOMXPath

I'm trying to get the product sizes from all option elements of a particular select tag that has an id:

<select id="prodSize" name="prodSize">
    <option value="9274">10D</option>
    <option value="9275">10DD</option>
    <option value="9276">10E</option>
    <option value="9277">10F</option>
    <option value="9279">10G</option>
    <option value="9288">12D</option>
    <option value="9289">12DD</option>
    <option value="9290">12E</option>
    <option value="9291">12F</option>
    <option value="9301">14D</option>
    <option value="9302">14DD</option>
    <option value="9303">14E</option>
    <option value="9304">14F</option>
    <option value="9305">14FF</option>
    <option value="9315">16D</option>
    <option value="9317">16E</option>
    <option value="9318">16F</option>
    <option value="9319">16FF</option>
    <option value="9320">16G</option>
</select>

I tried $x("//select[@id='prodSize']/option/text()") in Chrome DevTools and it returned all the values without any problem, but when I try to get them with DOMXPath:

$options = $xpath->query("//select[@id='prodSize']/option/text()");

or:

$options = $xpath->query("*/select[@id='prodSize']/option");

I get:

object(DOMNodeList)#40 (1) { ["length"]=> int(0) } object(DOMNodeList)#29 (1) { ["length"]=> int(0) }
object(DOMNodeList)#39 (1) { ["length"]=> int(0) } object(DOMNodeList)#41 (1) { ["length"]=> int(0) }
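
To separate a problem with the expression from a problem with the fetched page, here is a minimal sketch that runs both queries against a static copy of the markup. The shortened option list and the surrounding html/body wrapper are assumptions made for illustration; the real page may be structured differently.

```php
<?php
// Static, shortened copy of the markup from the question (assumed wrapper tags).
$html = <<<HTML
<html><body>
<select id="prodSize" name="prodSize">
    <option value="9274">10D</option>
    <option value="9275">10DD</option>
    <option value="9276">10E</option>
</select>
</body></html>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Absolute query: matches the select anywhere in the document.
$sizes = [];
foreach ($xpath->query("//select[@id='prodSize']/option") as $option) {
    // Map the option's value attribute to its visible label.
    $sizes[$option->getAttribute('value')] = trim($option->textContent);
}
echo implode(', ', $sizes), "\n";   // prints: 10D, 10DD, 10E

// Relative query from the question: "*" only matches the root <html> element,
// so "*/select" looks for a select that is a direct child of <html>.
$relative = $xpath->query("*/select[@id='prodSize']/option");
echo $relative->length, "\n";       // prints: 0
```

If the absolute query works on static markup like this but still returns nothing on the live page, the HTML that cURL receives probably differs from what Chrome shows; the relative query is expected to return nothing whenever the select is nested below the body.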

I have added the full code for clarity:

scrapCatUrl('http://.../shop-management/categories/maternity-lingerie.aspx',  "//ul[@class='lvl2 visible']/li/a/@href");

function scrapCatUrl($path, $query){

    $xpath = scrap($path);
    $links = $xpath->query($query);
    foreach($links as $link){
        echo 'Category'.' - '.$url.$link->nodeValue . '<br>';
        scrapProdUrl($url.$link->nodeValue);
    }
}
function scrapProdUrl($path){

    $xpath = scrap($path);
    $links = $xpath->query("//a[@class='thumbObj']/@href");
    $i = 0;
    foreach($links as $link){
        echo 'Product'.' - '.$url.$link->nodeValue . '<br>';
        getProdData($url.$link->nodeValue);
        if($i > 2){
            die();
        }
        $i++;
    }
}
function getProdData($path){
    $xpath = scrap($path);
    $description = $xpath->query("//meta[@name='description']/@content");
    $keywords = $xpath->query("//meta[@name='keywords']/@content");
    $title = $xpath->query("//h4[@class='h4-productdetail']/text()");
    $price = $xpath->query("//div[@class='productDetail']/span[@class='price']/text()");
    $images = $xpath->query("//div[@class='imgs']/img/@src");
    $fullDescription = $xpath->query("//div[@class='flash']/following-sibling::div[@class='clearer']/preceding-sibling::text()[preceding-sibling::div[@class='flash']]");
    $options = $xpath->query("//select[@id='prodSize']/option/text()");

    echo 'Meta Description'.' - '.$description->item(0)->nodeValue. '<br>';
    echo 'Meta Keywords'.' - '.$keywords->item(0)->nodeValue. '<br>';
    echo 'Title'.' - '.$title->item(0)->nodeValue. '<br>';
    echo 'Price'.' - '.$price->item(0)->nodeValue. '<br>';
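    if($images->length > 1){
        foreach($images as $image){
            echo '<img src="'.$url.$image->nodeValue.'" />'. '<br>';
        }
    }
    elseif($images->length === 1){
        // $image only exists inside the loop above, so read the single node directly.
        echo '<img src="'.$url.$images->item(0)->nodeValue.'" />'. '<br>';
    }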
    foreach($options as $option){
        echo $option->nodeValue;
    }
}
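// Fetch a page with cURL and return a DOMXPath object for querying it.
function scrap($path){
    $ch = curl_init($path);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $page = curl_exec($ch);
    curl_close($ch);
    // Stop early if the request failed or returned an empty body.
    if($page === false || $page === ''){
        die('Could not fetch ' . $path);
    }
    $dom = new DOMDocument();
    // Collect HTML parse warnings internally instead of silencing them with @.
    libxml_use_internal_errors(true);
    $dom->loadHTML($page);
    libxml_clear_errors();
    return new DOMXPath($dom);
}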

I have tried a few approaches that people suggested here, but I keep getting the same result. I have no problem getting any other element from the page (title, images, descriptions); everything works except this one.
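
Since the same expression works in Chrome, one possible explanation (an assumption, not something confirmed in this post) is that the HTML received by cURL does not contain the options, for example because they are filled in by JavaScript or because the server sends different markup to non-browser clients. A rough sketch for checking this follows; the helper name inspectSelect and the sample responses are hypothetical and only illustrate the idea.

```php
<?php
// Hypothetical helper (not from the original post): reports whether the HTML
// that PHP received contains the select element and how many options it holds.
function inspectSelect(string $html, string $id): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);  // keep parse warnings quiet
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);         // restore the earlier setting

    $xpath = new DOMXPath($dom);
    $selects = $xpath->query(sprintf("//select[@id='%s']", $id));
    $options = $xpath->query(sprintf("//select[@id='%s']/option", $id));

    return [
        'selectFound' => $selects->length > 0,
        'optionCount' => $options->length,
    ];
}

// Made-up responses standing in for what curl_exec() might return.
$withOptions = '<html><body><select id="prodSize"><option value="9274">10D</option></select></body></html>';
$emptySelect = '<html><body><select id="prodSize"></select></body></html>';
$noSelect    = '<html><body><p>Placeholder page</p></body></html>';

var_dump(inspectSelect($withOptions, 'prodSize'));
var_dump(inspectSelect($emptySelect, 'prodSize'));
var_dump(inspectSelect($noSelect, 'prodSize'));
```

Calling the helper on the string returned by curl_exec() inside scrap() would show which case applies: a select with zero options suggests the list is populated in the browser, while a missing select suggests the server returned a different page altogether.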
