dsarttv037029 2017-09-08 07:51

XPath does not return all content after an element

I'm retrieving a ul > li of ingredients from my site, then I'm using foreach to loop through each li.

Each <li></li> contains information in the following format: <strong>1-2 tablespoons</strong> <a href="link">coconut oil</a> (to taste). Not every item contains a hyperlink; it varies from item to item.

All I'm trying to do is break up the data so I can put them into an array like so:

array(
    0 => array(
        'amount' => '2 ounces',
        'ingredients' => 'pre-cooked chicken'
    ),
    1 => array(
        'amount' => '1-2 tablespoons',
        'ingredients' => 'coconut oil (to taste)'
    )
);

The HTML a link around "coconut oil" should be preserved.

Here is the code that I'm using.

// $string is an array with the li content
foreach ($string as $data) {
    $try = new \DOMDocument;
    $try->loadHTML($data);
    $find = new \DOMXPath($try);

    // from this point it's where I'm having problems
    $x = $find->query('//li');
    foreach ($x as $node) {
        // pass true so print_r() returns the dump instead of printing it and returning 1
        echo '<pre>', print_r($node, true), '</pre>';
    }
}

print_r() prints the following DOMElement objects (along with other empty keys such as parentNode, childNodes, firstChild, previousSibling):

DOMElement Object (
    [tagName] => li
    [schemaTypeInfo] => 
    [nodeName] => li
    [nodeValue] => 2 ounces pre-cooked chicken
    [nodeType] => 1
    [attributes] => (object value omitted)
    [ownerDocument] => (object value omitted)
    [localName] => li
    [textContent] => 2 ounces pre-cooked chicken
)
DOMElement Object (
    [tagName] => li
    [schemaTypeInfo] => 
    [nodeName] => li
    [nodeValue] => 1-2 tablespoons coconut oil (to taste)
    [nodeType] => 1
    [attributes] => (object value omitted)
    [ownerDocument] => (object value omitted)
    [localName] => li
    [textContent] => 1-2 tablespoons coconut oil (to taste)
)
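
The dump above exposes only nodeValue and textContent, and both are plain text, which is why the a tag does not appear in it. As a minimal sketch (the wrapping <ul> and the variable names are assumptions, not part of the original code), serializing the node with saveHTML() keeps the markup:

```php
<?php
// Sample item taken from the question; the surrounding <ul> is assumed.
$html = '<ul><li><strong>1-2 tablespoons</strong> <a href="/link">coconut oil</a> (to taste)</li></ul>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$li = $xpath->query('//li')->item(0);

// textContent concatenates the text of all descendants, so tags are lost.
$plain = $li->textContent;

// saveHTML($node) serializes the node itself, tags included.
$markup = $doc->saveHTML($li);

echo $plain, PHP_EOL;
echo $markup, PHP_EOL;
```

Here $plain holds the text without tags, while $markup still contains the a element.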

I thought it would be best to break up the information. With one query I can get all of the data inside the strong tag, but what I can't manage is getting all of the content after the strong tag.

Here I try to get all of the content after the strong tag:

$list = $find->query('//strong/following-sibling::text()');
foreach($list as $data){
    $i[] = $try->saveHTML($data);
}

If I print_r($i) I get the following:

Array
(
    [0] =>  pre-cooked chicken
    [1] =>  
    [2] =>  (to taste)
)

But if I change the query to $list = $find->query('//strong/following-sibling::*'), all I get is the following, which is the hyperlink:

Array
(
    [0] => coconut oil
)
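
The two results differ because of the XPath node test: text() selects only text nodes and * selects only elements. The node test node() selects both, in document order. The following is a rough sketch of that idea, not the accepted approach; wrapping each item in <ul><li> and the names $items, $rest and $result are assumptions made for the illustration:

```php
<?php
// Input items as listed in the question's update.
$items = [
    '<strong>2 ounces</strong> pre-cooked chicken',
    '<strong>1-2 tablespoons</strong> <a href="/link">coconut oil</a> (to taste)',
];

$result = [];
foreach ($items as $item) {
    $doc = new DOMDocument();
    // Wrap the fragment so it always has a single known parent element.
    $doc->loadHTML('<ul><li>' . $item . '</li></ul>');
    $xpath = new DOMXPath($doc);

    $strong = $xpath->query('//li/strong')->item(0);
    if ($strong === null) {
        continue; // no amount found; skip this item
    }

    // node() matches text nodes and elements alike, relative to $strong.
    $rest = '';
    foreach ($xpath->query('following-sibling::node()', $strong) as $node) {
        $rest .= $doc->saveHTML($node);
    }

    $result[] = [
        'amount'      => trim($strong->textContent),
        'ingredients' => trim($rest),
    ];
}

print_r($result);
```

Concatenating the serialized siblings keeps the link markup and the surrounding text together, so no second query is needed. Input containing non-ASCII characters may need an explicit encoding hint before loadHTML(), which this sketch does not handle.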

Update:

Input array:

Array (
    [0] => <strong>2 ounces</strong> pre-cooked chicken
    [1] => <strong>1-2 tablespoons</strong> <a href="/link">coconut oil</a> (to taste)
) 

Expected output:

array(
    0 => array(
        'amount' => '2 ounces',
        'ingredients' => 'pre-cooked chicken'
    ),
    1 => array(
        'amount' => '1-2 tablespoons',
        'ingredients' => '<a href="/link">coconut oil</a> (to taste)'
    )
);

1 answer

  • doufei1893 2017-09-08 09:09

    Is this the result you are expecting? This uses preg_match to capture the text inside the <strong> tag as the amount and everything after the closing tag as the ingredients:

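    <?php
    ini_set('display_errors', 1);
    $result = array();
    $array = array(
        0 => "<strong>2 ounces</strong> pre-cooked chicken",
        1 => '<strong>1-2 tablespoons</strong> <a href="/link">coconut oil</a> (to taste)'
    );
    foreach ($array as $data) {
        // Group 1: text inside <strong>; group 2: everything after </strong>.
        // Items without a <strong> tag are skipped instead of raising notices.
        if (preg_match('/<strong>(.*?)<\/strong>(.*)/s', $data, $matches)) {
            $result[] = array(
                "amount"      => trim($matches[1]),
                "ingredients" => trim($matches[2]) // trim drops the space left after </strong>
            );
        }
    }
    print_r($result);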
    