dsarttv037029 2017-09-08 15:51

XPath is not returning all of the content after an element

I'm retrieving a ul > li of ingredients from my site, then I'm using foreach to loop through each li.

Each <li></li> contains information in the following format: <strong>1-2 tablespoons</strong> <a href="link">coconut oil</a> (to taste). Not every item contains a hyperlink; it varies from item to item.

All I'm trying to do is break up the data so I can put them into an array like so:

array(
    0 => array(
        'amount' => '2 ounces',
        'ingredients' => 'pre-cooked chicken'
    ),
    1 => array(
        'amount' => '1-2 tablespoons',
        'ingredients' => 'coconut oil (to taste)'
    )
);

The HTML <a> link in the coconut oil part should be preserved.

Here is the code that I'm using.

// $string is an array with the li content
foreach ($string as $data) {
    $try = new \DOMDocument;
    $try->loadHTML($data);
    $find = new \DOMXPath($try);

    // from this point it's where I'm having problems
    $x = $find->query('//li');
    foreach ($x as $data) {
        // Pass true so print_r returns the dump instead of printing it and echoing "1".
        echo '<pre>', print_r($data, true), '</pre>';
    }
}

The print_r($data) returns the following DOMElement Objects (with other empty keys like parentNode, childNode, firstChild, previousSibling):

DOMElement Object (
    [tagName] => li
    [schemaTypeInfo] => 
    [nodeName] => li
    [nodeValue] => 2 ounces pre-cooked chicken
    [nodeType] => 1
    [attributes] => (object value omitted)
    [ownerDocument] => (object value omitted)
    [localName] => li
    [textContent] => 2 ounces pre-cooked chicken
)
DOMElement Object (
    [tagName] => li
    [schemaTypeInfo] => 
    [nodeName] => li
    [nodeValue] => 1-2 tablespoons coconut oil (to taste)
    [nodeType] => 1
    [attributes] => (object value omitted)
    [ownerDocument] => (object value omitted)
    [localName] => li
    [textContent] => 1-2 tablespoons coconut oil (to taste)
)

I thought it would be best to break up the information: one query gets all of the data inside the strong tag. The issue I'm having is getting all of the content after the strong tag.

Here I try to get all of the content after the strong tag:

$list = $find->query('//strong/following-sibling::text()');
foreach($list as $data){
    $i[] = $try->saveHTML($data);
}

If I print_r($i) I get the following:

Array
(
    [0] =>  pre-cooked chicken
    [1] =>  
    [2] =>  (to taste)
)

But if I change the query to $list = $find->query('//strong/following-sibling::*'), all I get is the following, which is just the hyperlink:

Array
(
    [0] => coconut oil
)
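
For reference, the two queries above differ only in the node test: text() matches text nodes only, and * matches element nodes only, so each one returns half of the content. The XPath node test node() matches both. The following is a minimal sketch of that difference, not a tested fix for the site's real markup; the sample string and the wrapping <div> are assumptions added so the fragment has a single parent element.

```php
<?php
// Assumed sample: one <li> body from the question, wrapped in a <div>
// so the fragment has a single container element.
$html = '<div><strong>1-2 tablespoons</strong> <a href="/link">coconut oil</a> (to taste)</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// node() matches every node type, so text nodes and elements
// come back together, in document order.
$parts = [];
foreach ($xpath->query('//strong/following-sibling::node()') as $node) {
    // saveHTML() with a node argument serializes just that node.
    $parts[] = $doc->saveHTML($node);
}

// Joining the pieces rebuilds everything after </strong>, link included.
$ingredients = trim(implode('', $parts));
echo $ingredients, PHP_EOL;
```

Concatenating the serialized nodes keeps the <a> markup, which a textContent-based approach would drop.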

Update:

Input array:

Array (
    [0] => <strong>2 ounces</strong> pre-cooked chicken
    [1] => <strong>1-2 tablespoons</strong> <a href="/link">coconut oil</a> (to taste)
) 

And

Expected output:

array(
    0 => array(
        'amount' => '2 ounces',
        'ingredients' => 'pre-cooked chicken'
    ),
    1 => array(
        'amount' => '1-2 tablespoons',
        'ingredients' => '<a href="/link">coconut oil</a> (to taste)'
    )
);

1 answer

  • doufei1893 2017-09-08 17:09

    Are you expecting something like this? I hope it helps. Here we are using preg_match to capture the text inside <strong> as the amount and everything after </strong> as the ingredients.

    Try this code snippet:

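    <?php
    ini_set('display_errors', 1);
    $result = array();
    // Sample input: the inner HTML of each <li>.
    $array = array(
        0 => '<strong>2 ounces</strong> pre-cooked chicken',
        1 => '<strong>1-2 tablespoons</strong> <a href="/link">coconut oil</a> (to taste)'
    );
    foreach ($array as $data)
    {
        // Group 1: text inside <strong>. Group 2: everything after </strong>, markup included.
        // Items without a <strong> tag are skipped instead of raising undefined-index notices.
        if (preg_match('/<strong>(.*?)<\/strong>(.*)/', $data, $matches)) {
            $result[] = array(
                'amount'      => trim($matches[1]),
                'ingredients' => trim($matches[2]) // trim drops the space left after </strong>
            );
        }
    }
    print_r($result);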
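
If a regex feels too brittle for the real markup, the same split can probably be done with DOM and XPath, which is what the question originally attempted. The following is a sketch under assumptions rather than a drop-in replacement: splitIngredient is a hypothetical helper name, each input string is assumed to be the inner HTML of one <li>, and the wrapping <div> is added only to give the fragment a single parent.

```php
<?php
// Hypothetical helper: splits one <li> body into amount and ingredients.
function splitIngredient(string $liHtml): array
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML('<div>' . $liHtml . '</div>');
    $xpath = new DOMXPath($doc);

    $strong = $xpath->query('//strong')->item(0);
    if ($strong === null) {
        // No <strong> tag: treat the whole line as the ingredient.
        return ['amount' => '', 'ingredients' => trim($liHtml)];
    }

    // node() selects text and element siblings alike, so the <a> markup survives.
    $rest = '';
    foreach ($xpath->query('following-sibling::node()', $strong) as $node) {
        $rest .= $doc->saveHTML($node);
    }

    return [
        'amount'      => trim($strong->textContent),
        'ingredients' => trim($rest),
    ];
}

// Usage with the input array from the question's update.
$input = [
    '<strong>2 ounces</strong> pre-cooked chicken',
    '<strong>1-2 tablespoons</strong> <a href="/link">coconut oil</a> (to taste)',
];
$result = array_map('splitIngredient', $input);
print_r($result);
```

The trade-off is the usual one: the regex is shorter, while the DOM version tolerates attributes on <strong> and nested markup at the cost of a parse per item.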
    
    This answer was accepted by the asker as the best answer.
