dongyao2129 2018-06-09 09:25

Storing the attributes and inner HTML of sibling elements in PHP

I'm trying to search an HTML page and store values from it in a simple array of arrays. It will only hold 2 arrays, each 3 items long. I define it like so; these are just the headers:

$fileContents = array(
    array('Date', 'Title', 'Link')
);

The html has the following structure:

<li class='my-list'>
    <div class='my-meta'>
        <span class='my-date'>06/08/2018</span>
    </div>
    <a href='https://www.example.com/'>My Title </a>
</li>

This structure repeats a few times. I only need the first one from the top (the latest one). All the information I need for my array is there: Date is 06/08/2018, Title is My Title, and Link is https://www.example.com/. But I don't know how to access them, particularly the Title and Link, because there are no classes on those elements. To clarify further, this is the end result I want (it's a CSV):

Date, Title, Link
06/08/2018, My Title, https://www.example.com/

I am using the following approach at the moment. The only one I know how to get is the Date:

$dateClassName="my-date";

$xpath = new DomXpath($doc);
$dateList = $xpath->query("//span[contains(@class, '$dateClassName')]");
$dateNode = $dateList->item(0);

function innerHTML($node) {
    return implode(array_map([$node->ownerDocument, "saveHTML"],
            iterator_to_array($node->childNodes)));
}

$textArray = array();
array_push($textArray, innerHTML($dateNode));

I'm not sure how to store the remaining items (Link and Title), because their elements have no classes.

Question: Given my existing approach above, what more can I do to store the values I need from the HTML if the elements in question do not have an overt class to search by? Can I somehow get them by virtue of their relative sibling positions?

1 answer

  • douliao8318 2018-06-09 09:37

    Here's a simple snippet that gets everything you need:

    $s = "<ul>
        <li class='my-list'>
            <div class='my-meta'>
                <span class='my-date'>06/08/2018</span>
            </div>
            <a href='https://www.example.com/'>My Title </a>
        </li>
        <li class='my-list'>
            <div class='my-meta'>
                <span class='my-date'>06/08/2017</span>
            </div>
            <a href='https://www.example.com/2'>My Title2 </a>
        </li>
    </ul>";
    
    $doc = new DOMDocument();
    $doc->loadHTML($s);
    $xpath = new DOMXPath($doc);

    // Take the first <li> in document order (the latest entry).
    $li = $xpath->query("//li")->item(0);

    // The <a> inside the <li> carries both the link (href) and the title (text).
    $a = $li->getElementsByTagName('a')->item(0);

    var_dump($a->getAttribute('href'));  // link
    var_dump($li->getElementsByTagName('div')->item(0)->getElementsByTagName('span')->item(0)->textContent);  // date
    var_dump($a->textContent);  // title
    

    As you can see, you can keep working with $li because it is an object of type DOMElement.
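
Since the question asks specifically about relative sibling positions, here is a minimal sketch of the same extraction done with XPath axes instead of getElementsByTagName. It assumes the markup shown above and PHP 7.4 or later. The names $html, $linkNode and $row are illustrative and not part of the original code, and the CSV is simply written to standard output.

```php
<?php
// Sample markup mirroring the structure in the question (assumed input).
$html = "<ul>
    <li class='my-list'>
        <div class='my-meta'><span class='my-date'>06/08/2018</span></div>
        <a href='https://www.example.com/'>My Title </a>
    </li>
    <li class='my-list'>
        <div class='my-meta'><span class='my-date'>06/08/2017</span></div>
        <a href='https://www.example.com/2'>My Title2 </a>
    </li>
</ul>";

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// First date span in document order, i.e. the latest entry.
$dateNode = $xpath->query("//span[contains(@class, 'my-date')]")->item(0);

// Relative to that span: step up to the wrapping <div>, then take
// the first <a> that follows it as a sibling.
$linkNode = $xpath->query("parent::div/following-sibling::a[1]", $dateNode)->item(0);

// trim() drops the trailing space in the link text.
$row = [
    trim($dateNode->textContent),
    trim($linkNode->textContent),
    $linkNode->getAttribute('href'),
];

$fileContents = [
    ['Date', 'Title', 'Link'],
    $row,
];

// Emit the rows as CSV; delimiter, enclosure and escape are passed
// explicitly (an empty escape string requires PHP 7.4+).
$out = fopen('php://output', 'w');
foreach ($fileContents as $line) {
    fputcsv($out, $line, ',', '"', '');
}
fclose($out);
```

The second argument to query() is the context node, so the relative expression is evaluated starting from the date span rather than from the document root. Note that fputcsv quotes fields containing spaces, so the title will appear enclosed in double quotes in the output.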

    This answer was selected by the asker as the best answer.