PHP DomXPath - get child elements by class

So far, my code gets every element with the class 'forumRow' using an XPath query. How do I get the href attribute of the a element that appears exactly once inside each 'forumRow' element?

I'm stuck at the point where I need to run a second query starting from the results of the first one.

My current code:

$this->boards = array();
$html = @file_get_contents('http://www.roblox.com/Forum/Default.aspx');

libxml_use_internal_errors(true);
$page = new DOMDocument();
$page->preserveWhiteSpace = false;
$page->loadHTML($html);

$xpath = new DomXPath($page);
$board_array = $xpath->query('//*[@class="forumRow"]');

foreach ($board_array as $board)
{
    $childNodes = $board->childNodes;
    $boardName = $childNodes->item(0)->nodeValue;

    if (strlen($boardName) > 0)
    {
        $boardDesc = $childNodes->item(1)->nodeValue;
        array_push($this->boards, array($boardName, $boardDesc));
    }
}
$Cache->saveData(json_encode($this->boards));

2 answers



Unfortunately, I could not get your code to work (specifically the extraction of the forumRow <td> elements), so I wrote this instead:

$html = @file_get_contents('http://www.roblox.com/Forum/Default.aspx');
libxml_use_internal_errors(true);
$page = new DOMDocument();
$page->preserveWhiteSpace = false;
$page->loadHTML($html);
$xpath = new DomXPath($page);

foreach ($xpath->query('//td[@class="forumRow"]') as $element) {
    $links = $element->getElementsByTagName('a');
    foreach ($links as $a) {
        echo $a->getAttribute('href') . '<br>';
    }
}

This produces:

/Forum/Search/default.aspx
/Forum/ShowForum.aspx?ForumID=46
/Forum/ShowForum.aspx?ForumID=14
/Forum/ShowForum.aspx?ForumID=44
/Forum/ShowForum.aspx?ForumID=43
/Forum/ShowForum.aspx?ForumID=45
/Forum/ShowForum.aspx?ForumID=21
/Forum/ShowForum.aspx?ForumID=13
...
(a very long list)

That is, every href found inside <td class="forumRow">..<a href= ... ></a>..</td>
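
To address the "query starting from the result of the first query" part directly: DOMXPath::query() accepts a context node as its second argument, and an expression that begins with a dot is evaluated relative to that node. Below is a minimal sketch of that approach. The inline markup is a hypothetical stand-in for the forum page (the real page may be structured differently), and the $hrefs variable is only there for illustration.

```php
<?php
// Hypothetical stand-in markup; the real forum page may be structured differently.
$html = <<<HTML
<table>
  <tr>
    <td class="forumRow"><a href="/Forum/ShowForum.aspx?ForumID=46">Board A</a><span>First board</span></td>
    <td class="forumRow"><a href="/Forum/ShowForum.aspx?ForumID=14">Board B</a><span>Second board</span></td>
  </tr>
</table>
HTML;

libxml_use_internal_errors(true);
$page = new DOMDocument();
$page->loadHTML($html);
$xpath = new DOMXPath($page);

$hrefs = [];
foreach ($xpath->query('//td[@class="forumRow"]') as $row) {
    // The leading "." makes the expression relative to $row,
    // which is passed as the context node in the second argument.
    $link = $xpath->query('.//a[@href]', $row)->item(0);
    if ($link !== null) {
        $hrefs[] = $link->getAttribute('href');
    }
}

print_r($hrefs);
```

If only the links are needed and nothing else from each row, a single expression such as //td[@class="forumRow"]//a/@href should return the attribute nodes in one pass. The per-row query is more convenient when the name and description are read from the same row.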



There is a return right in the middle of your function, so the array is never filled and saveData(...) is never called. Just remove that line and your code should work. ;)

$childNodes = $board->childNodes;
return; // <-- remove this line
$boardName = $childNodes->item(0)->nodeValue;
