For loop iterates only once (simplehtmldom)

I have a for loop that runs 3 times. Inside the loop, shell_exec() calls the phantomjs binary and returns its output, which is then passed into simplehtmldom's str_get_html().

Problem: When str_get_html($html) is called inside the for loop and $html consists of a full webpage's HTML, only the first iteration executes, not the 2nd or 3rd. However, if I use a few simple <a> tags for $html instead, the for loop iterates completely!

What is happening here, and how can I solve it?

Note that the only difference between the version that works and the version that loops only once is which of the two str_get_html() lines in extract_product_urls() is commented out: the one that parses the fetched page, or the one that parses a hard-coded <a> tag.

Parent function (The for loop here does not iterate completely)

public function action_asos() {


    // Site details
    $base_url = "http://www.mysite.com";

    // Category details
    $category_id = 7616;
    $per_page = 500;

    // Find number of pages in category
    $num_pages = 2;

    //THIS IS THE LOOP THAT CANNOT LOOP COMPLETELY!
    // Extract Product URLs from Category page
    for($i = 0; $i <= $num_pages; $i++) {
        echo "<h2>Page $i</h2>";
        $page = $i;
        $category_url = 'http://www.mysite.com/pgecategory.aspx?cid='.$category_id.'&parentID=-1&pge='.$page.'&pgeSize='.$per_page.'&sort=1';
        $this->extract_product_urls($category_url, $base_url);
    }
    echo "Yes.";
    flush();

}

PHP Code (causes loop in parent function to loop only once)

public function extract_product_urls($category_url, $base_url) {


    set_time_limit(300);
    include_once('/home/mysite/public_html/application/libraries/simple_html_dom.php');

    // Retrieve page HTML using PhantomJS
    $html = $this->get_html($category_url);

    // Extract links
    $html = str_get_html($html);
    //$html = str_get_html('<a class="productImageLink" href="asdasd"></a>');
    foreach($html->find('.productImageLink') as $match) {
        $product_url = $base_url . $match->href;
        $product_url = substr($product_url, 0, strpos($product_url, '&'));  // remove metadata in URL string
        $this->product_urls[] = $product_url;
    }

    echo "done.";
    flush();

}
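
One detail in the extraction loop above is worth checking separately from the looping problem: strpos() returns false when the URL contains no `&`, and substr() with a length of false then yields an empty string, so the hard-coded `href="asdasd"` test link would be stored as an empty URL. A minimal sketch of a guard follows; strip_url_metadata() is a hypothetical helper name, not part of the original code.

```php
<?php
// Hypothetical helper (not in the original code): cut a URL at the first '&',
// or return it unchanged when there is no '&' to cut at.
function strip_url_metadata($url) {
    $pos = strpos($url, '&');
    if ($pos === false) {
        return $url;            // nothing to strip; keep the full URL
    }
    return substr($url, 0, $pos);
}

echo strip_url_metadata('http://www.mysite.com/p.aspx?iid=1&cid=7616'), "\n";
echo strip_url_metadata('http://www.mysite.com/asdasd'), "\n";
```

Inside the foreach, the substr(...) line could then be replaced by a call to such a helper.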

Helper functions

/**
 * Gets the webpage's HTML (after AJAX content has loaded, using PhantomJS)
 * @param string $url URL of the page to fetch
 * @return string|null Output of the PhantomJS script, as returned by shell_exec()
 */
public function get_html($url) {

    $url = escapeshellarg($url);    // prevent truncating after characters like `&`
    $script = path('base')."application/phantomjs/httpget.js";
    $output = shell_exec("phantomjs $script $url");

    return $output;

}
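
Because shell_exec() returns null both when the command fails and when it prints nothing, it can help to check the output before handing it to the parser. A minimal sketch, assuming a Unix-like shell; run_and_capture() is a hypothetical function name, and the command is passed in as an already-built string so the example assumes nothing about the PhantomJS script.

```php
<?php
// Hypothetical helper: run a command and return its stdout, or null
// when the command produced no usable output.
function run_and_capture($command) {
    $output = shell_exec($command);
    if (!is_string($output) || trim($output) === '') {
        return null;
    }
    return $output;
}

// Stand-in command; in the question this would be the phantomjs call.
$html = run_and_capture('echo ' . escapeshellarg('<a class="productImageLink" href="x"></a>'));
if ($html === null) {
    echo "No output from command\n";
} else {
    echo strlen($html), " bytes captured\n";
}
```

A caller such as extract_product_urls() could then skip the page, or log the URL, when null comes back instead of parsing an empty string.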

1 Answer

duanqinbi9029 2012-08-21 16:20
Try this:

while($match = $html->find('.productImageLink')) {
    if (!is_object($match)) { break; }
    . . .
}
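
The is_object() check in the answer hints at one plausible cause. In the simple_html_dom versions I am aware of, str_get_html() returns false instead of an object when the input is empty or longer than the library's MAX_FILE_SIZE constant, and calling ->find() on false is a fatal error that ends the whole request, which would look like the outer loop stopping early. This is an assumption about the asker's setup, not something confirmed in the thread. A minimal sketch of guarding the parse result follows; find_matches() is a hypothetical helper, and FakeDom is a stand-in class so the example runs without the library.

```php
<?php
// Stand-in for a parsed document; only here so the sketch runs without
// simple_html_dom. A real parser object would take its place.
class FakeDom {
    private $matches;
    public function __construct(array $matches) { $this->matches = $matches; }
    public function find($selector) { return $this->matches; }
}

// Hypothetical helper: return matches for a selector, or an empty array
// when parsing failed and $dom is not an object (e.g. false).
function find_matches($dom, $selector) {
    if (!is_object($dom)) {
        return array();
    }
    return $dom->find($selector);
}

// One page that parsed and one that did not; the loop survives both.
foreach (array(new FakeDom(array('a', 'b')), false) as $page => $dom) {
    $found = find_matches($dom, '.productImageLink');
    echo "Page $page: ", count($found), " match(es)\n";
}
```

With a guard like this, a page that fails to parse contributes no URLs, and the for loop in the parent function can continue to the next page.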
This answer was selected by the asker as the best answer.


Question posted 2012-08-21 16:16, tagged php.
