douhuitan5863 2010-04-29 22:20
Views: 77
Accepted

Help with sed/awk/grep-like data manipulation in PHP

OK guys. I have an HTML page that I need to parse in a PHP script and mangle the data around a bit. To explain it best, I will show how I would do this in a bash script using awk, grep, egrep, and sed through a god-awful set of pipes, commented for clarity.

curl -s http://myhost.net/mysite/           | # retrieve the document
       awk '/\/action/,/submit/'            | # extract only the form element
       egrep -v "delete|submit"             | # remove the action lines
       sed 's/^[ \t]*//;s/[ \t]*$//'        | # trim leading and trailing whitespace
       sed -n -e ":a" -e "$ s/\n//gp;N;b a" | # remove every line break
       sed '{s|<br />|<br />\n|g}'          | # insert a line break after each <br />
       grep "onemyndseye@localhost"         | # keep lines containing my local email
       sed  '{s/\[[^|]*\]//g}'                # remove the [email: ...] note from the line

These commands take the form element that looks like this:

<form action="/action" method="post">
    <input type="checkbox" id="D1" name="D1" /><a href="http://www.linux.com/rss/feeds.php">
        http://www.linux.com/rss/feeds.php
    </a> [email: 
        onemyndseye@localhost (Default)
    ]<br />         
    <input type="checkbox" id="D2" name="D2" /><a href="http://www.ubuntu.com/rss.xml">
        http://www.ubuntu.com/rss.xml
    </a> [email: 
        onemyndseye@localhost (Default)
    ]<br /> 
    <input type="submit" name="delete_submit" value="Delete Selected" />

And mangles it into complete one-line input statements, ready to be inserted into another form:

<input type="checkbox" id="D1" name="D1" /><a href="http://www.linux.com/rss/feeds.php">http://www.linux.com/rss/feeds.php</a> <br />
<input type="checkbox" id="D2" name="D2" /><a href="http://www.ubuntu.com/rss.xml">http://www.ubuntu.com/rss.xml</a> <br />

The big question is: how do I accomplish this in PHP? I am comfortable using PHP to curl a page, but I am lost on filtering the output.

Thanks in advance. :)

1 answer

  • dtxob80644 2010-04-29 22:28

    You don't filter the output. You use simple_html_dom to parse the page and manipulate it that way; it really is more intuitive.

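    Something like:

    // Load the library (assumes simple_html_dom.php is on the include path)
    include 'simple_html_dom.php';

    // Create DOM from URL or file
    $html = file_get_html('...');

    // Find all links (<a href>) inside a form tag and print their URLs
    foreach ($html->find('form a') as $element) {
        echo $element->href . '<br>';
    }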
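
To show how the whole pipeline from the question might look once the page is parsed rather than filtered, here is a minimal sketch. It swaps in PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of simple_html_dom, so it runs without extra files. The function name extractFeedRows and its $email parameter are made up for illustration, and the sketch assumes each checkbox is directly followed by its link and the "[email: ...]" note, as in the sample form.

```php
<?php
// Hypothetical helper (not part of any library): rebuilds one
// '<input ... /><a ...>...</a> <br />' line per checkbox inside a <form>,
// keeping only the entries whose "[email: ...]" note mentions $email.
function extractFeedRows(string $html, string $email): array
{
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Scraped pages are rarely valid HTML; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//form//input[@type="checkbox"]') as $input) {
        // Assumption: the link for a checkbox is the first <a> sibling after it.
        $link = $xpath->query('following-sibling::a[1]', $input)->item(0);
        if ($link === null) {
            continue;
        }
        // Assumption: the text node right after the link holds "[email: ...]".
        $note = $link->nextSibling ? $link->nextSibling->textContent : '';
        if (strpos($note, $email) === false) {
            continue;
        }
        $rows[] = sprintf(
            '<input type="checkbox" id="%s" name="%s" /><a href="%s">%s</a> <br />',
            htmlspecialchars($input->getAttribute('id')),
            htmlspecialchars($input->getAttribute('name')),
            htmlspecialchars($link->getAttribute('href')),
            htmlspecialchars(trim($link->textContent))
        );
    }
    return $rows;
}
```

With the page body fetched into a string (via curl or file_get_contents), calling extractFeedRows($page, 'onemyndseye@localhost') and joining the result with implode("\n", ...) should give the one-line input statements the question asks for. Because the rows are rebuilt from attributes and link text, the whitespace trimming and the removal of the email note happen without any regular expressions.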
    
    This answer was selected as the best answer by the asker.
