dsyo9700 2012-10-09 09:06

A better way to strip a string with a PHP regular expression

Please help me strip the following more efficiently.

a href="/mv/test-1-2-3-4.vFIsdfuIHq4gpAnc.html"

The site I visit has several links like this, and I only need the part between the last two periods:

vFIsdfuIHq4gpAnc

I would like to keep my current code structure, which is built around the regex. Please help me tune up the following preg_match_all line:

preg_match_all("(./(.*?).html)", $sp, $content); 

Any kind of help I get on this is greatly appreciated and thank you in advance!

Here is my complete code:

$dp = "http://www.cnn.com";

$sp = @file_get_contents($dp);
if ($sp === FALSE) {
    echo("<P>Error: unable to read the URL $dp.  Process aborted.</P>");
    exit();
}

preg_match_all("(./(.*?).html)", $sp, $content); 

foreach($content[1] as $surl) {
    $nctid = str_replace("mv/","",$surl);
    $nctid = str_replace("/","",$nctid);
    echo $nctid,'<br /><br /><br />';
}

The above is what I have been working on.

4 answers

  • douwo3665 2012-10-09 09:11

    Your pattern is mostly fine. The problem is .*?, which matches any character, so it also picks up the slashes and dots in front of the id. What you want is a run of characters that are not a full stop, so use [^.]+ instead.

    $sp = 'a href="/mv/test-1-2-3-4.vFIsdfuIHq4gpAnc.html"';
    preg_match_all( '/\.([^.]+)\.html/', $sp, $content );
    
    var_dump( $content[1] );
    

    The result that is printed:

    array(1) {
      [0]=>
      string(16) "vFIsdfuIHq4gpAnc"
    }
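
To make the difference concrete, here is a minimal sketch that runs the question's pattern and a negated-class pattern side by side on the sample string. The variable names $lazy and $strict are illustrative only, and the second pattern escapes both dots, which is an assumption about what was intended.

```php
<?php
// Sample input taken from the question.
$sp = 'a href="/mv/test-1-2-3-4.vFIsdfuIHq4gpAnc.html"';

// The question's pattern: the outer parentheses act as the regex
// delimiters, so the effective pattern is ./(.*?).html with unescaped dots.
preg_match_all("(./(.*?).html)", $sp, $lazy);

// Negated character class, with both literal dots escaped.
preg_match_all('/\.([^.]+)\.html/', $sp, $strict);

echo $lazy[1][0], "\n";   // mv/test-1-2-3-4.vFIsdfuIHq4gpAnc
echo $strict[1][0], "\n"; // vFIsdfuIHq4gpAnc
```

The first capture still carries the path prefix, which is why the question's code needs the extra str_replace calls; the second capture is already the bare id.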
    

    Here's an example of how to loop through all links:

    <?php
    $url = 'http://www.cnn.com';
    
    $dom = new DomDocument( );
    @$dom->loadHTMLFile( $url );
    
    $links = $dom->getElementsByTagName( 'a' );
    
    foreach( $links as $link ) {
        $href = $link->attributes->getNamedItem( 'href' );
        if( $href !== null ) {
            if( preg_match( '~mv/.*?([^.]+)\.html~', $href->nodeValue, $matches ) ) {
                echo "Link-id found: " . $matches[1] . "\n";
            }
        }
    }
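
The same loop can be tried without a network request. The sketch below is one possible variant, not the answer's exact code: it parses a hard-coded HTML string (hypothetical sample markup), wraps the loop in a helper named extractLinkIds() (a made-up name), and uses a slightly stricter pattern that requires the /mv/ prefix and anchors on the .html suffix. It assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$html = '<html><body>'
      . '<a href="/mv/test-1-2-3-4.vFIsdfuIHq4gpAnc.html">first</a>'
      . '<a href="/about.html">about</a>'
      . '<a href="/mv/other-5-6.Zx81abQ.html">second</a>'
      . '<a name="no-href">anchor</a>'
      . '</body></html>';

/**
 * Collect the id between the last two dots of every /mv/ link.
 * extractLinkIds() is a hypothetical helper, not part of the original answer.
 */
function extractLinkIds(string $html): array
{
    $dom = new DOMDocument();
    // Suppress parser warnings for imperfect real-world markup.
    @$dom->loadHTML($html);

    $ids = [];
    foreach ($dom->getElementsByTagName('a') as $link) {
        // getAttribute() returns an empty string when href is absent.
        $href = $link->getAttribute('href');
        if (preg_match('~/mv/[^.]*\.([^.]+)\.html$~', $href, $m)) {
            $ids[] = $m[1];
        }
    }
    return $ids;
}

print_r(extractLinkIds($html));
```

Matching against each href attribute rather than the raw page source keeps the regex from picking up .html strings that appear in scripts or plain text.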
    
    This answer was selected by the asker as the best answer.