dongluni0568 2014-07-28 11:26
Views: 143
Accepted

Regular expression: match 'a' tags whose href does not contain http://

For example, I have these "a" tags:

<a href="http://www.domain.com/products/foo">Foo product</a>
<a href="/articles/bar">Bar article</a>

I use this pattern:

/<a\s[^>]*href\s*=\s*(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>/siU

This expression returns both tags (Foo product and Bar article). How can I write an expression that returns only the "Bar article" tag?

Thank you.

EDIT:

@Avinash Raj, thank you for the tip.

The resulting pattern works for me:

/^.*<a\s[^>]*href="http:\/\/.*$(*SKIP)(*F)|<a\s[^>]*href\s*=\s*(\"??)([^\" >]*?)\1[^>]*>(.*?)<\/a>/miU
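For comparison, the same filtering can be sketched with a negative lookahead instead of the (*SKIP)(*F) verbs. This is a different technique from the pattern above, shown only as a minimal illustration. It assumes double-quoted href values and simple markup like the two sample tags, and the variable names are illustrative.

```php
<?php
// Sketch: match <a> tags whose href does not start with http:// or https://.
// Assumes double-quoted href values; this is not a general HTML parser.
$html = <<<'EOT'
<a href="http://www.domain.com/products/foo">Foo product</a>
<a href="/articles/bar">Bar article</a>
EOT;

// (?!https?://) is a negative lookahead placed right after the opening quote:
// the match is abandoned if the href value begins with an http(s) scheme.
$pattern = '~<a\s[^>]*href\s*=\s*"(?!https?://)([^"]*)"[^>]*>(.*?)</a>~is';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] is the href value, $m[2] is the link text.
    echo $m[1], ' => ', $m[2], PHP_EOL;
}
```

For the two sample tags, only the "Bar article" link should be reported. The `~` delimiter is used so that the slashes in `https?://` and `</a>` need no escaping.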

2 answers

  • dpglo66848 2014-07-28 12:14

    Use a DOM parser, such as DOMDocument:

    <?php
    $site = <<<'EOT'
    <a href="http://www.domain.com/products/foo">Foo product</a>
    <a href="/articles/bar">Bar article</a>
    EOT;
    
    $doc = new DOMDocument();
    $doc->loadHTML($site);
    
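    // Walk every <a> element in the parsed document.
    $anchors = $doc->getElementsByTagName('a');
    foreach ($anchors as $a) {
        $href = $a->getAttribute('href');
        // parse_url() returns null when the requested component is absent.
        $scheme = parse_url($href, PHP_URL_SCHEME);
        if (!isset($scheme)) {
            echo $a->textContent;   // output: Bar article
        }
    }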
    

    The loop visits each <a> element and passes its href attribute to parse_url. If the href has no scheme, as with a relative URL such as /articles/bar, the link text is echoed. What you actually do with the matching elements is entirely up to you.
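If the matching links need to be collected rather than echoed, the same idea can be wrapped in a function. The sketch below is not part of the original answer: `relativeLinkTexts` is a hypothetical helper name, and the extra host check is an added assumption so that protocol-relative links (such as `//cdn.example.com/lib.js`, which have no scheme but do point to another host) are not treated as relative.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text of
// every <a> whose href has neither a scheme nor a host.
function relativeLinkTexts(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // named anchors without a target
        }
        $parts = parse_url($a->getAttribute('href'));
        // parse_url() returns false for seriously malformed URLs; skip those.
        if ($parts === false) {
            continue;
        }
        // "//cdn.example.com/x" has no scheme but does have a host.
        if (!isset($parts['scheme']) && !isset($parts['host'])) {
            $texts[] = $a->textContent;
        }
    }
    return $texts;
}

$html = '<a href="http://www.domain.com/products/foo">Foo product</a>'
      . '<a href="//cdn.example.com/lib.js">CDN link</a>'
      . '<a href="/articles/bar">Bar article</a>';

print_r(relativeLinkTexts($html));
```

With this sample input, only "Bar article" should end up in the returned array. Whether protocol-relative links count as external depends on the use case, so the host check can be dropped if they should be kept.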

    Accepted by the asker as the best answer.