doubei5310 2019-06-16 14:01

How to capture links with optional whitespace in PHP? [duplicate]

I fetch the HTML of a URL with file_get_contents:

$html = file_get_contents($url);

Now I would like to capture the href link.

The HTML code is:

<li class="four-column mosaicElement">
<a href="https://example.com" title="Lorem ipsum">
...
</a>
</li>
<li class="four-column mosaicElement">
<a href="https://example.org" title="Lorem ipsum">
...
</a>
</li>

So I'm using this:

preg_match_all('/class=\"four-column mosaicElement\"><a href=\"(.+?)\" title=\"(.+?)"/m', $html, $urls, PREG_SET_ORDER, 0);

foreach ($urls as $key => $url) {
    echo $url[1];
}

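This finds no matches, because the pattern does not allow for the whitespace (here a line break) between the closing `>` of the `<li>` tag and the `<a` that follows. How can I make that whitespace optional?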

3 answers

  • douren2831 2019-06-16 15:35

    An expression with a positive lookahead on the class attribute and optional whitespace around the URL also works here:

    (?=class="four-column mosaicElement")[\s\S]*?href="\s*(https?[^\s]+)\s*"
    

    The desired URLs are captured in group 1:

    (https?[^\s]+)
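
As a small illustration of reading that group, the sketch below runs the same pattern with the default PREG_PATTERN_ORDER of preg_match_all, where `$matches[1]` is a flat list of every group-1 capture. The `$sample` string and the `$urls` variable are made up for this example; the sample is modelled on the question's HTML, not taken from the real page.

```php
<?php
// Same expression as above; the /m flag is omitted because the pattern has no ^ or $ anchors.
$re = '/(?=class="four-column mosaicElement")[\s\S]*?href="\s*(https?[^\s]+)\s*"/';

// Hypothetical sample: one padded href and one plain href.
$sample = '<li class="four-column mosaicElement">' . "\n"
        . '<a href="  https://example.com  " title="Lorem ipsum">' . "\n"
        . '<li class="four-column mosaicElement">' . "\n"
        . '<a href="https://example.org" title="Lorem ipsum">';

// With the default PREG_PATTERN_ORDER, $matches[0] holds the full matches
// and $matches[1] holds the contents of capture group 1.
preg_match_all($re, $sample, $matches);

$urls = $matches[1];

foreach ($urls as $url) {
    echo $url . "\n";
}
```

The padding inside the quotes is consumed by the `\s*` on either side of the group, so it should not appear in the captured URLs.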
    

    TEST

    $re = '/(?=class="four-column mosaicElement")[\s\S]*?href="\s*(https?[^\s]+)\s*"/m';
    $str = '<li class="four-column mosaicElement">
    <a href="https://example.com" title="Lorem ipsum">
    ...
    </a>
    </li>
    <li class="four-column mosaicElement">
    <a href="https://example.org" title="Lorem ipsum">
    
    <li class="four-column mosaicElement">
    <a href="   https://example.org   " title="Lorem ipsum">
    
    <li class="four-column mosaicElement">
    <a href="   https://example.org                " title="Lorem ipsum">
    ...
    </a>
    </li>
    ';
    
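    // PREG_SET_ORDER groups the results per match:
    // $match[0] is the full match, $match[1] is the captured URL.
    preg_match_all($re, $str, $matches, PREG_SET_ORDER);

    foreach ($matches as $match) {
        echo $match[1] . "\n";
    }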
    

    Output

    https://example.com
    https://example.org
    https://example.org
    https://example.org
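
If the only obstacle is the whitespace between the tags, a smaller change to the pattern from the question may be enough. The sketch below is one possible variant, not the expression used in this answer: it adds `\s*` wherever whitespace may appear and uses `[^"\s]+` so the capture cannot run past the closing quote. The `$html` sample, `$pattern`, `$links` and `$titles` are names chosen for this example, and the sample is a made-up fragment modelled on the question's markup.

```php
<?php
// Hypothetical sample input mirroring the structure shown in the question.
$html = '<li class="four-column mosaicElement">
<a href="https://example.com" title="Lorem ipsum">
...
</a>
</li>
<li class="four-column mosaicElement">
<a href="   https://example.org   " title="Dolor sit">
...
</a>
</li>';

// \s* after ">" tolerates the line break between <li ...> and <a ...>.
// \s* inside the quotes tolerates padding around the URL.
// [^"\s]+ stops at the first whitespace character or double quote.
$pattern = '/class="four-column mosaicElement">\s*<a href="\s*([^"\s]+)\s*"\s+title="([^"]*)"/';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$links = [];
$titles = [];
foreach ($matches as $match) {
    $links[] = $match[1];   // group 1: the URL without surrounding whitespace
    $titles[] = $match[2];  // group 2: the title attribute
}

echo implode("\n", $links), "\n";
```

This variant still assumes that `href` directly follows `<a` and comes before `title`; markup that reorders the attributes would not match.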
    

    The asker selected this answer as the best answer.