doujia4759 2017-05-24 14:32
Accepted

Regular expressions - how to repeat part of a regex

My site has markup like this:

 <div class="latestItemIntroText">

        <div class="itemLinks">
            <div class="share">Share</div>
            <div class="dummy-div"></div>

            <div class="addthis_sharing_toolbox"></div>

        </div>
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />

 </div>

I need only the Lorem ipsum text. I tried a regex like this:

</div>([\s?]+[^<]+[<br?/?>]*[^<]+[<br?/?>]*[^<]+[<br?/?>]*[^<]+)</div>

I noticed that I repeat this part many times:

[^<]+[<br?/?>]* --> because I don't know how many times a br will follow the Lorem ipsum text, maybe once, maybe 10 times. Is there a way to shorten this regex?


2 answers

  • doubao6681 2017-05-24 15:19

    Regex is not a good fit for parsing an HTML string; use DOMDocument for this instead.

    Try this code snippet:

    <?php
    ini_set('display_errors', 1);
    $string = <<<HTML
    <div class="latestItemIntroText">
    
            <div class="itemLinks">
                <div class="share">Share</div>
                <div class="dummy-div"></div>
    
                <div class="addthis_sharing_toolbox"></div>
    
            </div>
         Lorem ipsum <br /><br />
         Lorem ipsum <br /><br />
         Lorem ipsum <br /><br />
         Lorem ipsum <br /><br />
    
     </div>
    HTML;
    
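    // Parse the fragment; loadHTML() wraps it in <html><body> automatically.
    $domDocument = new DOMDocument();
    $domDocument->loadHTML($string);
    
    $domXPath = new DOMXPath($domDocument);
    
    // Copy the matching itemLinks nodes into an array, then remove each one
    // so that its "Share" text does not end up in the output.
    $toRemove = iterator_to_array($domXPath->query('//div[@class="itemLinks"]'));
    foreach ($toRemove as $removal) {
        $removal->parentNode->removeChild($removal);
    }
    
    // Print the text that remains inside the outer div.
    $results = $domXPath->query('//div[@class="latestItemIntroText"]');
    print_r($results->item(0)->textContent);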
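
If a regex is still wanted, the repeated part of the question's pattern can be written once and wrapped in a quantified group. Note that `[<br?/?>]` is a character class, so it matches any single one of those characters rather than a `<br />` tag. The following is only a minimal sketch: it assumes every text run is followed by at least one `<br>`, and the function name `extractIntroText` is hypothetical, not something from the question.

```php
<?php
// Sketch: returns the text between the last inner </div> and the outer </div>,
// or null when the markup does not have the assumed shape.
function extractIntroText(string $html): ?string
{
    // (?: ... )+ repeats the unit "text run followed by one or more <br> tags",
    // so the unit is written once no matter how many lines there are.
    $pattern = '~</div>((?:[^<]+(?:<br\s*/?>)+)+[^<]*)</div>~';

    if (preg_match($pattern, $html, $match) !== 1) {
        return null;
    }

    // Replace the <br> tags with spaces, then collapse runs of whitespace.
    $text = preg_replace('~<br\s*/?>~', ' ', $match[1]);
    return trim(preg_replace('~\s+~', ' ', $text));
}

$html = <<<HTML
<div class="latestItemIntroText">
    <div class="itemLinks">
        <div class="share">Share</div>
        <div class="dummy-div"></div>
        <div class="addthis_sharing_toolbox"></div>
    </div>
    Lorem ipsum <br /><br />
    Lorem ipsum <br /><br />
    Lorem ipsum <br /><br />
    Lorem ipsum <br /><br />
</div>
HTML;

echo extractIntroText($html), PHP_EOL;
```

Requiring at least one `<br>` per unit keeps the pattern from matching the whitespace between the nested closing `</div>` tags. It stays fragile if the markup changes, which is why the DOMDocument approach above is the safer choice.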
    
    Selected by the asker as the best answer.