doujia4759 2017-05-24 14:32

Regular expressions - how to repeat part of a regex

My site contains markup like this:

 <div class="latestItemIntroText">

        <div class="itemLinks">
            <div class="share">Share</div>
            <div class="dummy-div"></div>

            <div class="addthis_sharing_toolbox"></div>

        </div>
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />

 </div>

I need only the Lorem ipsum text. I tried a regex like this:

</div>([\s?]+[^<]+[<br?/?>]*[^<]+[<br?/?>]*[^<]+[<br?/?>]*[^<]+)</div>

I noticed that I repeat this part many times:

[^<]+[<br?/?>]*

I repeat it because I don't know how many times a br followed by Lorem ipsum will occur: maybe once, maybe 10 times. Is there a way to shorten this regex?
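
The repetition described above can be written once by wrapping the repeated fragment in a group and quantifying it. The following is a minimal sketch, not the asker's or the answerer's code: the function name `extract_text_lines` is hypothetical, and the pattern assumes the markup keeps the shape shown in the question (text and br tags directly before the closing div). Note that `[<br?/?>]` in the original attempt is a character class matching single characters, so the sketch uses `<br\s*/?>` to match the tag itself.

```php
<?php
// Sample markup copied from the question.
$html = '<div class="latestItemIntroText">

        <div class="itemLinks">
            <div class="share">Share</div>
            <div class="dummy-div"></div>

            <div class="addthis_sharing_toolbox"></div>

        </div>
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />

 </div>';

// Hypothetical helper: returns the text pieces found between the last
// nested </div> and the closing </div> of the outer block.
function extract_text_lines(string $html): array
{
    // [^<\s][^<]*            first piece of text (must contain a visible character)
    // (?:<br\s*/?>[^<]*)*    then "a <br> tag plus any following text", repeated 0..n times
    $pattern = '~</div>\s*([^<\s][^<]*(?:<br\s*/?>[^<]*)*)</div>~i';

    if (!preg_match($pattern, $html, $match)) {
        return [];
    }

    // Split the captured block on <br> tags, trim each piece, drop empty ones.
    $pieces = preg_split('~<br\s*/?>~i', $match[1]);
    return array_values(array_filter(array_map('trim', $pieces), 'strlen'));
}

print_r(extract_text_lines($html)); // four entries, each "Lorem ipsum"
```

The `*` after the group replaces the hand-written copies, so the pattern covers one br or ten. It still depends on the exact layout of the markup and will stop matching if another tag appears inside the text.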

  • doubao6681 2017-05-24 15:19

    Regular expressions are a poor fit for parsing HTML strings; use DOMDocument instead.

    Try this code snippet:

    <?php
    ini_set('display_errors', 1);
    $string = <<<HTML
    <div class="latestItemIntroText">
    
            <div class="itemLinks">
                <div class="share">Share</div>
                <div class="dummy-div"></div>
    
                <div class="addthis_sharing_toolbox"></div>
    
            </div>
         Lorem ipsum <br /><br />
         Lorem ipsum <br /><br />
         Lorem ipsum <br /><br />
         Lorem ipsum <br /><br />
    
     </div>
    HTML;
    
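    $domDocument = new DOMDocument();
    $domDocument->loadHTML($string);
    
    $domXPath = new DOMXPath($domDocument);
    
    // Collect the matching nodes first, then remove them, so the list
    // is not modified while it is being iterated.
    $toRemove = [];
    foreach ($domXPath->query('//div[@class="itemLinks"]') as $node) {
        $toRemove[] = $node;
    }
    foreach ($toRemove as $removal) {
        $removal->parentNode->removeChild($removal);
    }
    
    // With the itemLinks block gone, only the Lorem ipsum text remains.
    $results = $domXPath->query('//div[@class="latestItemIntroText"]');
    print_r($results->item(0)->textContent);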
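
A possible variant of the snippet above, offered as a sketch rather than as part of the original answer: instead of removing the `itemLinks` block, XPath's `text()` can select only the text nodes that are direct children of the outer div. The function name `direct_text_lines` and its `$class` parameter are hypothetical, and the class name is assumed to contain no quote characters.

```php
<?php
// Sample markup copied from the question.
$html = '<div class="latestItemIntroText">

        <div class="itemLinks">
            <div class="share">Share</div>
            <div class="dummy-div"></div>

            <div class="addthis_sharing_toolbox"></div>

        </div>
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />
     Lorem ipsum <br /><br />

 </div>';

// Hypothetical helper: returns the non-empty text nodes that sit directly
// inside the div with the given class, ignoring text in nested elements.
function direct_text_lines(string $html, string $class): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: $class contains no quotes, so it can be embedded as-is.
    $nodes = $xpath->query('//div[@class="' . $class . '"]/text()');

    $lines = [];
    foreach ($nodes as $node) {
        $text = trim($node->nodeValue);
        if ($text !== '') {
            $lines[] = $text; // skip whitespace-only nodes between tags
        }
    }
    return $lines;
}

print_r(direct_text_lines($html, 'latestItemIntroText'));
```

This leaves the document unmodified and returns each piece of text as a separate array entry, which may be easier to post-process than one `textContent` string.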
    
    This answer was selected as the best answer by the question's author.