PHP regex preg_match_all: divs whose class is not identical

I have an HTML page like this:

<!DOCTYPE html>
    <html>
        ....
        <body>
            <div class="list-news fl pt10 ">
                Blue
            </div>
            <div class="list-news fl pt10 alternative">
                Yellow
            </div>
             <div class="list-news fl pt10 ">
                Red
            </div>
            <div class="list-news fl pt10 alternative">
                Cyan
            </div>
            <div class="list-news fl pt10 ">
                Black
            </div>
            <div class="list-news fl pt10 alternative">
                White
            </div>
        </body>
    </html>

I wrote a short piece of PHP code to get all the content I need:

preg_match_all('@<div class="list-news fl pt10 .*?">(.*?)<div class="list-news fl pt10 .*?">@s',$rs,$match);

This is the result:

[1] => Array
(
    [0] => <div>Blue</div></div>
    [1] => <div>Red</div></div>
    [2] => <div>Black</div></div>
)

The result only contains the content of the <div class="list-news fl pt10 "> elements and skips the <div class="list-news fl pt10 alternative"> ones. I could use str_replace to strip the alternative class first, but without replacing that string, how can I get the content of every div whose class matches list-news fl pt10.*? ?

Thanks for any ideas.

1 answer

douzhuo2722 2014-06-18 01:33
A DOM approach (with a naive contains):

$dom = new DOMDocument();
@$dom->loadHTML($html); // @ suppresses libxml warnings about imperfect markup
$xpath = new DOMXPath($dom);

$query = <<<'EOD'
//div[
    contains(@class, 'list-news')
    and contains(@class, 'fl')
    and contains(@class, 'pt10')]
EOD;

$nodes = $xpath->query($query);
$results = array();

foreach ($nodes as $node) {
    $results[] = trim($node->textContent);
}

print_r($results);

A regex approach (with a naive pattern):

preg_match_all('~<div class="list-news fl pt10\b[^>]+>\s*\K.*?(?=\s*</div>)~', $html, $matches);
print_r($matches[0]);

Both approaches are a little naive: contains does plain substring matching, so it ignores word boundaries and the order of the classes, and the regex pattern does not cope with the irregularities real HTML can have (extra attributes, different quoting, nested divs).
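The word-boundary weakness of contains can be illustrated with a small sketch. It is not part of the original answer: the decoy markup and the helper name hasClassToken are assumptions made up for the example, and the stricter test relies on the common XPath 1.0 idiom of padding the normalized class value with spaces. It fixes whole-token matching only, not class order.

```php
<?php
// Hypothetical test markup: the second div only *contains* the substrings.
$html = '<div class="list-news fl pt10 alternative">Yellow</div>'
      . '<div class="list-news-wide fluid pt100">Decoy</div>';

$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Hypothetical helper: XPath test for one whole, space-delimited class token.
function hasClassToken($token) {
    return "contains(concat(' ', normalize-space(@class), ' '), ' $token ')";
}

// Collects the trimmed text of every node selected by an XPath query.
function textsFor($xpath, $query) {
    $texts = array();
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

$naive  = "//div[contains(@class, 'list-news') and contains(@class, 'fl') and contains(@class, 'pt10')]";
$strict = '//div[' . implode(' and ', array_map('hasClassToken', array('list-news', 'fl', 'pt10'))) . ']';

$naiveTexts  = textsFor($xpath, $naive);   // also picks up the decoy div
$strictTexts = textsFor($xpath, $strict);  // only the div with the real tokens

print_r($naiveTexts);
print_r($strictTexts);
```

The strict query is longer, but it stays a pure XPath solution, which may be preferable when no post-filtering in PHP is wanted.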

The reason your pattern doesn't work is that preg_match_all cannot return overlapping matches. Each match ends by consuming the next <div class="list-news... opening tag, so the following match cannot begin at that same tag: it has already been matched.

Putting the trailing <div class="list-news... in a lookahead (?=...), which only checks what follows and is not part of the match result, is one way to fix this. However, it is simpler to use the closing tag </div>.
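A minimal sketch of the difference, assuming a shortened one-line-per-div version of the sample page (the variable names are made up for the example). It runs the original pattern and the lookahead variant side by side; note that the lookahead variant still misses the last div, because nothing follows it, which is one more reason to prefer the closing tag.

```php
<?php
// Shortened sample: four divs, alternating class values.
$html = <<<'HTML'
<div class="list-news fl pt10 ">Blue</div>
<div class="list-news fl pt10 alternative">Yellow</div>
<div class="list-news fl pt10 ">Red</div>
<div class="list-news fl pt10 alternative">Cyan</div>
HTML;

// Original pattern: each match consumes the next opening tag.
$consuming = '@<div class="list-news fl pt10 .*?">(.*?)<div class="list-news fl pt10 .*?">@s';
preg_match_all($consuming, $html, $m1);

// Lookahead variant: the next opening tag is only checked, not consumed.
$lookahead = '@<div class="list-news fl pt10 .*?">(.*?)(?=<div class="list-news fl pt10 .*?">)@s';
preg_match_all($lookahead, $html, $m2);

// Strip the leftover closing tags and whitespace from the captures.
$clean = function ($captures) {
    return array_map(function ($s) {
        return trim(strip_tags($s));
    }, $captures);
};

$consumed    = $clean($m1[1]); // every second div is skipped
$lookedAhead = $clean($m2[1]); // consecutive divs match, the last one does not

print_r($consumed);
print_r($lookedAhead);
```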

\K discards everything matched so far (on its left) from the match result, so only the text matched after it is returned.
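The effect of \K is easiest to see by running the same pattern with and without it. This is only a sketch on a single div taken from the sample page; the variable names are assumptions for the example.

```php
<?php
$html = '<div class="list-news fl pt10 alternative">
    Yellow
</div>';

// Without \K the opening tag and leading whitespace are part of the match.
preg_match('~<div class="list-news fl pt10\b[^>]+>\s*.*?(?=\s*</div>)~', $html, $withTag);

// With \K everything matched before it is dropped from the result.
preg_match('~<div class="list-news fl pt10\b[^>]+>\s*\K.*?(?=\s*</div>)~', $html, $textOnly);

var_dump($withTag[0]);  // starts with the <div ...> opening tag
var_dump($textOnly[0]); // only the text content
```

Using \K keeps the whole result in $matches[0], so no capture group is needed.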

A good compromise is to extract every div that has a class attribute, and then check with a regex that the attribute value is really what you want before extracting and trimming the text content:

$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$query = '//div[@class]';
$nodes = $xpath->query($query);
$results = array();

foreach ($nodes as $node) {
    // keep only divs whose class value holds "list-news fl pt10" as whole words, in this order
    if (preg_match('~(?:\s|^)list-news\s+fl\s+pt10(?:\s|$)~', $node->getAttribute('class'))) {
        $results[] = trim($node->textContent); // append the trimmed text
    }
}

or without XPath:

$dom = new DOMDocument();
@$dom->loadHTML($html);
$divs = $dom->getElementsByTagName('div');
$results = array();

foreach ($divs as $node) {
    // same class check as above, applied to every div that has a class attribute
    if ($node->hasAttribute('class') &&
        preg_match('~(?:\s|^)list-news\s+fl\s+pt10(?:\s|$)~', $node->getAttribute('class'))) {
        $results[] = trim($node->textContent); // append the trimmed text
    }
}
This answer was selected by the asker as the best answer.

Tags: css, html, jquery, php. Asked 2014-06-18 01:08.
