dongle7637 2018-08-10 03:17

Regex to find HTML div class and data-attribute content? (preg_match_all)

Using preg_match_all, I want to extract the class and data-id attribute values from HTML tags.

The pattern below runs, but each match returns only the class value or only the data-id value, never both.

I want a pattern that captures both the class and the data-id content from the same tag.

Which regex pattern should I use?

HTML contents:

<!-- I want to: $matches[1] == test_class  | $matches[2] == null -->
<div class="test_class"> 

<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div class="test_class" data-id="1"> 

<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div id="test_id" class="test_class" data-id="1">

<!-- I want to: $matches[1] == test_class test_class2 | $matches[2] == 1 -->
<div class="test_class test_class2" id="test_id" data-id="1">

<!-- I want to: $matches[1] == 1 | $matches[2] == test_class test_class2 -->
<div data-id="1" class="test_class test_class2" id="test_id" >

<!-- I want to: $matches[1] == 1 | $matches[2] == test_class test_class2 -->
<div id="test_id" data-id="1" class="test_class test_class2">

<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div class="test_class" id="test_id" data-id="1">

The regex that does not work the way I want:

$pattern = '/<(div|i)\s.*(class|data-id)="([^"]+)"[^>]*>/i';

preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

Thanks in advance.


  • doujue1246 2018-08-10 03:43

    Why not use a DOM parser instead?

    You could use an XPath expression such as //div[@class or @data-id] to locate the elements, then read their attribute values with getAttribute().

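    $doc = new DOMDocument();
    // Parse the markup; $html holds the HTML string to inspect.
    $doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // Select every div that carries a class or a data-id attribute.
    $divs = $xpath->query('//div[@class or @data-id]');
    foreach ($divs as $div) {
      // getAttribute() returns an empty string when the attribute is absent.
      $matches = [$div->getAttribute('class'), $div->getAttribute('data-id')];
      print_r($matches);
    }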
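
The snippet above can be made self-contained roughly as follows. This is only a sketch: the helper name extractDivAttributes and the sample markup are assumptions made for illustration, and the hasAttribute() check is added so that a missing attribute comes back as null (as the question asks) rather than as an empty string.

```php
<?php
// Sample input modelled on the question; closing tags added for tidiness.
$html = <<<'HTML'
<div class="test_class"></div>
<div id="test_id" data-id="1" class="test_class test_class2"></div>
HTML;

// Hypothetical helper: returns one [class, data-id] pair per matching div,
// with null standing in for an attribute that is not present.
function extractDivAttributes(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them,
    // since fragments and sloppy markup tend to trigger them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//div[@class or @data-id]') as $div) {
        $rows[] = [
            $div->hasAttribute('class') ? $div->getAttribute('class') : null,
            $div->hasAttribute('data-id') ? $div->getAttribute('data-id') : null,
        ];
    }
    return $rows;
}

$rows = extractDivAttributes($html);
print_r($rows);
```

Because the values are read by attribute name, the order in which class and data-id appear in the tag does not matter.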
    

    Demo ~ https://eval.in/1046227
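
If a regex is still required, one possible approach (not the method from the answer above) is to wrap a lookahead for each attribute in an optional group, so both values are captured from the same tag regardless of order. The sketch below assumes double-quoted attribute values, no ">" inside attribute values, and PHP 7.2 or later for PREG_UNMATCHED_AS_NULL. Unlike the layout requested in the question, group 1 is always the class and group 2 is always the data-id.

```php
<?php
// Sample tags modelled on the question.
$content = <<<'HTML'
<div class="test_class">
<div data-id="1" class="test_class test_class2" id="test_id" >
<div id="test_id">
HTML;

// Each optional group holds only a lookahead: it captures the attribute
// value if the attribute exists in this tag and is skipped otherwise.
$pattern = '/<div\b'
    . '(?:(?=[^>]*\sclass="([^"]*)"))?'
    . '(?:(?=[^>]*\sdata-id="([^"]*)"))?'
    . '[^>]*>/i';

preg_match_all($pattern, $content, $matches, PREG_SET_ORDER | PREG_UNMATCHED_AS_NULL);

$result = [];
foreach ($matches as $m) {
    // "?? null" covers PHP versions that omit trailing unmatched groups.
    $result[] = ['class' => $m[1] ?? null, 'data-id' => $m[2] ?? null];
}
print_r($result);
```

A pattern like this remains fragile against single-quoted or unquoted attributes and unusual markup, which is the main reason a DOM parser tends to be the safer choice.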

    This answer was selected by the asker as the best answer.