dov6891 2017-03-21 15:24
Accepted

Match everything up to the last exact string within a specific div (PHP)

I want to match the address of a property on a realty server. Let's say the div containing the address is <div class="title">, and the address is in the last <h2> element inside it, like this:

<body>
  <div class="price">
    <h2>
      h2
    </h2>
  </div>      
  <div class="title">
    <abcd>
      abcd
    </abcd>
    <efg>
      efg
    </efg>
    <h2>
      address
    </h2>
  </div>
</body>

Is there a way to capture the address with a single regex, even if it ends up in a capture group?

My non-working attempt is:

$regex = '/<div class="title">everything_except_<h2>*([^<]*)/';

1 answer

  • dongshou2024 2017-03-21 15:31

    Try this regex:

    <div class="title">(?:.(?!<\/div>))*<h2>([^<]*)
    

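    The key idea is to let the part after <div class="title"> match greedily, but never run past the closing </div>. To do that, the regex only lets . consume a character that is not immediately followed by </div>, which gives (?:.(?!<\/div>))*. Because this group is greedy, the engine backtracks from the end of the div, so <h2> matches the last <h2> inside it and ([^<]*) captures its text. Since the sample spans several lines, . also has to match newlines (the s flag in PHP).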

    Demo: https://regex101.com/r/2EGXue/1
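
A minimal sketch of how the pattern above might be used from PHP, assuming the page markup is already in a string. The variable names are illustrative, `~` is chosen as the delimiter so that `/` needs no escaping, and the `s` flag is added so that `.` matches newlines:

```php
<?php
// Sample markup from the question; "address" stands in for the real text.
$html = <<<'HTML'
<body>
  <div class="price">
    <h2>
      h2
    </h2>
  </div>
  <div class="title">
    <abcd>
      abcd
    </abcd>
    <efg>
      efg
    </efg>
    <h2>
      address
    </h2>
  </div>
</body>
HTML;

// Same pattern as above, with "~" delimiters and the "s" (dot matches newline) flag.
$pattern = '~<div class="title">(?:.(?!</div>))*<h2>([^<]*)~s';

$address = null;
if (preg_match($pattern, $html, $m) === 1) {
    // Group 1 still includes the whitespace around the text, so trim it.
    $address = trim($m[1]);
}

echo $address, PHP_EOL; // prints "address"
```

The captured group contains the line breaks and indentation around the text, which is why the sketch calls trim() on it.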

    Update:

    If nested divs can occur, but only one level deep, and the required <h2>...</h2> is not inside any of them (as in the provided data sample), the repeated group should be extended to match either a complete nested div (<div.*?<\/div>) or, as before, a single character that is not followed by </div> (.(?!<\/div>)):

    <div class="title">(?:<div.*?<\/div>|.(?!<\/div>))*<h2>([^<]*)
    

    Demo: https://regex101.com/r/IGLhBZ/1
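
To illustrate the difference between the two patterns, here is a small sketch that runs both against markup with one nested div. The markup, the `badge` class, and the helper name `lastH2InTitle` are made up for this example and are not taken from the question:

```php
<?php
// Hypothetical markup: the title div contains one nested div with its own <h2>.
$html = <<<'HTML'
<div class="title">
  <div class="badge">
    <h2>not the address</h2>
  </div>
  <h2>first heading</h2>
  <h2>
    address
  </h2>
</div>
HTML;

// Pattern from the first part of the answer (no nested divs expected).
$simple = '~<div class="title">(?:.(?!</div>))*<h2>([^<]*)~s';
// Extended pattern: skip a whole nested div, or one character not followed by </div>.
$nested = '~<div class="title">(?:<div.*?</div>|.(?!</div>))*<h2>([^<]*)~s';

// Illustrative helper: returns the trimmed text of group 1, or null if nothing matched.
function lastH2InTitle(string $pattern, string $html): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? trim($m[1]) : null;
}

echo lastH2InTitle($simple, $html), PHP_EOL; // prints "not the address"
echo lastH2InTitle($nested, $html), PHP_EOL; // prints "address"
```

The simple pattern stops at the nested div's closing tag and so picks up the <h2> inside it, while the extended pattern steps over the nested div and reaches the last <h2> of the title div.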

    This answer was selected by the asker as the best answer.
