douge3113 2014-05-19 08:09

Regular expression to trim spaces from a string contained in HTML tags

I have this HTML string (validated):

<div><img src="images/stories/2014/AAA.gif" alt="AAA" width="24" height="24" /> THE PRODUCTION OF: PLASTIC BOTTLES   <br /></div>

I have to extract only the title next to the <img> tag, trimming all spaces before and after it, then wrap it in an <h1> tag. The expected result should be:

<div><h1>THE PRODUCTION OF: PLASTIC BOTTLES</h1></div>

I've written a regular expression that works, but the captured title still includes the trailing spaces in the final result:

/<img\s*src="[^"]+"\s*alt="AAA"\s*width="24"\s*height="24"\s*\/>\s*([^<]+)\s*<br\s*\/>/

The image can be recognized by the characteristic values of its alt, width and height attributes. Thanks.


3 answers

  • doz97171 2014-05-19 08:17

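    Making your capture group non-greedy should do the trick: <img\s*src="[^"]+"\s*alt="AAA"\s*width="24"\s*height="24"\s*\/>\s*([^<]+?)\s*<br\s*\/> (notice the extra ? after [^<]+). The greedy [^<]+ swallows the trailing spaces itself, which leaves nothing for the following \s* to match; the lazy [^<]+? stops as early as possible, so the \s* before <br /> consumes those spaces instead. More information available here.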
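
To make the difference concrete, here is a minimal sketch that runs both patterns against the sample string from the question. It uses Python's `re` module purely for illustration (the thread itself points at PHP, where the same pattern should work with `preg_replace`); the names `GREEDY`, `LAZY`, `greedy_title`, `lazy_title` and `result` are made up for this example.

```python
import re

# Sample string from the question (note the three spaces before <br />).
html = ('<div><img src="images/stories/2014/AAA.gif" alt="AAA" width="24" '
        'height="24" /> THE PRODUCTION OF: PLASTIC BOTTLES   <br /></div>')

# Shared prefix: the <img> tag identified by its alt, width and height values.
IMG = r'<img\s*src="[^"]+"\s*alt="AAA"\s*width="24"\s*height="24"\s*/>'

# Original pattern: greedy group keeps the trailing spaces.
GREEDY = re.compile(IMG + r'\s*([^<]+)\s*<br\s*/>')
# Suggested fix: lazy group leaves the trailing spaces to the following \s*.
LAZY = re.compile(IMG + r'\s*([^<]+?)\s*<br\s*/>')

greedy_title = GREEDY.search(html).group(1)
lazy_title = LAZY.search(html).group(1)

# Replace the whole <img> ... <br /> run with the trimmed title in an <h1>.
result = LAZY.sub(r'<h1>\1</h1>', html)

print(repr(greedy_title))
print(repr(lazy_title))
print(result)
```

The greedy capture ends with the three trailing spaces, while the lazy capture is already trimmed, so the substitution yields the `<div><h1>...</h1></div>` form the question asks for.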

    That being said, you should really be using something like the PHP DOM Parser to process HTML.
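
As a rough illustration of the parser-based route, the sketch below does the same job without a regular expression. It is a Python analogue using the standard library's `xml.etree.ElementTree`, not the PHP DOM API the answer refers to, and it assumes the fragment is well-formed XML (as the "validated" sample is). The function name `wrap_title` is hypothetical.

```python
import xml.etree.ElementTree as ET

def wrap_title(fragment: str) -> str:
    """Replace the marker <img> and the <br /> after it with <h1>title</h1>.

    Assumes `fragment` is well-formed XML; returns it unchanged if no
    matching image with trailing text is found.
    """
    root = ET.fromstring(fragment)
    for parent in root.iter():
        children = list(parent)
        for idx, child in enumerate(children):
            is_marker = (child.tag == "img"
                         and child.get("alt") == "AAA"
                         and child.get("width") == "24"
                         and child.get("height") == "24")
            if not is_marker:
                continue
            # Text following the <img> is stored as its "tail".
            title = (child.tail or "").strip()
            if not title:
                continue
            h1 = ET.Element("h1")
            h1.text = title
            # Swap the <img> for the new <h1> at the same position.
            parent.remove(child)
            parent.insert(idx, h1)
            # Drop the <br /> that directly follows, keeping any text after it.
            nxt = children[idx + 1] if idx + 1 < len(children) else None
            if nxt is not None and nxt.tag == "br":
                h1.tail = nxt.tail
                parent.remove(nxt)
            return ET.tostring(root, encoding="unicode")
    return fragment

sample = ('<div><img src="images/stories/2014/AAA.gif" alt="AAA" width="24" '
          'height="24" /> THE PRODUCTION OF: PLASTIC BOTTLES   <br /></div>')
print(wrap_title(sample))
```

Trimming happens with an ordinary `strip()` on the text node, so the result does not depend on how the whitespace or attribute order appears in the markup. Real-world HTML that is not well-formed would need a tolerant HTML parser instead.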

    This answer was selected by the asker as the best answer.
