dscojuxf69080 2016-08-17 13:44

Using preg_replace to add a space after an anchor tag

I want to put a space after an anchor tag so that the next word is separated from it. The problem is that some anchor tags are followed by &nbsp; characters or by another opening HTML tag. In those cases we do not want to add a space, because it would break our records.

I only want to add a space after the anchor when there is no space and a word follows directly.

Right now I have come up with a regex, but I am not sure it does exactly what I want:

 preg_replace("/\<\/a\>([^\s<&nbsp;])/", '</a> $1', $text, -1, $count);
 print "Number of occurrences in type $type = $count\n";
 $this->count += $count;

I tried to check the number of occurrences before actually saving the replaced string, but it reports a much higher count than I believe is possible.

Please help me fix this regex.

Scenarios:

<a href="blah.com">Hello</a>World // Here we need to put space between Hello and World

<a href="blah.com">Hello</a>&nbsp;World // Do not touch this

<a href="blah.com">Hello</a><b>World</b> // do not touch this

There could be many other cases that have to be ignored, but specifically we need the first scenario to be handled.


  • dosin84644 2016-08-17 14:07

    As @trincot pointed out, [^\s<&nbsp;] does not mean "not whitespace and not a non-breaking space". It is a character class, and a character class always matches exactly one character. So it means: any single character that is not whitespace, <, &, n, b, s, p or ;.
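To make the character-class point concrete, here is a minimal sketch that runs the asker's original pattern on two inputs. The function name addSpaceOriginal and the sample strings are hypothetical and chosen only for illustration; the second sample starts with "n", one of the letters that the class happens to exclude.

```php
<?php
// Hypothetical helper wrapping the asker's original pattern, unchanged.
function addSpaceOriginal(string $html): string
{
    // [^\s<&nbsp;] excludes whitespace and the single characters < & n b s p ;
    return preg_replace('/\<\/a\>([^\s<&nbsp;])/', '</a> $1', $html);
}

// 'W' is not in the excluded set, so a space is inserted.
echo addSpaceOriginal('<a href="blah.com">Hello</a>World'), "\n";
// prints: <a href="blah.com">Hello</a> World

// 'n' is in the excluded set, so this input is left untouched.
echo addSpaceOriginal('<a href="blah.com">Hello</a>news'), "\n";
// prints: <a href="blah.com">Hello</a>news
```

Any word beginning with n, b, s or p is skipped in the same way, which is probably not what the asker intended.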

    You need to check whether the very next character is a word character, \w (equivalent to [a-zA-Z0-9_]), and insert a space at that position if it is. A positive lookahead is a zero-width assertion, so it tests the next character without consuming it:

     preg_replace("~</a>\K(?=\w)~", ' ', $text, -1, $count);
     echo "Number of occurrences in type $type is $count\n";
    

    What does this RegEx mean?

    </a>    # Match the closing anchor tag
    \K      # Reset the start of the match, so </a> itself is not replaced
    (?=\w)  # Lookahead: the next character must be a word character
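As a quick check of the pattern explained above, the following sketch runs it against the three scenarios from the question. The function name addSpaceAfterAnchor and the $samples array are hypothetical; only the regex comes from this answer.

```php
<?php
// Hypothetical helper: inserts a plain space after </a> when a word
// character follows directly. $count receives the number of insertions.
function addSpaceAfterAnchor(string $html, int &$count = 0): string
{
    return preg_replace('~</a>\K(?=\w)~', ' ', $html, -1, $count);
}

// The three scenarios from the question.
$samples = [
    '<a href="blah.com">Hello</a>World',
    '<a href="blah.com">Hello</a>&nbsp;World',
    '<a href="blah.com">Hello</a><b>World</b>',
];

$total = 0;
foreach ($samples as $sample) {
    echo addSpaceAfterAnchor($sample, $n), "\n";
    $total += $n;
}
echo "Insertions: $total\n";
// prints:
// <a href="blah.com">Hello</a> World
// <a href="blah.com">Hello</a>&nbsp;World
// <a href="blah.com">Hello</a><b>World</b>
// Insertions: 1
```

Because the match is empty once \K has reset it, the replacement string only needs to contain the space. No capture group or back-reference is required.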
    

    Update: Another solution to cover all HTML-problematic cases:

    preg_replace("~</a>\K(?!&nbsp;)~", '&nbsp;', $text, -1, $count);
    

    This adds a non-breaking space whenever the closing anchor tag is not already followed by one.
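For comparison, here is a small sketch of how the second pattern treats the same three scenarios from the question. The helper name addNbspAfterAnchor is hypothetical. Note that the negative lookahead only rules out a following &nbsp;, so this variant also inserts the entity before a following tag such as <b>, which differs from the third scenario in the question.

```php
<?php
// Hypothetical helper: inserts &nbsp; after every </a> that is not
// already followed by the literal entity &nbsp;.
function addNbspAfterAnchor(string $html): string
{
    return preg_replace('~</a>\K(?!&nbsp;)~', '&nbsp;', $html);
}

echo addNbspAfterAnchor('<a href="blah.com">Hello</a>World'), "\n";
// prints: <a href="blah.com">Hello</a>&nbsp;World

echo addNbspAfterAnchor('<a href="blah.com">Hello</a>&nbsp;World'), "\n";
// prints: <a href="blah.com">Hello</a>&nbsp;World

echo addNbspAfterAnchor('<a href="blah.com">Hello</a><b>World</b>'), "\n";
// prints: <a href="blah.com">Hello</a>&nbsp;<b>World</b>
```

Whether that extra entity before a tag is acceptable depends on how the records are used, so the first pattern may be the safer choice when only the first scenario should change.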

    The asker selected this answer as the best answer.