douxiyi2418 2016-02-12 11:33

Regex to capture the last sub-pattern

I'm trying to build a regex that will replace the tokens %aa% and %cc% inside a string. All the cases are listed below:

1) /%aa%/%cc%/bb   => should replace only %cc%
2) /%aa%/%cc%/ac   => should replace only %cc%
3) /bb/%aa%/%cc%   => should replace only the last %cc%
4) /bb/%aa%        => should replace %aa%
5) /bb/ac/%aa%/%cc%/ac/bb => should replace only the last %cc%

I have the following regex, which covers most of the cases except 2 and 5, basically those where the text after the token contains the same characters as the tokens.

Regex pattern: %(?|(?|aa)|(?|cc))%(?=[^(aa|cc)]*($)+)

Language is PHP.

Thanks.

1 answer

  • doulou0882 2016-02-12 11:41

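    Your regex has three problems. The branch reset groups ((?|...|...)) are redundant. The grouping (aa|cc) is placed inside a character class, [^(aa|cc)]*, where it is not a group at all: the class simply matches any character other than (, ), |, a and c, which is why cases 2 and 5 fail. Finally, the end-of-string anchor is captured and quantified (($)+), although there is no need to capture it and testing it once is enough.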
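To make the character class problem concrete, here is a minimal PHP sketch. The test strings are taken from the question's cases; the variable names are only illustrative.

```php
<?php
// Inside [...] the characters ( | ) are literals, so [^(aa|cc)] is the same
// class as [^()|ac]: any single character except '(', ')', '|', 'a' or 'c'.
$class = '~^[^(aa|cc)]*$~';

var_dump(preg_match($class, '/bb'));    // int(1): no excluded character
var_dump(preg_match($class, '/ac/bb')); // int(0): 'a' and 'c' are excluded

// The question's original pattern therefore misses case 2, where the text
// after %cc% contains 'a' and 'c'.
$original = '~%(?|(?|aa)|(?|cc))%(?=[^(aa|cc)]*($)+)~';

var_dump(preg_match($original, '/%aa%/%cc%/bb')); // int(1): case 1 matches
var_dump(preg_match($original, '/%aa%/%cc%/ac')); // int(0): case 2 does not
```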

    You can use the following regex:

    '~%(?:aa|cc)%(?!.*%(?:aa|cc)%)~'
    

    If the input can contain line breaks, also add the s (PCRE_DOTALL, "single line") modifier so that . matches newlines as well: '~%(?:aa|cc)%(?!.*%(?:aa|cc)%)~s'.
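The effect of the s modifier can be sketched as follows. The two-line input and the replacement text X are assumptions made up for illustration; they do not come from the question.

```php
<?php
// Hypothetical input: the last token sits on a later line than the first one.
$input = "/%aa%\n/%cc%";

$withoutS = '~%(?:aa|cc)%(?!.*%(?:aa|cc)%)~';
$withS    = '~%(?:aa|cc)%(?!.*%(?:aa|cc)%)~s';

// Without s, .* stops at the newline, so the lookahead cannot see %cc%
// and %aa% is replaced as well.
$resultWithoutS = preg_replace($withoutS, 'X', $input);

// With s, . also matches newlines, so only the last token is replaced.
$resultWithS = preg_replace($withS, 'X', $input);

var_dump($resultWithoutS); // "/X\n/X"
var_dump($resultWithS);    // "/%aa%\n/X"
```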

    The (?!.*%(?:aa|cc)%) negative lookahead fails the match if another %aa% or %cc% token appears anywhere after the one just matched, so only the last token in the string is replaced.
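As a rough check, the pattern can be run against the five cases from the question with preg_replace. The helper name replaceLastToken and the replacement text X are hypothetical and chosen only for this sketch.

```php
<?php
// Hypothetical helper: replaces only the last %aa% or %cc% token in $input.
function replaceLastToken(string $input, string $replacement): string
{
    // The negative lookahead rejects a token when another token follows it.
    $pattern = '~%(?:aa|cc)%(?!.*%(?:aa|cc)%)~';

    return preg_replace($pattern, $replacement, $input);
}

// The five cases listed in the question.
$cases = [
    '/%aa%/%cc%/bb',
    '/%aa%/%cc%/ac',
    '/bb/%aa%/%cc%',
    '/bb/%aa%',
    '/bb/ac/%aa%/%cc%/ac/bb',
];

foreach ($cases as $input) {
    echo $input, ' => ', replaceLastToken($input, 'X'), "\n";
}
// /%aa%/%cc%/bb => /%aa%/X/bb
// /%aa%/%cc%/ac => /%aa%/X/ac
// /bb/%aa%/%cc% => /bb/%aa%/X
// /bb/%aa% => /bb/X
// /bb/ac/%aa%/%cc%/ac/bb => /bb/ac/%aa%/X/ac/bb
```

Note that preg_replace needs no limit argument here, since the lookahead already guarantees at most one match per single-line string.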
