dongtan2603 2013-11-06 01:53

Regex to remove "groups of characters" with fewer than 3 characters

I am trying to remove any 'groups of characters' with fewer than 3 characters.

This is the source:

1.29 Cancels part plan C/5879 2030. in i i.r e9g6Pop Iatian Area ProcH 22.4.93 Suburban Lands n f 53dv 3 N014 3.5.98. PLAN or any from 01 53 under M R.5I B.L.1laY98 E35. P0 RT I 0 N S At Maroubrajuncti p /I .z. .0 / .L .I. .I

Setting word boundaries around 1 to 3 word characters, e.g. /\b\w{1,3}\b/, does not work, because it also matches inside longer groups: "C/5879" would become "/5879".
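A minimal Python sketch of why the word-boundary approach misfires, assuming the intended pattern was \b\w{1,3}\b (the helper name strip_word_bounded is made up for illustration). Punctuation such as "/" and "." counts as a word boundary, so short runs inside a longer whitespace-delimited group are deleted as well:

```python
import re

# Assumed pattern: 1 to 3 word characters between two word boundaries.
WORD_BOUNDED = re.compile(r"\b\w{1,3}\b")


def strip_word_bounded(text):
    """Delete every run of 1-3 word characters that sits between word boundaries."""
    return WORD_BOUNDED.sub("", text)


# "C" sits between the start of the string and "/", so it is removed.
print(strip_word_bounded("C/5879"))   # /5879
# Each dot-separated number is short enough to match on its own.
print(strip_word_bounded("22.4.93"))  # ..
```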

The desired output would be as follows:

1.29 Cancels part plan C/5879 2030. e9g6Pop Iatian Area ProcH 22.4.93 Suburban Lands 53dv N014 3.5.98. PLAN from under R.5I B.L.1laY98 E35. Maroubrajuncti

An alternative that could also work would be to create larger 'groups of characters' by joining whitespace-delimited 'groups of characters' that have 2 or fewer characters.

For example:

1.29 Cancels part plan C/5879 2030. inii.r e9g6Pop Iatian Area ProcH 22.4.93 Suburban Lands nf 53dv 3N014 3.5.98. PLAN orany from 0153 under MR.5I B.L.1laY98 E35. P0RTI0NS AtMaroubrajuncti p/I.z. .0/.L.I..I

I would be open to either solution to rescue me from Regex Hell.

1 answer

  • duanpanbo9476 2013-11-06 02:13

    Your definition of a "word" is "whitespace-delimited", which differs from the regex definition of a word boundary (a transition between word and non-word characters), so use a lookahead instead of \b:

    \s+\S{1,3}(?=\s)
    

    Note that the match consumes the leading whitespace (nothing is captured, since the pattern has no groups), so removing the matches will not leave double spaces in the result.

    When tested on regextester, the result is as follows (the trailing ".I" survives because no whitespace follows it):

    1.29 Cancels part plan C/5879 2030. e9g6Pop Iatian Area ProcH 22.4.93 Suburban Lands 53dv N014 3.5.98. PLAN from under R.5I B.L.1laY98 E35. Maroubrajuncti .I
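The answer's pattern can be tried in Python as sketched below. The second pattern is not from the answer: it is an assumed variant that uses a lookbehind and a lookahead so that the start and end of the string also count as delimiters, followed by a whitespace cleanup step. The function names are hypothetical.

```python
import re

SOURCE = (
    "1.29 Cancels part plan C/5879 2030. in i i.r e9g6Pop Iatian Area ProcH "
    "22.4.93 Suburban Lands n f 53dv 3 N014 3.5.98. PLAN or any from 01 53 "
    "under M R.5I B.L.1laY98 E35. P0 RT I 0 N S At Maroubrajuncti "
    "p /I .z. .0 / .L .I. .I"
)

# Pattern from the answer: the short group needs whitespace on both sides.
ANSWER_PATTERN = re.compile(r"\s+\S{1,3}(?=\s)")

# Assumed variant: "not preceded and not followed by a non-space character",
# which is also true at the very start and very end of the string.
SHORT_GROUP = re.compile(r"(?<!\S)\S{1,3}(?!\S)")


def remove_short_groups_answer(text):
    """Apply the answer's pattern; first and last groups are never removed."""
    return ANSWER_PATTERN.sub("", text)


def remove_short_groups(text):
    """Remove every whitespace-delimited group of 1-3 characters."""
    stripped = SHORT_GROUP.sub("", text)
    # The variant leaves the surrounding spaces behind, so collapse them.
    return " ".join(stripped.split())


print(remove_short_groups_answer(SOURCE))  # ends with "Maroubrajuncti .I"
print(remove_short_groups(SOURCE))         # ends with "Maroubrajuncti"
```

The variant trades the answer's single-step replacement for an extra whitespace normalization, which also collapses any runs of spaces or tabs that were already in the input.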

    This answer was selected by the asker as the best answer.
