doujue9767 2015-04-08 12:39

How to use a regular expression to get the character after the digits

Suppose I have the following line:

1309270927C1642,61N654NONREF

I want to get the C or D that follows the leading digits. There are a few rules:

  1. The first 6 digits are always there
  2. The 4 digits after that are optional
  3. After that you have a D or a C.

I wanted to solve this with a lookbehind:

/(?<=\d{6,10})D|C/, but that is not allowed in PHP.

So I tried a non-capturing group, /(?:\d{6,10})D|C/, but that captures 1309270927C instead of just C.

So my question is how can I just capture the D or a C?

2 answers

  • doudui2229 2015-04-08 12:41

    You can use the PCRE \K operator:

    \d{6,10}\K[DC]
    

    It drops everything matched before the D or C from the returned match. You can tweak this regex further by adding characters to, or removing them from, the character class [DC].

    Sample code:

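    // \K drops the digits matched so far, so only the letter is returned.
    $re = '/\d{6,10}\K[DC]/';
    $str = "1309270927C1642,61N654NONREF";
    preg_match_all($re, $str, $matches);
    echo implode(", ", $matches[0]); // print every full match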
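
The pattern above accepts any run of 6 to 10 digits anywhere in the string. If the question's rules are meant strictly (exactly 6 or 10 digits at the start of the line), a tighter variant might look like the sketch below. The anchoring at the start of the line and the second and third sample lines are assumptions made for illustration; only the first line comes from the question.

```php
<?php
// Assumption: the digits sit at the very start of the line and number
// exactly 6 or 10, as the rules in the question describe.
$re = '/^\d{6}(?:\d{4})?\K[DC]/';

$lines = [
    "1309270927C1642,61N654NONREF", // 10 digits, then C (from the question)
    "130927D1642,61N654NONREF",     // hypothetical: 6 digits, then D
    "13092709C1642,61N654NONREF",   // hypothetical: 8 digits, breaks the rules
];

$found = [];
foreach ($lines as $line) {
    // preg_match returns 1 on a match; $m[0] then holds only the letter,
    // because \K discarded the digits from the reported match.
    $found[] = preg_match($re, $line, $m) ? $m[0] : null;
}
var_export($found);
```

With these inputs `$found` holds C, D, and null: the 8-digit line is rejected because the optional group must contribute either four digits or none.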
    

    Also, here is some more information on the \K operator:

    \K is the "keep out" verb, available in Perl, PCRE (C, PHP, R…) and Ruby 2+. It tells the engine to drop whatever it has matched so far from the match to be returned.

    Instead of (?<=\b\d+_)[A-Z]+, you can therefore use \b\d+_\K[A-Z]+

    The limitations of \K:

    Compared with lookbehinds, both the \K and capture group workarounds have limitations:

    ✽ When you look for multiple matches in a string, at the starting position of each match attempt, a lookbehind can inspect the characters behind the current position in the string. Therefore, against 123, the pattern (?<=\d)\d (match a digit preceded by a digit) will match both 2 and 3. In contrast, \d\K\d can only match 2, as the starting position after the first match is immediately before the 3, and there are not enough digits left for a second match. Likewise, \d(\d) can only capture 2.

    ✽ With lookbehinds, you can impose multiple conditions (similar to our password validation technique) by using multiple lookbehinds. For instance, to match a digit that is preceded by a lower-case Greek letter, you can use (?<=\p{Ll})(?<=\p{Greek})\d. The first lookbehind (?<=\p{Ll}) ensures that the character immediately to the left is a lower-case letter, and the second lookbehind (?<=\p{Greek}) ensures that the character immediately to the left belongs to the Greek script. With the workarounds, you could use \p{Greek}\K\d to match a digit preceded by a character in the Greek script (or \p{Greek}(\d) to capture it), but you cannot impose a second condition. To get over this limitation, you could capture the Greek character and use a second regex to check that it is a lower-case letter.
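
Both limitations can be checked with a short PHP sketch. The first part runs the three patterns from the first point against 123. The second part follows the suggested two-step approach for the Greek example: capture the character, then test it with a second regex. The sample text and the variable names are hypothetical, and the `u` modifier is assumed so that the subject is treated as UTF-8.

```php
<?php
$subject = "123";

// Lookbehind: each attempt may inspect the character to its left.
preg_match_all('/(?<=\d)\d/', $subject, $lookbehind);

// \K: the first digit is consumed as part of the match, then dropped.
preg_match_all('/\d\K\d/', $subject, $keepOut);

// Capture group: the first digit is consumed, the second is captured.
preg_match_all('/\d(\d)/', $subject, $captured);

echo implode(",", $lookbehind[0]), PHP_EOL; // 2,3
echo implode(",", $keepOut[0]), PHP_EOL;    // 2
echo implode(",", $captured[1]), PHP_EOL;   // 2

// Hypothetical sample: lower-case alpha + 1, upper-case Beta + 2, Latin x + 3.
$text = "\u{03B1}1 \u{0392}2 x3";

// Step 1: capture every Greek character that is followed by a digit.
preg_match_all('/(\p{Greek})(\d)/u', $text, $sets, PREG_SET_ORDER);

// Step 2: keep the digit only when the captured character is lower-case.
$digits = [];
foreach ($sets as $set) {
    if (preg_match('/^\p{Ll}$/u', $set[1])) {
        $digits[] = $set[2];
    }
}
echo implode(",", $digits), PHP_EOL; // 1
```

The second step is what the double lookbehind would otherwise do in a single pattern; splitting it costs one extra regex call per candidate.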

    Output of the sample code:

    C
    
    Accepted by the asker as the best answer.