douduan2272 2014-05-13 02:05

RegEx to validate a comma-separated list of options

I'm using PHP's Filter Functions (FILTER_VALIDATE_REGEXP specifically) to validate the input data. I have a list of options and the $input variable can specify a number of options from the list.

The options are (case-insensitive):

  1. all
  2. rewards
  3. join
  4. promotions
  5. stream
  6. checkin
  7. verified_checkin

The $input variable can have almost any combination of the values. The possible success cases are:

  • all (value can either be all or a comma separated list of other values but not both)
  • rewards,stream,join (a comma separated list of values excluding all)
  • join (a single value)

The Regular Expression I've been able to come up with is:

/^(?:all|(?:checkin|verified_checkin|rewards|join|promotions|stream)?(?:,(?:checkin|verified_checkin|rewards|join|promotions|stream))*)$/

So far, it works for the following example scenarios:

  • all (passes)
  • rewards,join,promotions,checkin,verified_checkin (passes)
  • join (passes)

However, it lets a value with a leading comma through:

  • ,promotions,checkin,verified_checkin (starts with a comma, yet passes when it shouldn't)

Also, checking for duplicates would be a bonus, but is not strictly required:

  • rewards,join,promotions,checkin,join,verified_checkin (contains a duplicate value but still passes; less critical than the leading comma)

I've been at it for a couple of days now and, after trying various implementations, this is the closest I've been able to get.

Any ideas on how to handle the leading comma false positive?

UPDATE: Edited the question to better explain that duplicate filtering isn't really a requirement, just a bonus.

1 answer

  • dongyang4615 2014-05-13 02:08

    Sometimes regular expressions just make things more complicated than they should be. Regular expressions are really good at matching patterns, but when you introduce external rules that have dependencies on the number of matched patterns things get complicated fast.

    In this case I would just split the list on comma and check the resulting strings against the rules you just described.

    $valid_choices = array('checkin', 'join', 'promotions', 'rewards', 'stream', 'verified_checkin');

    function is_valid_option_list($input_string, $valid_choices)
    {
        // the options are case-insensitive, so normalise before comparing
        $tokens = explode(',', strtolower($input_string));

        if ($tokens === array('all'))
            return TRUE;                 // pass ('all' on its own)

        sort($tokens);                   // sort the tokens to make it easy to find duplicates

        for ($i = 0; $i < count($tokens); $i++)
        {
            if ($i > 0 && $tokens[$i] === $tokens[$i - 1])
                return FALSE;            // fail (duplicates)

            // 'all' is not in $valid_choices, so all + other options fails here,
            // as do the empty tokens left by leading, trailing or doubled commas
            if (!in_array($tokens[$i], $valid_choices, TRUE))
                return FALSE;            // fail (choice not valid)
        }

        return TRUE;
    }
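
The same split-and-check idea can also be sketched with PHP's built-in array functions instead of an explicit loop. This is only an illustrative variant, not the answer's original code: the function name validate_options is hypothetical, and lower-casing the input is an assumption based on the question's "case-insensitive" note.

```php
<?php
// Hypothetical helper, not part of the original answer.
function validate_options(string $input, array $valid_choices): bool
{
    // The options are case-insensitive, so normalise first (assumption).
    $tokens = explode(',', strtolower($input));

    // "all" is only valid on its own.
    if ($tokens === ['all']) {
        return true;
    }

    // Any token outside the list fails; this also catches "all" mixed with
    // other options and the empty tokens produced by stray commas.
    if (array_diff($tokens, $valid_choices) !== []) {
        return false;
    }

    // A duplicate makes the de-duplicated list shorter than the original.
    return count(array_unique($tokens)) === count($tokens);
}

$valid_choices = ['checkin', 'join', 'promotions', 'rewards', 'stream', 'verified_checkin'];
```

With this sketch, `validate_options('rewards,stream,join', $valid_choices)` is accepted, while `',promotions,checkin'`, `'all,join'` and `'rewards,join,rewards'` are rejected.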
    

    EDIT

    Since you edited your question to say that duplicates are acceptable but that you definitely want a regex-based solution, this one should do:

    ^(all|((checkin|verified_checkin|rewards|join|promotions|stream)(,(checkin|verified_checkin|rewards|join|promotions|stream))*))$
    

    It will not fail on duplicates, but it will take care of leading or trailing commas and of the all + other choices combination.
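
A minimal sketch of how a pattern like this might be plugged into filter_var() with FILTER_VALIDATE_REGEXP, as the question describes. It assumes "promotions" is spelled the same way in both groups (matching the options list), wraps the pattern in "/" delimiters, and adds the "i" flag because the options are said to be case-insensitive. The wrapper name check_options is hypothetical.

```php
<?php
// Pattern with "/" delimiters; the "i" flag is an addition (assumption)
// to honour the case-insensitivity mentioned in the question.
$pattern = '/^(all|((checkin|verified_checkin|rewards|join|promotions|stream)'
         . '(,(checkin|verified_checkin|rewards|join|promotions|stream))*))$/i';

// Hypothetical wrapper, not part of the original answer.
function check_options(string $input, string $pattern): bool
{
    $result = filter_var($input, FILTER_VALIDATE_REGEXP, [
        'options' => ['regexp' => $pattern],
    ]);

    // filter_var() returns the input string on a match and false otherwise.
    return $result !== false;
}
```

Here `check_options('rewards,join', $pattern)` passes, `',join'`, `'join,'` and `'all,join'` are rejected, and `'join,join'` still passes because this pattern does not look for duplicates.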

    Filtering out duplicates with a regex would be pretty difficult, but maybe not impossible (for instance with a negative look-ahead that contains a back-reference to an earlier capture group).
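
To make the look-ahead idea concrete, here is a toy sketch restricted to two options and at most two items. The pattern and the helper name matches_toy_pattern are illustrative assumptions, not something from the original answer.

```php
<?php
// (?!\1) refuses to continue if the text after the comma starts with
// whatever group 1 captured, so the second item cannot repeat the first.
// Single quotes keep \1 as a regex back-reference instead of a PHP escape.
$pattern = '/^(join|stream)(,(?!\1)(join|stream))?$/';

// Hypothetical helper for trying the pattern out.
function matches_toy_pattern(string $input, string $pattern): bool
{
    return preg_match($pattern, $input) === 1;
}
```

With this pattern, `'join'` and `'join,stream'` match, while `'join,join'` and `'stream,stream'` do not.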

    SECOND EDIT

    Although you mentioned that detecting duplicate entries is not critical, I figured I'd try my hand at crafting a pattern that would also check for them.

    As you can see below, it's not very elegant, nor does it scale easily, but with the finite list of options you have it gets the job done using negative look-aheads.

    ^(all|(checkin|verified_checkin|rewards|join|promotions|stream)(,(?!\2)(checkin|verified_checkin|rewards|join|promotions|stream))?(,(?!\2)(?!\4)(checkin|verified_checkin|rewards|join|promotions|stream))?(,(?!\2)(?!\4)(?!\6)(checkin|verified_checkin|rewards|join|promotions|stream))?(,(?!\2)(?!\4)(?!\6)(?!\8)(checkin|verified_checkin|rewards|join|promotions|stream))?(,(?!\2)(?!\4)(?!\6)(?!\8)(?!\10)(checkin|verified_checkin|rewards|join|promotions|stream))?)$
    

    Since the final regex is so long, I'm going to break it up into parts for the sake of making it easier to follow the general idea:

    ^(all|
      (checkin|verified_checkin|rewards|join|promotions|stream)
      (,(?!\2)(checkin|verified_checkin|rewards|join|promotions|stream))?
      (,(?!\2)(?!\4)(checkin|verified_checkin|rewards|join|promotions|stream))?
      (,(?!\2)(?!\4)(?!\6)(checkin|verified_checkin|rewards|join|promotions|stream))?
      (,(?!\2)(?!\4)(?!\6)(?!\8)(checkin|verified_checkin|rewards|join|promotions|stream))?
      (,(?!\2)(?!\4)(?!\6)(?!\8)(?!\10)(checkin|verified_checkin|rewards|join|promotions|stream))?
     )$
    

    You can see that the pattern is built iteratively: each additional item adds one optional group with one more look-ahead than the group before it. Such a pattern could be generated automatically if you wanted to supply a different list, but the result would get rather large, rather quickly.
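
As a rough sketch of such a generator, the function below rebuilds a pattern of the same shape from an arbitrary list of options. The name build_unique_list_pattern is hypothetical, and the code assumes the same group layout as the pattern above: group 1 is the outer group, group 2 is the first item, and every further item contributes two groups.

```php
<?php
// Hypothetical generator, not part of the original answer.
function build_unique_list_pattern(array $options): string
{
    // Escape each option so it is matched literally inside the alternation.
    $alternation = implode('|', array_map(
        function ($option) {
            return preg_quote($option, '/');
        },
        $options
    ));

    // Group 1 is the outer (all|...) group; group 2 captures the first item.
    $pattern = '^(all|(' . $alternation . ')';
    $seen = [2];

    for ($i = 1; $i < count($options); $i++) {
        // One negative look-ahead per item captured so far.
        $lookaheads = '';
        foreach ($seen as $group) {
            $lookaheads .= '(?!\\' . $group . ')';
        }
        $pattern .= '(,' . $lookaheads . '(' . $alternation . '))?';

        // Each optional item adds two groups: the wrapper and the option itself.
        $seen[] = 2 * ($i + 1);
    }

    return $pattern . ')$';
}

$options = ['checkin', 'verified_checkin', 'rewards', 'join', 'promotions', 'stream'];
$regex = '/' . build_unique_list_pattern($options) . '/';
```

One caveat with this approach: a look-ahead such as `(?!\2)` rejects any text that merely starts with the earlier item, so it is only reliable when no option is a prefix of another, which holds for the list in the question.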

    This answer was selected by the asker as the best answer.