doucuo9126 2011-02-02 13:00
Accepted

Regular expression for parsing italic text?

Suppose I have the following text:

__This_is__ a __test__

I'm using two underscores to denote italics, so I expect This_is and test to be italicized. Any text between two consecutive double underscores should be italicized, including any single underscores inside it. So far I've got:

__([^_]+)__

What is the equivalent of "not two consecutive underscores" in group 1? Thanks.


2 answers

  • drwghu6386 2011-02-02 13:21

    An option would be to match two underscores:

    __
    

    Then add a negative lookahead to assert that there are not two underscores immediately ahead of the current position:

    __(?!__)
    

    If the lookahead succeeds (the next two characters are not both underscores), match any single character:

    __(?!__). 
    

    and repeat the previous step (lookahead plus one character) one or more times:

    __((?!__).)+
    

    and finally match another two underscores:

    __((?!__).)+__
    

    which is the final solution.

    A little demo:

    <?php
    $text = '__This_is__ a __test__';
    // (?:...) is non-capturing, so only the full matches land in $matches[0].
    preg_match_all('/__(?:(?!__).)+__/', $text, $matches);
    print_r($matches);
    ?>
    

    produces:

    Array
    (
        [0] => Array
            (
                [0] => __This_is__
                [1] => __test__
            )
    
    )
    

    as can be seen on Ideone.
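
As a cross-check outside PHP, here is a minimal Python sketch that runs both the question's original pattern and the pattern built above against the same sample text. Python's re module is my choice for illustration, not something the question or the answer uses, and the variable names are made up for this example.

```python
import re

text = "__This_is__ a __test__"

# The question's original attempt: [^_]+ forbids every underscore,
# so it cannot span the single underscore inside "This_is".
naive = re.compile(r"__([^_]+)__")

# The pattern built above: each "." is consumed only at positions
# where the next two characters are not "__".
tempered = re.compile(r"__(?:(?!__).)+__")

naive_matches = [m.group(0) for m in naive.finditer(text)]
tempered_matches = [m.group(0) for m in tempered.finditer(text)]

print(naive_matches)     # ['__ a __']
print(tempered_matches)  # ['__This_is__', '__test__']
```

The original pattern does not just miss This_is: it pairs the closing delimiter of the first span with the opening delimiter of the second and matches the text in between, which is why the lookahead version is needed.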

    EDIT

    Note that I used a non-capturing group in my demo, otherwise the output would have looked like this:

    Array
    (
        [0] => Array
            (
                [0] => __This_is__
                [1] => __test__
            )
    
        [1] => Array
            (
                [0] => s
                [1] => t
            )
    
    )
    

    i.e. the last character matched by ((?!__).) would have been captured in group 1.
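
If the goal is to get the text between the delimiters into group 1, as the question asks, one possible approach is to keep the repetition non-capturing and wrap the whole repetition in a single capturing group. The sketch below uses Python's re module purely for illustration (the answer's demo is PHP), and the variable names are hypothetical.

```python
import re

text = "__This_is__ a __test__"

# A capturing group repeated with "+" keeps only its final iteration,
# so group 1 ends up holding just the last character of each span.
last_char_only = re.findall(r"__((?!__).)+__", text)

# Non-capturing repetition wrapped in one capturing group:
# group 1 now holds everything between the double underscores.
inner_text = re.findall(r"__((?:(?!__).)+)__", text)

print(last_char_only)  # ['s', 't']
print(inner_text)      # ['This_is', 'test']
```

With a single capturing group in the pattern, re.findall returns the contents of group 1 for each match, which makes the difference between the two patterns easy to see.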

    For more about groups, see: http://www.regular-expressions.info/brackets.html

    This answer was selected by the asker as the best answer.