dongyi6668 2016-05-13 16:26

RegEx to replace matching parentheses in a nested structure [closed]

How can I replace a set of matching opening/closing parentheses if the first opening parenthesis follows the keyword array? Can regular expressions help with this type of problem?

To be more specific, I'd like to solve this in either JavaScript or PHP.

// input
$data = array(
    'id' => nextId(),
    'profile' => array(
       'name' => 'Hugo Hurley',
       'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
    )
);

// desired output
$data = [
    'id' => nextId(),
    'profile' => [
       'name' => 'Hugo Hurley',
       'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
    ]
];

  • drhqkz3455 2016-05-13 17:50

    Tim Pietzcker gave the .NET counting version.
    It has the same elements as the PCRE (PHP) version below.

    All the caveats are the same. In particular, non-array parentheses must
    be balanced because they use the same closing parenthesis as delimiters.
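
The balancing caveat is easier to see in a counting version. The following is a minimal JavaScript sketch in the spirit of a counting approach, not a port of either regex: JavaScript's RegExp has no recursion or subroutine calls, so an explicit stack takes their place. The function name `convertArraySyntax` is hypothetical, and the sketch assumes that no parentheses (and no `array(`) appear inside string literals or comments.

```javascript
// Sketch: rewrite PHP `array( ... )` as `[ ... ]` by tracking openers on a stack.
// Assumption: string literals and comments contain no parentheses.
function convertArraySyntax(source) {
  const token = /\barray\s*\(|[()]/gi; // the same three delimiters the PCRE pattern uses
  const stack = [];                    // one entry per open paren: true if opened by `array(`
  let out = '';
  let last = 0;
  let m;
  while ((m = token.exec(source)) !== null) {
    out += source.slice(last, m.index); // copy everything between delimiters verbatim
    last = token.lastIndex;
    if (m[0] === ')') {
      if (stack.length === 0) {
        throw new Error(`Unbalanced ')' at position ${m.index}`);
      }
      // A ')' closes whichever opener is on top, array or not.
      out += stack.pop() ? ']' : ')';
    } else if (m[0] === '(') {
      stack.push(false);
      out += '(';
    } else {
      stack.push(true); // `array(`, any case, optional whitespace before '('
      out += '[';
    }
  }
  if (stack.length > 0) {
    throw new Error(`${stack.length} opening parenthesis(es) never closed`);
  }
  return out + source.slice(last);
}

console.log(convertArraySyntax("$x = array('a' => f(1), 'b' => array(2, 3));"));
// prints: $x = ['a' => f(1), 'b' => [2, 3]];
```

Because every `)` pops the top of the stack, a single stray non-array parenthesis shifts which `)` gets paired with `array(`, which is why the sketch throws instead of guessing.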

    All text must be parsed (or should be).
    The outer groups 1, 2, 3, 4 let you pick out the parts:
    1: CONTENT
    2: CORE-1 array()
    3: CORE-2 any ()
    4: EXCEPTIONS

    Each match yields exactly one of these outer groups; they are mutually exclusive.

    The trick is to define a PHP function parse( core ) that parses the CORE.
    Inside that function is the while ( regex.search( core ) ) { .. } loop.

    Each time either the CORE-1 or CORE-2 group matches, call parse( core ) recursively,
    passing it the contents of that core's group.

    Inside the loop, just take off the content and assign it to the hash.

    Obviously, the group 1 construct which calls (?&content) should be replaced
    with constructs that extract your hash-like variable data.

    On a detailed scale, this can be very tedious.
    Usually, you'd have to account for every single character to correctly
    parse the entire thing.
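
The recursive driver described above could be sketched as follows. This is a JavaScript illustration under stated assumptions, not the PHP function itself: since JavaScript's RegExp cannot recurse, a hypothetical helper `findClose` counts depth to locate the matching `)`, and `parse` then plays the role of the loop over CONTENT, CORE-1, CORE-2 and EXCEPTIONS. The node shape (`kind`, `content`, `children`, `errors`) is invented for the example, and strings or comments are assumed to contain no parentheses.

```javascript
// Sketch of parse(core): split a core into CONTENT, nested cores, and exceptions.
// Assumption: string literals and comments contain no parentheses.
const DELIM_SOURCE = '\\barray\\s*\\(|[()]';

// Index of the ')' matching an opener whose body starts at `from`, or -1 if none.
function findClose(text, from) {
  const re = new RegExp(DELIM_SOURCE, 'gi');
  re.lastIndex = from;
  let depth = 1;
  let m;
  while ((m = re.exec(text)) !== null) {
    depth += m[0] === ')' ? -1 : 1;
    if (depth === 0) return m.index;
  }
  return -1;
}

function parse(core, kind = 'root') {
  const node = { kind, content: '', children: [], errors: [] };
  const re = new RegExp(DELIM_SOURCE, 'gi');
  let pos = 0;
  let m;
  while ((m = re.exec(core)) !== null) {
    node.content += core.slice(pos, m.index); // CONTENT between delimiters
    const close = m[0] === ')' ? -1 : findClose(core, re.lastIndex);
    if (close === -1) {
      // EXCEPTIONS: a stray ')' or an opener that is never closed.
      // `index` is relative to this core, not to the whole input.
      node.errors.push({ text: m[0], index: m.index });
      pos = re.lastIndex;
      continue;
    }
    // CORE-2 for a bare '(', CORE-1 for 'array('; recurse into the body.
    const childKind = m[0] === '(' ? 'paren' : 'array';
    node.children.push(parse(core.slice(re.lastIndex, close), childKind));
    pos = re.lastIndex = close + 1; // resume scanning after the matching ')'
  }
  node.content += core.slice(pos);
  return node;
}

console.log(JSON.stringify(parse('a array(b (c) d) e'), null, 2));
```

Each node keeps only its own level's content, with nested cores moved into `children`, which is the same layering the hash in the description is meant to hold.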

    (?is)(?:((?&content))|(?>\barray\s*\()((?=.)(?&core)|)\)|\(((?=.)(?&core)|)\)|(\barray\s*\(|[()]))(?(DEFINE)(?<core>(?>(?&content)|(?>\barray\s*\()(?:(?=.)(?&core)|)\)|\((?:(?=.)(?&core)|)\))+)(?<content>(?>(?!\barray\s*\(|[()]).)+))
    

    Expanded

     # 1:  CONTENT
     # 2:  CORE-1
     # 3:  CORE-2
     # 4:  EXCEPTIONS
    
     (?is)
    
     (?:
          (                                  # (1), Take off   CONTENT
               (?&content) 
          )
       |                                   # OR -----------------------------
          (?>                                # Start 'array('
               \b array \s* \(
          )
          (                                  # (2), Take off   'array( CORE-1 )'
               (?= . )
               (?&core) 
            |  
          )
          \)                                 # End ')'
       |                                   # OR -----------------------------
          \(                                 # Start '('
          (                                  # (3), Take off   '( any CORE-2 )'
               (?= . )
               (?&core) 
            |  
          )
          \)                                 # End ')'
       |                                   # OR -----------------------------
          (                                  # (4), Take off   Unbalanced or Exceptions
               \b array \s* \(
            |  [()] 
          )
     )
    
     # Subroutines
     # ---------------
    
     (?(DEFINE)
    
          # core
          (?<core>
               (?>
                    (?&content) 
                 |  
                    (?> \b array \s* \( )
                    # recurse core of  array()
                    (?:
                         (?= . )
                         (?&core) 
                      |  
                    )
                    \)
                 |  
                    \(
                    # recurse core of any  ()
                    (?:
                         (?= . )
                         (?&core) 
                      |  
                    )
                    \)
               )+
          )
    
          # content 
          (?<content>
               (?>
                    (?!
                         \b array \s* \(
                      |  [()] 
                    )
                    . 
               )+
          )
     )
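
The `content` subroutine is the one piece that carries over to JavaScript unchanged, since it relies only on a negative lookahead. As a rough sketch, the hypothetical `tokenize` helper below pairs it with the two kinds of delimiter so that every character of the input lands in exactly one token; `[\s\S]` stands in for `.` under the `(?s)` flag, and the token labels are made up for the example.

```javascript
// Sketch: the tempered `content` loop next to the delimiters it refuses to cross.
// Group 1: content, group 2: `array(`, group 3: a bare '(' or ')'.
const PIECES = /((?:(?!\barray\s*\(|[()])[\s\S])+)|(\barray\s*\()|([()])/gi;

function tokenize(source) {
  return Array.from(source.matchAll(PIECES), (m) => {
    if (m[1] !== undefined) return ['content', m[1]];
    if (m[2] !== undefined) return ['array-open', m[2]];
    return ['paren', m[3]];
  });
}

console.log(tokenize("some_var = array('id' => nextId())"));
```

The content alternative stops one character before any `array(`, `(` or `)`, so the delimiter alternatives always get a chance to match at that position and nothing is skipped.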
    

    Output

     **  Grp 0           -  ( pos 0 , len 11 ) 
    some_var =   
     **  Grp 1           -  ( pos 0 , len 11 ) 
    some_var =   
     **  Grp 2           -  NULL 
     **  Grp 3           -  NULL 
     **  Grp 4 [core]    -  NULL 
     **  Grp 5 [content] -  NULL 
    
    -----------------------
    
     **  Grp 0           -  ( pos 11 , len 153 ) 
    array(
        'id' => nextId(),
        'profile' => array(
           'name' => 'Hugo Hurley',
           'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
        ) 
    )  
     **  Grp 1           -  NULL 
     **  Grp 2           -  ( pos 17 , len 146 ) 
    
        'id' => nextId(),
        'profile' => array(
           'name' => 'Hugo Hurley',
           'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
        ) 
    
     **  Grp 3           -  NULL 
     **  Grp 4 [core]    -  NULL 
     **  Grp 5 [content] -  NULL 
    
    -------------------------------------
    
     **  Grp 0           -  ( pos 164 , len 3 ) 
    ;
    
     **  Grp 1           -  ( pos 164 , len 3 ) 
    ;
    
     **  Grp 2           -  NULL 
     **  Grp 3           -  NULL 
     **  Grp 4 [core]    -  NULL 
     **  Grp 5 [content] -  NULL 
    

    A previous incarnation of this pattern, written for a different problem (nested HTML comment blocks), gives an idea of usage:

     # Perl code:
     # 
     #     use strict;
     #     use warnings;
     #     
     #     use Data::Dumper;
     #     
     #     $/ = undef;
     #     my $content = <DATA>;
     #     
     #     # Set the error mode on/off here ..
     #     my $BailOnError = 1;
     #     my $IsError = 0;
     #     
     #     my $href = {};
     #     
     #     ParseCore( $href, $content );
     #     
     #     #print Dumper($href);
     #     
     #     print "\n\n";
     #     print "\nBase======================\n";
     #     print $href->{content};
     #     print "\nFirst======================\n";
     #     print $href->{first}->{content};
     #     print "\nSecond======================\n";
     #     print $href->{first}->{second}->{content};
     #     print "\nThird======================\n";
     #     print $href->{first}->{second}->{third}->{content};
     #     print "\nFourth======================\n";
     #     print $href->{first}->{second}->{third}->{fourth}->{content};
     #     print "\nFifth======================\n";
     #     print $href->{first}->{second}->{third}->{fourth}->{fifth}->{content};
     #     print "\nSix======================\n";
     #     print $href->{six}->{content};
     #     print "\nSeven======================\n";
     #     print $href->{six}->{seven}->{content};
     #     print "\nEight======================\n";
     #     print $href->{six}->{seven}->{eight}->{content};
     #     
     #     exit;
     #     
     #     
     #     sub ParseCore
     #     {
     #         my ($aref, $core) = @_;
     #         my ($k, $v);
     #         while ( $core =~ /(?is)(?:((?&content))|(?><!--block:(.*?)-->)((?&core)|)<!--endblock-->|(<!--(?:block:.*?|endblock)-->))(?(DEFINE)(?<core>(?>(?&content)|(?><!--block:.*?-->)(?:(?&core)|)<!--endblock-->)+)(?<content>(?>(?!<!--(?:block:.*?|endblock)-->).)+))/g )
     #         {
     #            if (defined $1)
     #            {
     #              # CONTENT
     #                $aref->{content} .= $1;
     #            }
     #            elsif (defined $2)
     #            {
     #              # CORE
     #                $k = $2; $v = $3;
     #                $aref->{$k} = {};
     #      #         $aref->{$k}->{content} = $v;
     #      #         $aref->{$k}->{match} = $&;
     #                
     #                my $curraref = $aref->{$k};
     #                my $ret = ParseCore($aref->{$k}, $v);
     #                if ( $BailOnError && $IsError ) {
     #                    last;
     #                }
     #                if (defined $ret) {
     #                    $curraref->{'#next'} = $ret;
     #                }
     #            }
     #            else
     #            {
     #              # ERRORS
     #                print "Unbalanced '$4' at position = ", $-[0];
     #                $IsError = 1;
     #     
     #                # Decide to continue here ..
     #                # If BailOnError is set, just unwind recursion. 
     #                # -------------------------------------------------
     #                if ( $BailOnError ) {
     #                   last;
     #                }
     #            }
     #         }
     #         return $k;
     #     }
     #     
     #     #================================================
     #     __DATA__
     #     some html content here top base
     #     <!--block:first-->
     #         <table border="1" style="color:red;">
     #         <tr class="lines">
     #             <td align="left" valign="<--valign-->">
     #         <b>bold</b><a href="http://www.mewsoft.com">mewsoft</a>
     #         <!--hello--> <--again--><!--world-->
     #         some html content here 1 top
     #         <!--block:second-->
     #             some html content here 2 top
     #             <!--block:third-->
     #                 some html content here 3 top
     #                 <!--block:fourth-->
     #                     some html content here 4 top
     #                     <!--block:fifth-->
     #                         some html content here 5a
     #                         some html content here 5b
     #                     <!--endblock-->
     #                 <!--endblock-->
     #                 some html content here 3a
     #                 some html content here 3b
     #             <!--endblock-->
     #             some html content here 2 bottom
     #         <!--endblock-->
     #         some html content here 1 bottom
     #     <!--endblock-->
     #     some html content here1-5 bottom base
     #     
     #     some html content here 6-8 top base
     #     <!--block:six-->
     #         some html content here 6 top
     #         <!--block:seven-->
     #             some html content here 7 top
     #             <!--block:eight-->
     #                 some html content here 8a
     #                 some html content here 8b
     #             <!--endblock-->
     #             some html content here 7 bottom
     #         <!--endblock-->
     #         some html content here 6 bottom
     #     <!--endblock-->
     #     some html content here 6-8 bottom base
     # 
     # Output >>
     # 
     #     Base======================
     #     some html content here top base
     #     
     #     some html content here1-5 bottom base
     #     
     #     some html content here 6-8 top base
     #     
     #     some html content here 6-8 bottom base
     #     
     #     First======================
     #     
     #         <table border="1" style="color:red;">
     #         <tr class="lines">
     #             <td align="left" valign="<--valign-->">
     #         <b>bold</b><a href="http://www.mewsoft.com">mewsoft</a>
     #         <!--hello--> <--again--><!--world-->
     #         some html content here 1 top
     #         
     #         some html content here 1 bottom
     #     
     #     Second======================
     #     
     #             some html content here 2 top
     #             
     #             some html content here 2 bottom
     #         
     #     Third======================
     #     
     #                 some html content here 3 top
     #                 
     #                 some html content here 3a
     #                 some html content here 3b
     #             
     #     Fourth======================
     #     
     #                     some html content here 4 top
     #                     
     #                 
     #     Fifth======================
     #     
     #                         some html content here 5a
     #                         some html content here 5b
     #                     
     #     Six======================
     #     
     #         some html content here 6 top
     #         
     #         some html content here 6 bottom
     #     
     #     Seven======================
     #     
     #             some html content here 7 top
     #             
     #             some html content here 7 bottom
     #         
     #     Eight======================
     #     
     #                 some html content here 8a
     #                 some html content here 8b
     #         
    