dongyi6668 2016-05-13 16:26

RegEx to replace matching parentheses in a nested structure [closed]

How can I replace a set of matching opening/closing parentheses if the first opening parenthesis follows the keyword array? Can regular expressions help with this type of problem?

To be more specific, I'd like to solve this using either JavaScript or PHP.

// input
$data = array(
    'id' => nextId(),
    'profile' => array(
       'name' => 'Hugo Hurley',
       'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
    )
);

// desired output
$data = [
    'id' => nextId(),
    'profile' => [
       'name' => 'Hugo Hurley',
       'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
    ]
];

2 answers

  • drhqkz3455 2016-05-13 17:50

    Tim Pietzcker gave the .NET counting version.
    The PCRE (PHP) version below is built from the same elements.

    All the caveats are the same. In particular, non-array parentheses must also
    be balanced, because they are closed by the same ')' delimiter as array( ... ).
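
The shared closing parenthesis is easier to see in a hand-written scanner. The following JavaScript sketch is not the regex approach from this answer: JavaScript regular expressions have no recursion, so it swaps in a stack that records, for every opener, whether it was array( or a plain (, and pops that record when a ) arrives. The function name convertArraySyntax is made up for illustration, and the sketch assumes that no parentheses occur inside strings or comments.

```javascript
// Hypothetical helper, not from the answer: rewrites array( ... ) as [ ... ]
// using a stack instead of a recursive regex.
function convertArraySyntax(source) {
  // The same three delimiters the regex in this answer looks for.
  const delimiter = /\barray\s*\(|[()]/gi;
  const stack = []; // true = opened by "array(", false = opened by plain "("
  let out = "";
  let last = 0;
  let m;
  while ((m = delimiter.exec(source)) !== null) {
    out += source.slice(last, m.index); // copy everything between delimiters as-is
    last = delimiter.lastIndex;
    const token = m[0];
    if (token === "(") {
      stack.push(false);
      out += "(";
    } else if (token === ")") {
      if (stack.length === 0) {
        throw new Error("Unbalanced ')' at position " + m.index);
      }
      // The opener decides how this closer is written.
      out += stack.pop() ? "]" : ")";
    } else {
      stack.push(true); // "array(" with optional whitespace before the "("
      out += "[";
    }
  }
  if (stack.length > 0) {
    throw new Error("Unclosed opening parenthesis in input");
  }
  return out + source.slice(last);
}
```

Applied to the input from the question, this should turn both array( ... ) pairs into [ ... ] while leaving nextId() and the arithmetic parentheses alone. A stray plain parenthesis makes it throw, because the stack can no longer tell which closer belongs to which opener.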

    The whole text has to be consumed (or at least should be).
    The outer capture groups 1, 2, 3, 4 give you the parts:
    CONTENT
    CORE-1 array()
    CORE-2 any ()
    EXCEPTIONS

    Each match yields exactly one of these outer groups; they are mutually exclusive.

    The trick is to define a PHP function parse(core) that parses a CORE.
    Inside that function is the loop: while (regex.search(core)) { .. }

    Each time either the CORE-1 or the CORE-2 group matches, call parse(core) again,
    passing it the contents of that group.

    Inside the loop, just take off the CONTENT (group 1) and assign it to the hash.

    Obviously, the group 1 construct, which calls (?&content), should be replaced
    with constructs that capture your hash-like variable data.
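
As a rough illustration of that recursion, here is a JavaScript sketch of a parse function that sorts the text into the same four categories. It is an approximation under stated assumptions: JavaScript has no (?&core) style recursion, so a depth counter stands in for the recursive subroutine. The names parseCore, findClose and newDelimiterRegex, as well as the node shapes, are invented for this example and do not come from the answer.

```javascript
// Hypothetical sketch: all names and node shapes are invented for illustration.
// A fresh regex per call keeps each lastIndex independent.
function newDelimiterRegex() {
  return /\barray\s*\(|[()]/gi;
}

// Stand-in for the recursive (?&core) call: given the index just after an
// opener, return the index of its matching ")" or -1 if there is none.
function findClose(text, from) {
  const re = newDelimiterRegex();
  re.lastIndex = from;
  let depth = 1;
  let m;
  while ((m = re.exec(text)) !== null) {
    depth += m[0] === ")" ? -1 : 1;
    if (depth === 0) return m.index;
  }
  return -1;
}

// Split `core` into content / array / group / error nodes, recursing into
// the body of every balanced array( ... ) or ( ... ) pair.
function parseCore(core) {
  const nodes = [];
  const re = newDelimiterRegex();
  let pos = 0;
  while (pos < core.length) {
    re.lastIndex = pos;
    const m = re.exec(core);
    const stop = m ? m.index : core.length;
    if (stop > pos) {
      // Group 1: CONTENT, everything up to the next delimiter.
      nodes.push({ type: "content", text: core.slice(pos, stop) });
      pos = stop;
      continue;
    }
    const bodyStart = re.lastIndex;
    const close = m[0] === ")" ? -1 : findClose(core, bodyStart);
    if (close === -1) {
      // Group 4: EXCEPTIONS, an unbalanced delimiter.
      nodes.push({ type: "error", text: m[0], pos: pos });
      pos = bodyStart;
    } else {
      // Group 2 (array core) or group 3 (plain parenthesis core).
      nodes.push({ type: m[0] === "(" ? "group" : "array",
                   children: parseCore(core.slice(bodyStart, close)) });
      pos = close + 1;
    }
  }
  return nodes;
}
```

Calling parseCore on the whole source yields a tree in which every array node holds the parsed contents of one array( ... ) pair. The content nodes are where the key/value extraction mentioned above would go.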

    On a detailed scale, this can be very tedious.
    Usually, you'd have to account for every single character to correctly
    parse the entire thing.
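
One concrete case of having to account for every character is a parenthesis inside a string literal, which must not count as a delimiter. The JavaScript sketch below shows one possible way to handle that, as an assumption on top of the answer rather than part of it: quoted strings are added as alternatives ahead of the delimiters so they are consumed whole. The function name structuralTokens is hypothetical, and comments, heredocs and other PHP syntax are deliberately not covered.

```javascript
// Hypothetical helper: lists the delimiters that actually affect nesting,
// skipping anything inside single- or double-quoted strings.
function structuralTokens(source) {
  // Quoted strings come first in the alternation so they win over [()].
  const token = /'(?:\\[\s\S]|[^'\\])*'|"(?:\\[\s\S]|[^"\\])*"|\barray\s*\(|[()]/gi;
  const found = [];
  let m;
  while ((m = token.exec(source)) !== null) {
    const first = m[0][0];
    if (first !== "'" && first !== '"') {
      found.push(m[0]); // a real delimiter, not string content
    }
  }
  return found;
}
```

For an input such as array('smile :)', f(1)), the closing parenthesis inside the string is swallowed by the string alternative, so only the four structural delimiters remain.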

    (?is)(?:((?&content))|(?>\barray\s*\()((?=.)(?&core)|)\)|\(((?=.)(?&core)|)\)|(\barray\s*\(|[()]))(?(DEFINE)(?<core>(?>(?&content)|(?>\barray\s*\()(?:(?=.)(?&core)|)\)|\((?:(?=.)(?&core)|)\))+)(?<content>(?>(?!\barray\s*\(|[()]).)+))
    

    Expanded (free-spacing layout; it needs the x modifier to run in this form)

     # 1:  CONTENT
     # 2:  CORE-1
     # 3:  CORE-2
     # 4:  EXCEPTIONS
    
     (?is)
    
     (?:
          (                                  # (1), Take off   CONTENT
               (?&content) 
          )
       |                                   # OR -----------------------------
          (?>                                # Start 'array('
               \b array \s* \(
          )
          (                                  # (2), Take off   'array( CORE-1 )'
               (?= . )
               (?&core) 
            |  
          )
          \)                                 # End ')'
       |                                   # OR -----------------------------
          \(                                 # Start '('
          (                                  # (3), Take off   '( any CORE-2 )'
               (?= . )
               (?&core) 
            |  
          )
          \)                                 # End ')'
       |                                   # OR -----------------------------
          (                                  # (4), Take off   Unbalanced or Exceptions
               \b array \s* \(
            |  [()] 
          )
     )
    
     # Subroutines
     # ---------------
    
     (?(DEFINE)
    
          # core
          (?<core>
               (?>
                    (?&content) 
                 |  
                    (?> \b array \s* \( )
                    # recurse core of  array()
                    (?:
                         (?= . )
                         (?&core) 
                      |  
                    )
                    \)
                 |  
                    \(
                    # recurse core of any  ()
                    (?:
                         (?= . )
                         (?&core) 
                      |  
                    )
                    \)
               )+
          )
    
          # content 
          (?<content>
               (?>
                    (?!
                         \b array \s* \(
                      |  [()] 
                    )
                    . 
               )+
          )
     )
    

    Output

     **  Grp 0           -  ( pos 0 , len 11 ) 
    some_var =   
     **  Grp 1           -  ( pos 0 , len 11 ) 
    some_var =   
     **  Grp 2           -  NULL 
     **  Grp 3           -  NULL 
     **  Grp 4 [core]    -  NULL 
     **  Grp 5 [content] -  NULL 
    
    -----------------------
    
     **  Grp 0           -  ( pos 11 , len 153 ) 
    array(
        'id' => nextId(),
        'profile' => array(
           'name' => 'Hugo Hurley',
           'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
        ) 
    )  
     **  Grp 1           -  NULL 
     **  Grp 2           -  ( pos 17 , len 146 ) 
    
        'id' => nextId(),
        'profile' => array(
           'name' => 'Hugo Hurley',
           'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
        ) 
    
     **  Grp 3           -  NULL 
     **  Grp 4 [core]    -  NULL 
     **  Grp 5 [content] -  NULL 
    
    -------------------------------------
    
     **  Grp 0           -  ( pos 164 , len 3 ) 
    ;
    
     **  Grp 1           -  ( pos 164 , len 3 ) 
    ;
    
     **  Grp 2           -  NULL 
     **  Grp 3           -  NULL 
     **  Grp 4 [core]    -  NULL 
     **  Grp 5 [content] -  NULL 
    

    An earlier Perl incarnation of the same technique, applied to a different problem
    (nested <!--block:name--> ... <!--endblock--> markers), to give an idea of usage:

     # Perl code:
     # 
     #     use strict;
     #     use warnings;
     #     
     #     use Data::Dumper;
     #     
     #     $/ = undef;
     #     my $content = <DATA>;
     #     
     #     # Set the error mode on/off here ..
     #     my $BailOnError = 1;
     #     my $IsError = 0;
     #     
     #     my $href = {};
     #     
     #     ParseCore( $href, $content );
     #     
     #     #print Dumper($href);
     #     
     #     print "\n\n";
     #     print "\nBase======================\n";
     #     print $href->{content};
     #     print "\nFirst======================\n";
     #     print $href->{first}->{content};
     #     print "\nSecond======================\n";
     #     print $href->{first}->{second}->{content};
     #     print "\nThird======================\n";
     #     print $href->{first}->{second}->{third}->{content};
     #     print "\nFourth======================\n";
     #     print $href->{first}->{second}->{third}->{fourth}->{content};
     #     print "\nFifth======================\n";
     #     print $href->{first}->{second}->{third}->{fourth}->{fifth}->{content};
     #     print "\nSix======================\n";
     #     print $href->{six}->{content};
     #     print "\nSeven======================\n";
     #     print $href->{six}->{seven}->{content};
     #     print "\nEight======================\n";
     #     print $href->{six}->{seven}->{eight}->{content};
     #     
     #     exit;
     #     
     #     
     #     sub ParseCore
     #     {
     #         my ($aref, $core) = @_;
     #         my ($k, $v);
     #         while ( $core =~ /(?is)(?:((?&content))|(?><!--block:(.*?)-->)((?&core)|)<!--endblock-->|(<!--(?:block:.*?|endblock)-->))(?(DEFINE)(?<core>(?>(?&content)|(?><!--block:.*?-->)(?:(?&core)|)<!--endblock-->)+)(?<content>(?>(?!<!--(?:block:.*?|endblock)-->).)+))/g )
     #         {
     #            if (defined $1)
     #            {
     #              # CONTENT
     #                $aref->{content} .= $1;
     #            }
     #            elsif (defined $2)
     #            {
     #              # CORE
     #                $k = $2; $v = $3;
     #                $aref->{$k} = {};
     #      #         $aref->{$k}->{content} = $v;
     #      #         $aref->{$k}->{match} = $&;
     #                
     #                my $curraref = $aref->{$k};
     #                my $ret = ParseCore($aref->{$k}, $v);
     #                if ( $BailOnError && $IsError ) {
     #                    last;
     #                }
     #                if (defined $ret) {
     #                    $curraref->{'#next'} = $ret;
     #                }
     #            }
     #            else
     #            {
     #              # ERRORS
     #                print "Unbalanced '$4' at position = ", $-[0];
     #                $IsError = 1;
     #     
     #                # Decide to continue here ..
     #                # If BailOnError is set, just unwind recursion. 
     #                # -------------------------------------------------
     #                if ( $BailOnError ) {
     #                   last;
     #                }
     #            }
     #         }
     #         return $k;
     #     }
     #     
     #     #================================================
     #     __DATA__
     #     some html content here top base
     #     <!--block:first-->
     #         <table border="1" style="color:red;">
     #         <tr class="lines">
     #             <td align="left" valign="<--valign-->">
     #         <b>bold</b><a href="http://www.mewsoft.com">mewsoft</a>
     #         <!--hello--> <--again--><!--world-->
     #         some html content here 1 top
     #         <!--block:second-->
     #             some html content here 2 top
     #             <!--block:third-->
     #                 some html content here 3 top
     #                 <!--block:fourth-->
     #                     some html content here 4 top
     #                     <!--block:fifth-->
     #                         some html content here 5a
     #                         some html content here 5b
     #                     <!--endblock-->
     #                 <!--endblock-->
     #                 some html content here 3a
     #                 some html content here 3b
     #             <!--endblock-->
     #             some html content here 2 bottom
     #         <!--endblock-->
     #         some html content here 1 bottom
     #     <!--endblock-->
     #     some html content here1-5 bottom base
     #     
     #     some html content here 6-8 top base
     #     <!--block:six-->
     #         some html content here 6 top
     #         <!--block:seven-->
     #             some html content here 7 top
     #             <!--block:eight-->
     #                 some html content here 8a
     #                 some html content here 8b
     #             <!--endblock-->
     #             some html content here 7 bottom
     #         <!--endblock-->
     #         some html content here 6 bottom
     #     <!--endblock-->
     #     some html content here 6-8 bottom base
     # 
     # Output >>
     # 
     #     Base======================
     #     some html content here top base
     #     
     #     some html content here1-5 bottom base
     #     
     #     some html content here 6-8 top base
     #     
     #     some html content here 6-8 bottom base
     #     
     #     First======================
     #     
     #         <table border="1" style="color:red;">
     #         <tr class="lines">
     #             <td align="left" valign="<--valign-->">
     #         <b>bold</b><a href="http://www.mewsoft.com">mewsoft</a>
     #         <!--hello--> <--again--><!--world-->
     #         some html content here 1 top
     #         
     #         some html content here 1 bottom
     #     
     #     Second======================
     #     
     #             some html content here 2 top
     #             
     #             some html content here 2 bottom
     #         
     #     Third======================
     #     
     #                 some html content here 3 top
     #                 
     #                 some html content here 3a
     #                 some html content here 3b
     #             
     #     Fourth======================
     #     
     #                     some html content here 4 top
     #                     
     #                 
     #     Fifth======================
     #     
     #                         some html content here 5a
     #                         some html content here 5b
     #                     
     #     Six======================
     #     
     #         some html content here 6 top
     #         
     #         some html content here 6 bottom
     #     
     #     Seven======================
     #     
     #             some html content here 7 top
     #             
     #             some html content here 7 bottom
     #         
     #     Eight======================
     #     
     #                 some html content here 8a
     #                 some html content here 8b
     #         
    
    The asker selected this answer as the best answer.