drs7798 2011-04-12 22:43

How can I (e.g. with a bash script) convert PHP array indexes that currently use bare constants into single-quoted string indexes?

I have a huge pile of php scripts with lots of constants being used in place of proper single-quoted array strings.

For example:

$row_rsCatalogsItems[Name]

(bad)

instead of

$row_rsCatalogsItems['Name']

(good)

How would I create a script (bash, php, whatever is most usable) that I can run on scripts to convert them to the more sensible method?

Ideally it wouldn't just match the [something], but also the $variable_name[someIndex].

I'm actually wondering whether it's even viable considering the potential to screw up the interiors of strings or html... (maybe if I just use single quotes, it won't matter because they're interpolated anyway...)

3 answers

  • dongyan5815 2011-04-12 23:06

    This sounds like a job for the Tokenizer!

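    You can fetch all of the parsed tokens from a PHP source file using token_get_all. You can then go through the resulting array, evaluating each token one at a time. Multi-character tokens come back as arrays whose first element is an integer token ID, which you can turn into a readable name using token_name; single-character tokens such as [ and ; come back as plain strings.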

    A small demo at the PHP interactive prompt:

    php > $str = '<?php echo $face[fire]; echo $face[\'fire\']; ?>';
    php > $t = token_get_all($str);
    php > foreach($t as $i => $j) { if(is_array($j)) $t[$i][0] = token_name($j[0]); }
    

    And here's the output in a different code block, as it's a bit tall and it'll be good to reference the source string while scrolling through it.

    php > print_r($t);
    Array
    (
        [0] => Array
            (
                [0] => T_OPEN_TAG
                [1] => <?php
                [2] => 1
            )
    
        [1] => Array
            (
                [0] => T_ECHO
                [1] => echo
                [2] => 1
            )
    
        [2] => Array
            (
                [0] => T_WHITESPACE
                [1] =>
                [2] => 1
            )
    
        [3] => Array
            (
                [0] => T_VARIABLE
                [1] => $face
                [2] => 1
            )
    
        [4] => [
        [5] => Array
            (
                [0] => T_STRING
                [1] => fire
                [2] => 1
            )
    
        [6] => ]
        [7] => ;
        [8] => Array
            (
                [0] => T_WHITESPACE
                [1] =>
                [2] => 1
            )
    
        [9] => Array
            (
                [0] => T_ECHO
                [1] => echo
                [2] => 1
            )
    
        [10] => Array
            (
                [0] => T_WHITESPACE
                [1] =>
                [2] => 1
            )
    
        [11] => Array
            (
                [0] => T_VARIABLE
                [1] => $face
                [2] => 1
            )
    
        [12] => [
        [13] => Array
            (
                [0] => T_CONSTANT_ENCAPSED_STRING
                [1] => 'fire'
                [2] => 1
            )
    
        [14] => ]
        [15] => ;
        [16] => Array
            (
                [0] => T_WHITESPACE
                [1] =>
                [2] => 1
            )
    
        [17] => Array
            (
                [0] => T_CLOSE_TAG
                [1] => ?>
                [2] => 1
            )
    
    )
    

    As you can see, our evil array indexes are a T_VARIABLE followed by an open bracket, then a T_STRING that is not quoted. Single-quoted indexes come through as T_CONSTANT_ENCAPSED_STRING, quotes and all.

    With this knowledge in hand, you can go through the list of tokens and actually rewrite the source to eliminate all of the unquoted array indexes -- most of them should be pretty obvious. You can simply add single quotes around the string when you write the file back out.

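    Just keep in mind that you'll want to not quote any numeric indexes, as that can have undesirable side-effects (for example, the octal literal 010 is the key 8, while '010' is a different string key). Conveniently, integer literals are tokenized as T_LNUMBER rather than T_STRING, so matching only T_STRING leaves them alone.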
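
    To make the rewrite step concrete, here is a minimal sketch of how it could look. The function name quote_bare_indexes and its $realConstants parameter are my own inventions for illustration, not part of the Tokenizer. It assumes you only want to touch the exact pattern T_VARIABLE, [, T_STRING, ] with no whitespace inside the brackets, and it deliberately skips anything inside double-quoted strings, backticks and heredocs, because "$face[fire]" is already legal there and adding quotes would break it.

```php
<?php
// Hypothetical helper: quote bare-word indexes such as $row[Name].
// $realConstants lists names that really are constants and must be kept.
function quote_bare_indexes(string $source, array $realConstants = []): string
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $out = '';
    $inInterpolated = false; // between the quotes of "..." or `...` with variables
    $inHeredoc = false;      // between heredoc/nowdoc start and end markers

    for ($i = 0; $i < $count; $i++) {
        $tok = $tokens[$i];
        $text = is_array($tok) ? $tok[1] : $tok;

        // Interpolated strings are delimited by bare '"' or '`' tokens.
        if ($tok === '"' || $tok === '`') {
            $inInterpolated = !$inInterpolated;
        } elseif (is_array($tok) && $tok[0] === T_START_HEREDOC) {
            $inHeredoc = true;
        } elseif (is_array($tok) && $tok[0] === T_END_HEREDOC) {
            $inHeredoc = false;
        }

        // Match exactly: T_VARIABLE '[' T_STRING ']'
        $isBareIndex = !$inInterpolated && !$inHeredoc
            && is_array($tok) && $tok[0] === T_STRING
            && $i >= 2 && $i + 1 < $count
            && $tokens[$i - 1] === '['
            && is_array($tokens[$i - 2]) && $tokens[$i - 2][0] === T_VARIABLE
            && $tokens[$i + 1] === ']'
            // true/false/null are T_STRING too, but quoting them changes the key.
            && !in_array(strtolower($text), ['true', 'false', 'null'], true)
            && !in_array($text, $realConstants, true);

        $out .= $isBareIndex ? "'" . $text . "'" : $text;
    }

    return $out;
}
```

    Since concatenating the text of every token reproduces the original source, everything that does not match is written back unchanged. You would run it over a file with something like file_put_contents($path, quote_bare_indexes(file_get_contents($path))), ideally on a copy or under version control so you can diff the result before trusting it.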

    Also keep in mind that expressions are legal inside of indexes:

    $pathological[ some_function('Oh gods', 'why me!?') . '4500' ] = 'Teh bad.';
    

    You'll have a teeny tiny, slightly harder time dealing with these with an automated tool. By which I mean trying to handle them may cause you to fly into a murderous rage. I suggest only trying to fix the constant/string problem now. If done correctly, you should be able to get the Notice count down to a more manageable level.
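
    If you'd rather know where those harder cases live than fix them automatically, one option is to have the tokenizer list them for manual review. This is only a sketch: find_complex_indexes is a made-up name, and it assumes that "complex" simply means more than one non-whitespace token between the brackets that follow a variable.

```php
<?php
// Hypothetical helper: return the line numbers of $var[...] accesses whose
// index is more than a single token, so a human can look at them.
function find_complex_indexes(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $lines = [];

    for ($i = 0; $i + 1 < $count; $i++) {
        $isVarIndex = is_array($tokens[$i])
            && $tokens[$i][0] === T_VARIABLE
            && $tokens[$i + 1] === '[';
        if (!$isVarIndex) {
            continue;
        }

        // Walk to the matching ']' and count the non-whitespace tokens inside.
        $depth = 0;
        $inner = 0;
        for ($j = $i + 1; $j < $count; $j++) {
            $t = $tokens[$j];
            if ($t === '[') {
                $depth++;
                if ($depth === 1) {
                    continue; // the opening bracket itself is not counted
                }
            } elseif ($t === ']') {
                $depth--;
                if ($depth === 0) {
                    break;    // reached the bracket that closes this index
                }
            }
            if (!(is_array($t) && $t[0] === T_WHITESPACE)) {
                $inner++;
            }
        }

        if ($inner > 1) {
            $lines[] = $tokens[$i][2]; // element 2 of a token array is its line
        }
    }

    return $lines;
}
```

    Simple one-token indexes such as $a[b], $a['b'] and $a[0] are not reported; anything with a function call, concatenation or nested lookup is.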

    (Also note that the Tokenizer treats the curly string syntax as an actual token, T_CURLY_OPEN, which should make those pesky inlined array indexes easier to deal with. The full list of parser tokens is in the PHP manual, in case you missed it.)
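
    To see the difference between the two in-string forms for yourself, a small sketch like this dumps the token names for each. The token_names helper is hypothetical, just a wrapper around token_get_all and token_name.

```php
<?php
// Hypothetical helper: map a source string to a flat list of token names.
// Single-character tokens are returned as the character itself.
function token_names(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $tok) {
        $names[] = is_array($tok) ? token_name($tok[0]) : $tok;
    }
    return $names;
}

// Simple interpolation: the bare key is legal here and means the string 'fire'.
$simple = token_names('<?php echo "$face[fire]";');

// Curly syntax: the braces wrap a normal PHP expression, so the key is quoted.
$curly = token_names('<?php echo "{$face[\'fire\']}";');

print_r($simple);
print_r($curly);
```

    The curly form should show up with a T_CURLY_OPEN token and the simple form without one, which gives a rewriting tool a way to tell the two apart. A bare key inside the simple form needs no fixing, while a bare key inside the curly form is a constant lookup like anywhere else.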

    This answer was selected by the asker as the best answer.