drs7798 2011-04-12 22:43

How can I (for example, with a bash script) convert PHP array indexes that currently use bare constants into single-quoted string indexes?

I have a huge pile of php scripts with lots of constants being used in place of proper single-quoted array strings.

For example:

$row_rsCatalogsItems[Name]

(bad)

instead of

$row_rsCatalogsItems['Name']

(good)

How would I create a script (bash, php, whatever is most usable) that I can run on scripts to convert them to the more sensible method?

Ideally it wouldn't just match the [something], but also the $variable_name[someIndex].

I'm actually wondering whether it's even viable considering the potential to screw up the interiors of strings or html... (maybe if I just use single quotes, it won't matter because they're interpolated anyway...)

  • dongyan5815 2011-04-12 23:06

    This sounds like a job for the Tokenizer!

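    You can fetch all of the parsed tokens from a PHP source file using token_get_all. You can then go through the resulting array, evaluating each token one at a time. Most tokens come back as an array whose first element is an integer token ID, which you can turn into a readable name using token_name. Single-character tokens such as [ and ; come back as plain strings.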

    A small demo at the PHP interactive prompt:

    php > $str = '<?php echo $face[fire]; echo $face[\'fire\']; ?>';
    php > $t = token_get_all($str);
    php > foreach($t as $i => $j) { if(is_array($j)) $t[$i][0] = token_name($j[0]); }
    

    And here's the output in a different code block, as it's a bit tall and it'll be good to reference the source string while scrolling through it.

    php > print_r($t);
    Array
    (
        [0] => Array
            (
                [0] => T_OPEN_TAG
                [1] => <?php
                [2] => 1
            )
    
        [1] => Array
            (
                [0] => T_ECHO
                [1] => echo
                [2] => 1
            )
    
        [2] => Array
            (
                [0] => T_WHITESPACE
                [1] =>
                [2] => 1
            )
    
        [3] => Array
            (
                [0] => T_VARIABLE
                [1] => $face
                [2] => 1
            )
    
        [4] => [
        [5] => Array
            (
                [0] => T_STRING
                [1] => fire
                [2] => 1
            )
    
        [6] => ]
        [7] => ;
        [8] => Array
            (
                [0] => T_WHITESPACE
                [1] =>
                [2] => 1
            )
    
        [9] => Array
            (
                [0] => T_ECHO
                [1] => echo
                [2] => 1
            )
    
        [10] => Array
            (
                [0] => T_WHITESPACE
                [1] =>
                [2] => 1
            )
    
        [11] => Array
            (
                [0] => T_VARIABLE
                [1] => $face
                [2] => 1
            )
    
        [12] => [
        [13] => Array
            (
                [0] => T_CONSTANT_ENCAPSED_STRING
                [1] => 'fire'
                [2] => 1
            )
    
        [14] => ]
        [15] => ;
        [16] => Array
            (
                [0] => T_WHITESPACE
                [1] =>
                [2] => 1
            )
    
        [17] => Array
            (
                [0] => T_CLOSE_TAG
                [1] => ?>
                [2] => 1
            )
    
    )
    

    As you can see, our evil array indexes are a T_VARIABLE followed by an open bracket, then a T_STRING that is not quoted. Single-quoted indexes come through as T_CONSTANT_ENCAPSED_STRING, quotes and all.

    With this knowledge in hand, you can go through the list of tokens and actually rewrite the source to eliminate all of the unquoted array indexes -- most of them should be pretty obvious. You can simply add single quotes around the string when you write the file back out.

    Just keep in mind that you'll want to not quote any numeric indexes, as that will surely have undesirable side-effects.

    Also keep in mind that expressions are legal inside of indexes:

    $pathological[ some_function('Oh gods', 'why me!?') . '4500' ] = 'Teh bad.';
    

    You'll have a teeny tiny, slightly harder time dealing with these with an automated tool. By which I mean trying to handle them may cause you to fly into a murderous rage. I suggest only trying to fix the constant/string problem now. If done correctly, you should be able to get the Notice count down to a more manageable level.
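
    A minimal sketch of that rewrite pass, under some stated assumptions. The function name quoteBareIndexes and its $keepConstants parameter are made up for illustration; only token_get_all and the T_* constants come from PHP itself. It matches a bare T_STRING sitting directly between [ and ] after a variable or a previous index, leaves true/false/null and any constants you list alone, and conservatively skips everything inside interpolated double-quoted strings and heredocs (where "$row[Name]" is already legal). Whitespace inside the brackets, backtick strings, and nested quotes inside {$...} are not handled.

```php
<?php
// Sketch only: quoteBareIndexes() and $keepConstants are hypothetical names,
// not part of PHP. Run it on copies or under version control.
function quoteBareIndexes(string $source, array $keepConstants = []): string
{
    $tokens   = token_get_all($source);
    $count    = count($tokens);
    $out      = '';
    $inString = false; // inside an interpolated "..." string or a heredoc

    for ($i = 0; $i < $count; $i++) {
        $tok = $tokens[$i];

        // "$a[b]" is valid inside interpolated strings and must stay unquoted.
        if ($tok === '"') {
            $inString = !$inString;
        } elseif (is_array($tok) && $tok[0] === T_START_HEREDOC) {
            $inString = true;
        } elseif (is_array($tok) && $tok[0] === T_END_HEREDOC) {
            $inString = false;
        }

        if (!$inString && $tok === '[' && $i > 0 && $i + 2 < $count) {
            $prev  = $tokens[$i - 1];
            $index = $tokens[$i + 1];

            // Only touch "[" that follows a variable or a previous "]" (chained index).
            $afterVarOrIndex = is_array($prev) ? $prev[0] === T_VARIABLE : $prev === ']';

            if ($afterVarOrIndex
                && is_array($index) && $index[0] === T_STRING
                && $tokens[$i + 2] === ']'
                && !in_array(strtolower($index[1]), ['true', 'false', 'null'], true)
                && !in_array($index[1], $keepConstants, true)
            ) {
                $out .= "['" . $index[1] . "']";
                $i += 2; // consumed the index token and the closing bracket
                continue;
            }
        }

        // Everything else is copied through unchanged.
        $out .= is_array($tok) ? $tok[1] : $tok;
    }

    return $out;
}
```

    Numeric indexes come through as T_LNUMBER rather than T_STRING, so this pattern never touches them. Applying it to a file would be something like file_put_contents($path, quoteBareIndexes(file_get_contents($path))); pass the names of any constants you really do use as indexes in $keepConstants so they stay bare.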

    (Also note that the Tokenizer deals with the curly string syntax as an actual token, T_CURLY_OPEN -- this should make those pesky inlined array indexes easier to deal with. Here's the list of all tokens once again, just in case you missed it.)
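
    To see the curly syntax for yourself, a small inspection helper along these lines should do. tokenNames is a hypothetical name used only for this illustration; it maps every token to its name, or to the literal character for single-character tokens.

```php
<?php
// tokenNames() is a hypothetical helper for inspection only.
function tokenNames(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $tok) {
        // Array tokens carry an integer ID; plain strings are single characters.
        $names[] = is_array($tok) ? token_name($tok[0]) : $tok;
    }
    return $names;
}

// The opening "{$" of the curly syntax shows up as its own T_CURLY_OPEN token.
print_r(tokenNames('<?php echo "{$face[\'fire\']}";'));
```

    Inside the curly braces the index is tokenized like ordinary code, so a quoted index there appears as T_CONSTANT_ENCAPSED_STRING just as it does outside a string.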

    (The asker accepted this as the best answer.)