dongyinzhi4689 2009-12-11 22:52

Parsing flat-file database information into a multidimensional array

I want to write a class that parses flat-file database information into one large, analogous multidimensional array. I had the idea of formatting the database in a Python-esque style, as follows:

"tree #1":
    "key" "value"
    "sub-tree #1":
        "key" "value"
        "key #2" "value"
        "key #3" "value"

I am trying to parse this and build an array as I go to throw the keys/values into, and I want it to be very dynamic and expandable. I've tried many different techniques and have been stumped in each attempt. This is my most recent:

function parse($file=null) {
    $file = $file ? $file : $this->dbfile;

    ### character variables

    # get values of 
    $src = file_get_contents($file);
    # current character number
    $p = 0;

    ### array variables

    # temp shit
    $a = array();
    # set $ln keys
    $ln = array("q"=>0,"k"=>null,"v"=>null,"s"=>null,"p"=>null);
    # indent level
    $ilvl = 0;

    ### go time

    while (strlen($src) > $p) {
        $chr = $src[$p];
        # quote
        if ($chr == "\"") {
            if ($ln["q"] == 1) { // quote open?
                $ln["q"] = 0; // close it
                if (!$ln["k"]) { // key yet?
                    $ln["k"] = $ln["s"]; // set key
                    $ln["s"] = null;
                    $a[$ln["k"]] = $ln["v"]; // write to current array
                } else { // value time
                    $ln["v"] = $ln["s"]; // set value
                    $ln["s"] = null;
                }
            } else {
                $ln["q"] = 1; // open quote
            }
        }

        # newline outside of quotes
        elseif ($chr == "\n" && $ln["q"] == 0) {
            $ln = array("q"=>0,"k"=>null,"v"=>null,"s"=>null,"p"=>null);
            $llvl = $ilvl;

        }
        # beginning of subset
        elseif ($chr == ":" && $ln["q"] == 0) {
            $ilvl++;
            if (!array_key_exists($ilvl,$a)) { $a[$ilvl] = array(); }
            $a[$ilvl][$ln["k"]] = array("@mbdb-parent"=> $ilvl-1 .":".$ln["k"]);
            $ln = array("q"=>0,"k"=>null,"v"=>null,"s"=>null,"p"=>null);
            $this->debug("INDENT++",$ilvl);
        }
        # end of subset
        elseif ($chr == "}") {
            $ilvl--;
            $this->debug("INDENT--",$ilvl);
        }
        # other characters
        else {
            if ($ln["q"] == 1) {
                $ln["s"] .= $chr;
            } else {
                # error
            }
        }
        $p++;
    }
    var_dump($a);
}

I honestly have no idea where to go from here. The thing troubling me most is setting the multidimensional values like $this->c["main"]["sub"]["etc"] the way I have it here. Can it even be done? How can I actually nest the arrays as the data is nested in the db file?


  • douyin8813 2009-12-11 23:35

    Well, you could use serialize and unserialize, but that would be no fun, right? You should really be using a format specifically designed for this purpose, but for the sake of exercise, I'll try and see what I can come up with.

    There seem to be two kinds of data in your flat file: key-value pairs and arrays. A key-value pair is denoted by two quoted strings on one line, and an array by one quoted string followed by a colon. As you go through the file, you must parse each row and determine which of the two it represents. That's easy with regular expressions. The hard part is keeping track of the nesting level we're at and acting accordingly. Here's a function that parses the tree you provided:

    function parse_flatfile($filename) {
        $file = file($filename);
    
        $result = array();
        $open = false;
        foreach($file as $row) {
            $level = strlen($row) - strlen(ltrim($row));
            $row = rtrim($row);
            // Regular expression to catch key-value pairs
            $isKeyValue = preg_match('/"(.*?)" "(.*?)"$/', $row, $match);        
            if($isKeyValue == 1) {
                if($open && $open['level'] < $level) {
                    $open['item'][$match[1]] = $match[2];
                } else {
                    $open = array('level' => $level - 1, 'item' => &$open['parent']);                
                    if($open) {
                        $open['item'][$match[1]] = $match[2];
                    } else {
                        $result[$match[1]] = $match[2];
                    }
                }
            // Regular expression to catch arrays
            } elseif(($isArray = preg_match('/"(.*?)":$/', $row, $match)) > 0) {
                if($open && $open['level'] < $level) {
                    $open['item'][$match[1]] = array();
                    $open = array('level' => $level, 'item' => &$open['item'][$match[1]], 'parent' => &$open['item']);
                } else {
                    $result[$match[1]] = array();
                    $open = array('level' => $level, 'item' => &$result[$match[1]], 'parent' => false);
                }
            }    
        }    
        return $result;
    }
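
On the question of whether values like ["main"]["sub"]["etc"] can be set dynamically at all: yes, and PHP references are what make it possible. The sketch below shows only that mechanism in isolation. set_nested() is a hypothetical helper made up for illustration, not part of the function above.

```php
<?php
// Hypothetical helper (an assumption for illustration): walk a list of keys
// by reference, creating intermediate arrays as needed, and assign the value
// at the end of the path.
function set_nested(array &$root, array $path, $value) {
    $node = &$root;
    foreach ($path as $key) {
        if (!isset($node[$key]) || !is_array($node[$key])) {
            $node[$key] = array();
        }
        // Re-point $node one level deeper without copying the array.
        $node = &$node[$key];
    }
    $node = $value;
}

$c = array();
set_nested($c, array('main', 'sub', 'etc'), 'value');
// $c is now array('main' => array('sub' => array('etc' => 'value')))
```

The same idea, a variable that always points at the array currently being filled, is what $open['item'] does in the parser.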
    

    I won't go into greater detail on how that works, but in short: as we progress deeper into the array, the array currently being filled and its parent are kept as references in $open. Here's a more complex tree using your notation:

    "tree_1":
        "key" "value"
        "sub_tree_1":
            "key" "value"
            "key_2" "value"
            "key_3" "value"
        "key_4" "value"
        "key_5" "value"
    "tree_2":
        "key_6" "value"
        "sub_tree_2":
            "sub_tree_3":
                "sub_tree_4":
                    "key_6" "value"
                    "key_7" "value"
                    "key_8" "value"
                    "key_9" "value"
                    "key_10" "value"
    

    And to parse that file you could use:

    $result = parse_flatfile('flat.txt');
    print_r($result);
    

    And that would output:

    Array
    (
    [tree_1] => Array
        (
        [key] => value
        [sub_tree_1] => Array
            (
            [key] => value
            [key_2] => value
            [key_3] => value
            )    
        [key_4] => value
        [key_5] => value
        )    
    [tree_2] => Array
        (
        [key_6] => value
        [sub_tree_2] => Array
            (
            [sub_tree_3] => Array
                (
                [sub_tree_4] => Array
                    (
                    [key_6] => value
                    [key_7] => value
                    [key_8] => value
                    [key_9] => value
                    [key_10] => value
                    )    
                )    
            )    
        )    
    )
    

    I guess my test file covers all the bases, and it should work without breaking. But I won't give any guarantees.
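
Two cases the sample tree does not seem to exercise are a sub-tree that follows a deeper sibling, and a line that dedents by more than one level at once. As far as I can tell from reading it, the function above remembers only one parent, so those cases may not nest correctly. One possible way around that is to keep an explicit stack of references, one entry per open array. The following is a minimal sketch under that assumption. parse_flat_string() and its stack layout are made up for illustration, it takes a string rather than a filename so it can be tried without a file, and it does not handle quotes inside keys or values.

```php
<?php
// Sketch only: parse_flat_string() is a hypothetical name, not the function
// from the answer above.
function parse_flat_string($text) {
    $result = array();
    // Each stack entry pairs an indent level with a reference to the array
    // that more deeply indented lines should be written into.
    $stack = array(array('level' => -1, 'node' => &$result));

    foreach (preg_split('/\r\n|\r|\n/', $text) as $row) {
        if (trim($row) === '') {
            continue; // skip blank lines
        }
        $level = strlen($row) - strlen(ltrim($row, " \t"));
        $row = trim($row);

        // Dedent: drop every open array that is not shallower than this line.
        while (count($stack) > 1 && $stack[count($stack) - 1]['level'] >= $level) {
            array_pop($stack);
        }
        $top = count($stack) - 1;

        if (preg_match('/^"(.*?)" "(.*?)"$/', $row, $m)) {
            // Key-value pair: write into the innermost open array.
            $stack[$top]['node'][$m[1]] = $m[2];
        } elseif (preg_match('/^"(.*?)":$/', $row, $m)) {
            // Array header: create the array and make it the innermost one.
            $stack[$top]['node'][$m[1]] = array();
            $stack[] = array('level' => $level, 'node' => &$stack[$top]['node'][$m[1]]);
        }
    }
    return $result;
}

$sample = "\"a\":\n    \"k\" \"v\"\n    \"b\":\n        \"x\" \"1\"\n    \"c\":\n        \"y\" \"2\"\n\"top\" \"z\"\n";
$parsed = parse_flat_string($sample);
// "c" ends up next to "b" inside "a", and "top" ends up at the root.
```

Because the stack is popped until its top entry is shallower than the current line, any number of levels can be closed by a single line.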

    Transforming a multidimensional array to flatfile using this notation will be left as an exercise to the reader :)
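
For anyone who wants a starting point for that exercise, here is a rough recursive sketch. to_flatfile() is a hypothetical name, the four-space indent per level is an assumption taken from the sample tree, and quotes inside keys or values are not escaped.

```php
<?php
// Sketch only: to_flatfile() is made up for illustration.
function to_flatfile(array $data, $depth = 0) {
    $indent = str_repeat('    ', $depth);
    $out = '';
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            // Nested array: write a "key": header, then recurse one level deeper.
            $out .= $indent . '"' . $key . '":' . "\n";
            $out .= to_flatfile($value, $depth + 1);
        } else {
            // Scalar: write a "key" "value" pair on one line.
            $out .= $indent . '"' . $key . '" "' . $value . '"' . "\n";
        }
    }
    return $out;
}

echo to_flatfile(array(
    'tree_1' => array(
        'key' => 'value',
        'sub_tree_1' => array('key_2' => 'value'),
    ),
));
```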
