dowb58485 2016-02-23 15:31
Views: 36
Accepted

PHP: read a txt file with a pattern and keep the information

Sorry for the confusing title, but I can't think of another one.

I have a text file in this format (just a few lines, taken out of context):

# Google_Product_Taxonomy_Version: 2015-02-19
1 - Animals & Pet Supplies
3237 - Animals & Pet Supplies > Live Animals
2 - Animals & Pet Supplies > Pet Supplies
3 - Animals & Pet Supplies > Pet Supplies > Bird Supplies
7385 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories
499954 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Bird Baths
7386 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Food & Water Dishes
4989 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cages & Stands
4990 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Food

So far, so good. I want to write a parser that collects all the information for each category. Once parsing is done, the data has to be written to a MySQL database.

Each line contains exactly:

1 unique ID
1 main category
n subcategories

The tricky part (for me) is how to hold this information in an array, with performance in mind.

My DB table must end up looking like this:

ID    | parent | title
1     |        | Animals & Pet Supplies
3237  |   1    | Live Animals
2     |   1    | Pet Supplies
3     |   2    | Bird Supplies

In fact, I must be able to reproduce this "crumb" purely from my DB entries.
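
To make the requirement concrete, here is a minimal sketch of how a crumb could be rebuilt from flat rows by walking up the parent chain. The function name `buildCrumb` and the in-memory `$rows` array (keyed by ID) are hypothetical stand-ins for a real table lookup, and the sketch assumes the parent chain contains no cycles. The IDs are taken from the sample file lines above.

```php
<?php
// Hypothetical helper: rebuilds the breadcrumb for one ID from flat rows.
// $rows maps id => ['parent' => int|null, 'title' => string].
function buildCrumb(array $rows, int $id): string
{
    $parts = [];
    $current = $id;
    // Walk up the parent chain until a root row (parent === null) is reached.
    while ($current !== null) {
        if (!isset($rows[$current])) {
            throw new InvalidArgumentException("Unknown category ID: $current");
        }
        // Prepend, so the root ends up first.
        array_unshift($parts, $rows[$current]['title']);
        $current = $rows[$current]['parent'];
    }
    return implode(' > ', $parts);
}

// Sample rows in the same shape as the desired table (assumed data layout).
$rows = [
    1    => ['parent' => null, 'title' => 'Animals & Pet Supplies'],
    3237 => ['parent' => 1,    'title' => 'Live Animals'],
    2    => ['parent' => 1,    'title' => 'Pet Supplies'],
    3    => ['parent' => 2,    'title' => 'Bird Supplies'],
];

echo buildCrumb($rows, 3), PHP_EOL;
// Animals & Pet Supplies > Pet Supplies > Bird Supplies
```

Because every row stores only its own title and its parent ID, the full path never has to be stored; it can be derived whenever it is needed.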

I started with my parser like this:

public function enrichTaxonomy()
{
    $aOutput = array();

    // $handle is assumed to be an already opened file handle (the fopen() call is not shown)
    // ignore first line
    fgets($handle);

    // iterate through it
    while (($line = fgets($handle)) !== false)
    {
        $splitted = explode("-", $line);

        // build first level
        if (strpos($splitted[1], '>') === false)
        {
            $aOutput['id'][] = trim($splitted[0]);
            $aOutput['title'][] = trim($splitted[1]);
        } else
        {
            // recursive?
            if (substr_count($splitted[1], " > ") == 1)
            {
                $splitted2ndLevel = explode(" > ", $splitted[1]);
                $aOutput['id'][] = trim($splitted[0]);
                $aOutput['title'][] = trim($splitted2ndLevel[1]);
            }
        }
    }

    echo "<pre>";
    var_dump($aOutput);
    echo "</pre>";
}

But I realized that this isn't a very good approach, since my next step would have been:

if (substr_count($splitted[1], " > ") == 2)
{
    $splitted3rdLevel = explode(" > ", $splitted[1]);
    $aOutput['id'][] = trim($splitted[0]);
    $aOutput['title'][] = trim($splitted3rdLevel[2]);
}

if (substr_count($splitted[1], " > ") == 3)
{
    $splitted4thLevel = explode(" > ", $splitted[1]);
    $aOutput['id'][] = trim($splitted[0]);
    $aOutput['title'][] = trim($splitted4thLevel[3]);
}

Also, this seems to get very complicated later, when I try to build a final array that I can then iterate through to insert the data into my DB.

An important note: each subcategory has to know its parent, so that I can insert the parent ID as well.

My question now: what is a good, relatively short, performant way to achieve this?

2 answers

  • doulangxun7769 2016-02-23 16:42

    There is no need to build a tree structure if you will have to flatten it again to insert it into the database. Instead, create the same structure as the DB table:

    $input = <<<'EOD'
    1 - Animals & Pet Supplies
    3237 - Animals & Pet Supplies > Live Animals
    2 - Animals & Pet Supplies > Pet Supplies
    3 - Animals & Pet Supplies > Pet Supplies > Bird Supplies
    7385 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories
    499954 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Bird Baths
    7386 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Food & Water Dishes
    4989 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cages & Stands
    4990 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Food
    EOD;
    
    $dbInput=[];
    
    $lines = explode("\n", $input);
    // or, for a file: $lines = file('file.path', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    
    foreach($lines as $line){
        if(substr($line, 0, 1) == '#') continue;
    
        // split on the first " - " only, so hyphens inside a title are kept
        list($id, $crumb) = explode(' - ', $line, 2);
        $id = trim($id);
        $crumb_parts = array_map('trim',explode('>', $crumb));
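        $title = array_pop($crumb_parts);
        // key each row by its full path: a title alone may repeat under different parents
        $parent_path = implode(' > ', $crumb_parts);
        $path = $parent_path === '' ? $title : $parent_path . ' > ' . $title;
        $parent_id = isset($dbInput[$parent_path]) ? $dbInput[$parent_path][':id'] : null;

        $dbInput[$path] = [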
            ':id'       =>  $id,
            ':parent'   =>  $parent_id,
            ':title'    =>  $title,
        ];
    }
    $pdo = new PDO('mysql:host=localhost;dbname=dbname','usr','pass');
    
    $sth = $pdo->prepare("INSERT INTO tree (id, parent, title) VALUES (:id, :parent, :title)");
    foreach ($dbInput as $row) {
        $sth->execute($row);
    }
    echo 'done';
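
Since the question asks about performance, one option worth considering is to run all inserts inside a single transaction, which usually avoids a commit per row. The following is only a sketch under assumptions: the helper name `insertRows` is hypothetical, the `tree` table is the one from the code above, and the table is assumed to use a transactional storage engine such as InnoDB (on MyISAM the transaction calls have no effect).

```php
<?php
// Hypothetical helper: inserts all rows inside one transaction.
// Each row is an array with the keys ':id', ':parent' and ':title',
// i.e. the same shape as the entries of $dbInput above.
function insertRows(PDO $pdo, array $rows): int
{
    // Make PDO throw on errors so a failed insert triggers the rollback.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $sth = $pdo->prepare(
        'INSERT INTO tree (id, parent, title) VALUES (:id, :parent, :title)'
    );

    $pdo->beginTransaction();
    try {
        foreach ($rows as $row) {
            $sth->execute($row);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        // Undo every insert of this batch, then re-throw for the caller.
        $pdo->rollBack();
        throw $e;
    }
    return count($rows);
}

// Usage (assumes $pdo and $dbInput exist as in the code above):
// insertRows($pdo, $dbInput);
```

If any insert fails, the whole batch is rolled back, so the table never ends up with only part of the taxonomy.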
    

    This answer was accepted by the asker as the best answer.