doushan1850 2011-10-28 19:35

XML parsing problem

UPDATE: I've reworked the question to show the progress I've made and, hopefully, make it easier to answer.

UPDATE 2: I've added another value to the XML: Ext, the extensions available in each zip. Each item can list multiple extensions separated by tabs. The result should therefore be structured as Platform > Extension (sub group) > Name > Title. If an item has more than one extension, it will appear in multiple places.

I have the following XML file.

<Item>
    <Platform>Windows</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 1</Name>
    <Title>This is the first file group</Title>
    <DownloadPath>/this/windows/1/1.zip</DownloadPath>
</Item>
<Item>
    <Platform>Windows</Platform>
    <Ext>gif    doc</Ext>
    <Name>File Group 1</Name>
    <Title>This is the first file group</Title>
    <DownloadPath>/this/windows/1/2.zip</DownloadPath>
</Item>
<Item>
    <Platform>Windows</Platform>
    <Ext>gif</Ext>
    <Name>File Group 1</Name>
    <Title>This is in the same group but has a different title</Title>
    <DownloadPath>/this/windows/1/3.zip</DownloadPath>
</Item>
<Item>
    <Platform>Mac</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 1</Name>
    <Title>This has the same group name but a different platform. Because it has the same title and name the files are added to this array below.</Title>
    <DownloadPath>/this/mac/1/1.zip</DownloadPath>
</Item>
<Item>
    <Platform>Mac</Platform>
    <Ext>jpeg   doc</Ext>
    <Name>File Group 1</Name>
    <Title>This has the same group name but a different platform. Because it has the same title and name the files are added to this array below.</Title>
    <DownloadPath>/this/mac/1/2.zip</DownloadPath>
</Item>
<Item>
    <Platform>Windows</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 2</Name>
    <Title>This is the second file group</Title>
    <DownloadPath>/this/windows/2/1.zip</DownloadPath>
</Item>
<Item>
    <Platform>Windows</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 2</Name>
    <Title>This is the second file group</Title>
    <DownloadPath>/this/windows/2/2.zip</DownloadPath>
</Item>
<Item>
    <Platform>Mac</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 3</Name>
    <Title>This is the second mac file group really.</Title>
    <DownloadPath>/this/windows/3/1.zip</DownloadPath>
</Item>

I want to go through it and sort it so I can insert it into a normalized table schema. Here is the format in which I would like the array to be built.

[Windows] => Array (
    [0] => array(
        "Name" => "File Group 1",
        "Title" => "This is the first file group",
        "Files" => array(
            [0] => array(
                "DownloadPath" => "/this/windows/1/1.zip"
            ),
            [1] => array(
                "DownloadPath" => "/this/windows/1/2.zip"
            )
        )
    ),
    [1] => array(
        "Name" => "File Group 1",
        "Title" => "This has the same name but has a different title, so it should be separate.",
        "Files" => array(
            [0] => array(
                "DownloadPath" => "/this/windows/1/3.zip"
            )
        )
    ),
    [2] => array(
        "Name" => "File Group 2",
        "Title" => "This is the second file group",
        "Files" => array(
            [0] => array(
                "DownloadPath" => "/this/windows/2/1.zip"
            ),
            [1] => array(
                "DownloadPath" => "/this/windows/2/2.zip"
            )
        )
    )
),
[Mac] => Array(
    [0] => array(
        "Name" => "File Group 1",
        "Title" => "This has the same group name but a different platform. Because it has the same title and name the files are added to this array below.",
        "Files" => array(
            [0] => array(
                "DownloadPath" => "/this/mac/1/1.zip"
            ),
            [1] => array(
                "DownloadPath" => "/this/mac/1/2.zip"
            )
        )
    ),
    [1] => array(
        "Name" => "File Group 3",
        "Title" => "This is the second mac file group really.",
        "Files" => array(
            [0] => array(
                "DownloadPath" => "/this/windows/3/1.zip"
            )
        )
    ),
)

Here is what I've got so far in PHP:

$scrape_xml = "files.xml";
$xml = simplexml_load_file($scrape_xml);

$groups = array();

foreach ($xml->Item as $file) {
    // Group keys: Platform > Name > Title
    $platform = stripslashes($file->Platform);
    $name     = stripslashes($file->Name);
    $title    = stripslashes($file->Title);

    if (!isset($groups[$platform][$name][$title])) {
        // Note: these values are stored as SimpleXMLElement objects, not strings
        $groups[$platform][$name][$title] = array(
            'Platform' => $file->Platform,
            'Name'     => $file->Name,
            'Title'    => $file->Title
        );
    }

    $groups[$platform][$name][$title]['Files'][] = $file->DownloadPath;
}

echo "<pre>";
print_r($groups);
echo "</pre>";

It gives me this result:

Array
(
    [Windows] => Array
        (
            [File Group 1] => Array
                (
                    [This is the first file group] => Array
                        (
                            [Platform] => SimpleXMLElement Object
                                (
                                    [0] => Windows
                                )

                            [Name] => SimpleXMLElement Object
                                (
                                    [0] => File Group 1
                                )

                            [Title] => SimpleXMLElement Object
                                (
                                    [0] => This is the first file group
                                )

                            [Files] => Array
                                (
                                    [0] => SimpleXMLElement Object
                                        (
                                            [0] => /this/windows/1/1.zip
                                        )

                                    [1] => SimpleXMLElement Object
                                        (
                                            [0] => /this/windows/1/2.zip
                                        )

                                )

                        )

                    [This is in the same group but has a different title] => Array
                        (
                            [Platform] => SimpleXMLElement Object
                                (
                                    [0] => Windows
                                )

                            [Name] => SimpleXMLElement Object
                                (
                                    [0] => File Group 1
                                )

                            [Title] => SimpleXMLElement Object
                                (
                                    [0] => This is in the same group but has a different title
                                )

                            [Files] => Array
                                (
                                    [0] => SimpleXMLElement Object
                                        (
                                            [0] => /this/windows/1/3.zip
                                        )

                                )

                        )

                )

            [File Group 2] => Array
                (
                    [This is the second file group] => Array
                        (
                            [Platform] => SimpleXMLElement Object
                                (
                                    [0] => Windows
                                )

                            [Name] => SimpleXMLElement Object
                                (
                                    [0] => File Group 2
                                )

                            [Title] => SimpleXMLElement Object
                                (
                                    [0] => This is the second file group
                                )

                            [Files] => Array
                                (
                                    [0] => SimpleXMLElement Object
                                        (
                                            [0] => /this/windows/2/1.zip
                                        )

                                    [1] => SimpleXMLElement Object
                                        (
                                            [0] => /this/windows/2/2.zip
                                        )

                                )

                        )

                )

        )

    [Mac] => Array
        (
            [File Group 1] => Array
                (
                    [This has the same group name but a different platform. Because it has the same title and name the files are added to this array below.] => Array
                        (
                            [Platform] => SimpleXMLElement Object
                                (
                                    [0] => Mac
                                )

                            [Name] => SimpleXMLElement Object
                                (
                                    [0] => File Group 1
                                )

                            [Title] => SimpleXMLElement Object
                                (
                                    [0] => This has the same group name but a different platform. Because it has the same title and name the files are added to this array below.
                                )

                            [Files] => Array
                                (
                                    [0] => SimpleXMLElement Object
                                        (
                                            [0] => /this/mac/1/1.zip
                                        )

                                    [1] => SimpleXMLElement Object
                                        (
                                            [0] => /this/mac/1/2.zip
                                        )

                                )

                        )

                )

            [File Group 3] => Array
                (
                    [This is the second mac file group really.] => Array
                        (
                            [Platform] => SimpleXMLElement Object
                                (
                                    [0] => Mac
                                )

                            [Name] => SimpleXMLElement Object
                                (
                                    [0] => File Group 3
                                )

                            [Title] => SimpleXMLElement Object
                                (
                                    [0] => This is the second mac file group really.
                                )

                            [Files] => Array
                                (
                                    [0] => SimpleXMLElement Object
                                        (
                                            [0] => /this/windows/3/1.zip
                                        )

                                )

                        )

                )

        )

)

UPDATE 2: New Array Structure

[Windows] => Array (
    [gif] =>Array(
        [0] => array(
            "Name" => "File Group 1",
            "Title" => "This is the first file group",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/windows/1/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/windows/1/2.zip"
                )
            )
        )
    ),
    [jpeg] => array(
        [0] => array(
            "Name" => "File Group 1",
            "Title" => "This is the first file group",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/windows/1/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/windows/1/2.zip"
                )
            )
        ),
        [1] => array(
            "Name" => "File Group 2",
            "Title" => "This is the second file group",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/windows/2/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/windows/2/2.zip"
                )
            )
        )
    ),
    [doc] => array(
        [0] => array(
            "Name" => "File Group 1",
            "Title" => "This is the first file group",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/windows/1/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/windows/1/2.zip"
                )
            )
        ),
        [1] => array(
            "Name" => "File Group 1",
            "Title" => "This has the same name but has a different title, so it should be separate.",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/windows/1/3.zip"
                )
            )
        ),
        [2] => array(
            "Name" => "File Group 2",
            "Title" => "This is the second file group",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/windows/2/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/windows/2/2.zip"
                )
            )
        )
    )
),
[Mac] => Array(
    [gif] => array(
        [0] => array(
            "Name" => "File Group 2",
            "Title" => "This is the second file group",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/mac/2/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/mac/2/2.zip"
                )
            )
        ),
        [1] => array(
            "Name" => "File Group 2",
            "Title" => "This is the second file group",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/mac/2/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/mac/2/2.zip"
                )
            )
        ),

    ),
    [jpeg] => array(
        [0] => array(
            "Name" => "File Group 2",
            "Title" => "This is the second file group",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/mac/2/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/mac/2/2.zip"
                )
            )
        )
    ),
    [doc] => array(
        [0] => array(
            "Name" => "File Group 1",
            "Title" => "This has the same group name but a different platform. Because it has the same title and name the files are added to this array below.",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/mac/1/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/mac/1/2.zip"
                )
            )
        ),
        [1] => array(
            "Name" => "File Group 3",
            "Title" => "This is the second mac file group really.",
            "Files" => array(
                [0] => array(
                    "DownloadPath" => "/this/mac/1/1.zip"
                ),
                [1] => array(
                    "DownloadPath" => "/this/mac/1/2.zip"
                )
            )
        )
    )
)

UPDATE 3: There is some garbage coming through for the file list.

<Item>
    <Platform>Windows</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 1</Name>
    <Title>This is the first file group</Title>
    <DownloadPath>/this/windows/1/1.zip</DownloadPath>
</Item>
<Item>
    <Platform>Windows</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 1</Name>
    <Title>This is the first file group</Title>
    <DownloadPath>/this/windows/1/2.zip</DownloadPath>
</Item>
<Item>
    <Platform>Windows</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 1</Name>
    <Title>This is the first file group</Title>
    <DownloadPath>/this/windows/2/1.zip</DownloadPath>
</Item>
<Item>
    <Platform>Windows</Platform>
    <Ext>gif    jpeg    doc</Ext>
    <Name>File Group 1</Name>
    <Title>This is the first file group</Title>
    <DownloadPath>/this/windows/2/2.zip</DownloadPath>
</Item>

Some items share the same platform, extensions, name and title. Items 3 and 4 above need to be skipped and saved to a separate array that I will handle later.


  • dongyu1979 2011-10-31 17:52

    You are merely mapping the input values into the output array by arranging them differently. This is your structure:

    Array(
      [... Item/Platform] => Array (
        [... Item/Title as 0-n] => array(
            "Name" => Item/Name,
            "Title" => Item/Title,
            "Files" => array(
                [...] => array(
                    "DownloadPath" => Item/DownloadPath
                ),
            )
        ),
    

    The mapping can be done by iterating over the items within the XML and storing the values into the appropriate place in the new array (I named it $build):

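    // $items is the list of <Item> elements, e.g. $xml->Item in the question's code
    $build = array();
    foreach ($items as $item)
    {
        // Cast to string so the values can be used as array keys
        $platform = (string) $item->Platform;
        $title = (string) $item->Title;
        // Create the group the first time this platform/title pair is seen
        isset($build[$platform][$title]) ?: $build[$platform][$title] = array(
            'Name' => (string) $item->Name,
            'Title' => $title
        );
        $build[$platform][$title]['Files'][] = array('DownloadPath' => (string) $item->DownloadPath);
    }
    // Replace the title keys with numeric indexes
    $build = array_map('array_values', $build);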
    

    The array_map call is done at the end to convert the Item/Title keys into numerical ones.
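
The renumbering step can be seen in isolation with a minimal sketch. The sample array below is hypothetical and only mimics the shape of $build before the final call:

```php
<?php
// Hypothetical sample: one platform whose groups are still keyed by title.
$build = array(
    'Windows' => array(
        'This is the first file group'  => array('Name' => 'File Group 1'),
        'This is the second file group' => array('Name' => 'File Group 2'),
    ),
);

// array_values() drops the title keys and renumbers the groups from 0;
// array_map() applies it to each platform's sub-array and keeps the
// platform keys of the outer array.
$renumbered = array_map('array_values', $build);

print_r($renumbered);
```

The platform keys survive because array_map preserves the keys of its input when it is given a single array.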

    And that's it; here is the Demo.

    Let me know if that's helpful.

    Edit: For your updated data, only a slight modification of the above is needed. The key principles of the previous example still apply; the extra duplication for each additional extension of an item is handled by adding another iteration inside:

    $build = array();
    foreach ($items as $item)
    {
        $platform = (string) $item->Platform;
        $title = (string) $item->Title;
        // Split the whitespace-separated extension list; one pass per extension
        foreach (preg_split("~\s+~", (string) $item->Ext) as $ext)
        {
            // Create the group the first time this platform/extension/title is seen
            isset($build[$platform][$ext][$title])
                ?:$build[$platform][$ext][$title] = array(
                    'Name' => (string) $item->Name,
                    'Title' => $title
                );
            $build[$platform][$ext][$title]['Files'][]
                = array('DownloadPath' => (string) $item->DownloadPath);
        }
    }
    // Renumber the title keys inside every platform/extension pair
    $build = array_map(function($v) {return array_map('array_values', $v);}, $build);
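
For trying the extension grouping end to end, here is a self-contained sketch that runs without files.xml. Several parts are assumptions rather than part of the original answer: the inline sample data, the `<Items>` root element (the question shows only the `<Item>` elements), and the composite Name/Title key, which is one possible way to keep groups apart when they share a title but differ in name.

```php
<?php
// Hypothetical sample data. The <Items> root is an assumption: the question
// shows only the <Item> elements, but XML needs a single root to parse.
// \t inside the heredoc becomes a real tab, as in the tab-separated <Ext>.
$xmlString = <<<XML
<Items>
  <Item>
    <Platform>Windows</Platform>
    <Ext>gif\tjpeg</Ext>
    <Name>File Group 1</Name>
    <Title>This is the first file group</Title>
    <DownloadPath>/this/windows/1/1.zip</DownloadPath>
  </Item>
  <Item>
    <Platform>Windows</Platform>
    <Ext>gif</Ext>
    <Name>File Group 1</Name>
    <Title>This is the first file group</Title>
    <DownloadPath>/this/windows/1/2.zip</DownloadPath>
  </Item>
</Items>
XML;

$xml = simplexml_load_string($xmlString);

$build = array();
foreach ($xml->Item as $item) {
    // Cast to string: without the cast each value stays a SimpleXMLElement.
    $platform = (string) $item->Platform;
    $name     = (string) $item->Name;
    $title    = (string) $item->Title;
    // Assumption: key on Name and Title together, so two groups that share
    // a title but differ in name are kept apart.
    $key = $name . '|' . $title;

    // One copy of the group per extension; NO_EMPTY skips stray whitespace.
    $exts = preg_split('~\s+~', (string) $item->Ext, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($exts as $ext) {
        if (!isset($build[$platform][$ext][$key])) {
            $build[$platform][$ext][$key] = array('Name' => $name, 'Title' => $title);
        }
        $build[$platform][$ext][$key]['Files'][] = array(
            'DownloadPath' => (string) $item->DownloadPath,
        );
    }
}

// Replace the composite keys with 0..n indexes at the group level.
foreach ($build as $platform => $byExt) {
    $build[$platform] = array_map('array_values', $byExt);
}

print_r($build);
```

With this sample, both zips land under the gif extension and only the first under jpeg, which is the per-extension duplication the updated structure asks for.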
    