duankuiyuant3940 2009-05-29 10:08

Parsing Media RSS with XMLReader

<rss version="2.0"
    xmlns:media="http://search.yahoo.com/mrss/">
    <channel> 
        <title>Title of RSS feed</title> 
        <link>http://www.google.com</link> 
        <description>Details about the feed</description> 
        <pubDate>Mon, 24 Nov 08 21:44:21 -0500</pubDate> 
        <language>en</language> 
        <item> 
            <title>Article 1</title> 
            <description><![CDATA[How to use StackOverflow.com]]></description>
            <link>http://youtube.com/?v=y6_-cLWwEU0</link>
            <media:player url="http://youtube.com/?v=y6_-cLWwEU0"    /> 
            <media:thumbnail url="http://img.youtube.com/vi/y6_-cLWwEU0/default.jpg"
                width="120" height="90" /> 
            <media:title>Jared on StackOverflow</media:title> 
            <media:category label="Tags">tag1,tag2</media:category> 
            <media:credit>Jared</media:credit> 
            <enclosure url="http://youtube.com/v/y6_-cLWwEU0.swf"
                length="233"
                type="application/x-shockwave-flash"/>
        </item>
    </channel>
</rss>

I decided to use XMLReader to parse my large XML files. I am having trouble getting the data inside each item, especially the thumbnail.

Here's my code:

//////////////////////////////

$itemList = array();
$i=0;
$xmlReader = new XMLReader();
$xmlReader->open('XMLFILE');
while ($xmlReader->read()) {
    if ($xmlReader->nodeType == XMLReader::ELEMENT) {
        if ($xmlReader->localName == 'title') {
            $xmlReader->read();
            $itemList[$i]['title'] = $xmlReader->value;
        }
        if ($xmlReader->localName == 'description') {
            // move to its textnode / child
            $xmlReader->read();
            $itemList[$i]['description'] = $xmlReader->value;
        }
        if ($xmlReader->localName == 'media:thumbnail') {
            // move to its textnode / child
            $xmlReader->read();
            $itemList[$i]['media:thumbnail'] = $xmlReader->value;
            $i++;
        }
    }
}
////////////////

Is it advisable to use DOMXPath, given that I am parsing a huge XML file? I would really appreciate your advice.

1 answer

  • douzhaxian1267 2009-06-02 07:29

    xtian,

    If memory usage is a concern, I would recommend staying away from DOM/XPath, since it requires the whole file to be read into memory first. XMLReader only reads in a chunk at a time (probably 8K, as that seems to be the standard PHP chunk size).
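If XPath-style queries are still wanted, one common middle ground is to stream with XMLReader and expand only the current <item> into a small DOM fragment. The sketch below is a minimal illustration of that idea, not a drop-in solution: the inline two-item feed and the function name readItems() are assumptions made for the example, and a real feed would be opened with open() instead of XML().

```php
<?php
// Sketch: stream <item> elements with XMLReader and run XPath on one item at a time.
// The sample feed and the readItems() helper are hypothetical, for illustration only.

const MEDIA_NS = 'http://search.yahoo.com/mrss/';

function readItems(XMLReader $reader): array
{
    $items = [];

    // Advance to the first <item> start tag, if there is one.
    $found = false;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
            $found = true;
            break;
        }
    }

    while ($found) {
        // expand() builds a DOM subtree for the current element only,
        // so memory use is bounded by the size of a single item.
        $doc = new DOMDocument();
        $doc->appendChild($doc->importNode($reader->expand(), true));

        $xpath = new DOMXPath($doc);
        $xpath->registerNamespace('media', MEDIA_NS);

        $items[] = [
            'title'       => $xpath->evaluate('string(/item/title)'),
            'media:title' => $xpath->evaluate('string(/item/media:title)'),
            'thumbnail'   => $xpath->evaluate('string(/item/media:thumbnail/@url)'),
        ];

        // next('item') skips the rest of this subtree and stops at the next <item>.
        $found = $reader->next('item');
    }

    return $items;
}

$xml = <<<'XML'
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Title of RSS feed</title>
    <item>
      <title>Article 1</title>
      <media:title>Jared on StackOverflow</media:title>
      <media:thumbnail url="http://img.youtube.com/vi/y6_-cLWwEU0/default.jpg" width="120" height="90"/>
    </item>
    <item>
      <title>Article 2</title>
      <media:title>Second clip</media:title>
      <media:thumbnail url="http://example.com/thumb2.jpg" width="120" height="90"/>
    </item>
  </channel>
</rss>
XML;

$reader = new XMLReader();
$reader->XML($xml);          // for a file on disk, use $reader->open($path) instead
$items = readItems($reader);
$reader->close();

print_r($items);
```

Because the XPath expressions match on the namespace URI, title and media:title stay separate here, and the channel-level <title> is never touched since only <item> subtrees are expanded.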

    I have rewritten what you originally posted so that it captures the following elements inside each <item> element:

    1. title
    2. description
    3. media:thumbnail
    4. media:title

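    The thing to remember is that XMLReader::localName returns the element name without its namespace prefix (e.g. the localName of media:thumbnail is thumbnail), which is why comparing localName against 'media:thumbnail' never matches. Be careful with this, because if you key on localName the media:title value can overwrite the title value. XMLReader::name returns the full qualified name, prefix included.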
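To make the difference concrete, here is a small sketch that records name, localName, prefix and namespaceURI for each element of a cut-down <item>. The inline XML string and the $seen array are assumptions made for the example only.

```php
<?php
// Sketch: compare the naming properties XMLReader exposes for plain and
// namespaced elements. The sample XML below is a hypothetical, cut-down item.

$xml = '<item xmlns:media="http://search.yahoo.com/mrss/">'
     . '<title>Article 1</title>'
     . '<media:title>Jared on StackOverflow</media:title>'
     . '</item>';

$reader = new XMLReader();
$reader->XML($xml);

$seen = [];
while ($reader->read()) {
    // Only start tags are interesting here.
    if ($reader->nodeType !== XMLReader::ELEMENT) {
        continue;
    }
    $seen[] = [
        'name'         => $reader->name,          // qualified name, e.g. media:title
        'localName'    => $reader->localName,     // name without the prefix
        'prefix'       => $reader->prefix,        // e.g. media
        'namespaceURI' => $reader->namespaceURI,  // URI bound to the prefix
    ];
}
$reader->close();

foreach ($seen as $row) {
    printf(
        "name=%s localName=%s prefix=%s namespaceURI=%s\n",
        $row['name'],
        $row['localName'],
        $row['prefix'],
        $row['namespaceURI']
    );
}
```

Both <title> and <media:title> report the same localName, so a lookup keyed on localName alone cannot tell them apart; matching on name, or on localName together with namespaceURI, avoids the collision.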

    Here is what I re-wrote:

    <?php
    define ('XMLFILE', dirname(__FILE__) . '/Rss.xml');
    echo "<pre>";
    
    $items = array ();
    $i = 0;
    
    $xmlReader = new XMLReader();
    $xmlReader->open (XMLFILE, null, LIBXML_NOBLANKS);
    
    $isParserActive = false;
    $simpleNodeTypes = array ("title", "description", "media:title");
    
    while ($xmlReader->read ())
    {
        $nodeType = $xmlReader->nodeType;
    
        // Only deal with Beginning/Ending Tags
        if ($nodeType != XMLReader::ELEMENT && $nodeType != XMLReader::END_ELEMENT)
        {
            continue;
        }
        else if ($xmlReader->name == "item")
        {
            if (($nodeType == XMLReader::END_ELEMENT) && $isParserActive)
            {
                $i++;
            }
            $isParserActive = ($nodeType != XMLReader::END_ELEMENT);
        }
    
        if (!$isParserActive || $nodeType == XMLReader::END_ELEMENT)
        {
            continue;
        }
    
        $name = $xmlReader->name;
    
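        if (in_array ($name, $simpleNodeTypes))
        {
            // readString() returns the element's text content without moving
            // the cursor, so an empty element such as <title/> cannot cause
            // the node that follows it to be skipped.
            $items[$i][$name] = $xmlReader->readString ();
        }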
        else if ($name == "media:thumbnail")
        {
            $items[$i]['media:thumbnail'] = array (
                "url" => $xmlReader->getAttribute("url"),
                "width" => $xmlReader->getAttribute("width"),
                "height" => $xmlReader->getAttribute("height")
            );
        }
    }
    
    var_dump ($items);
    
    echo "</pre>";
    
    ?>
    

    If you have any questions on how this works, I would be more than happy to answer them for you.

    This answer was selected by the asker as the best answer.
