donglao4370
2017-10-29 17:37
29 views

How do I read separators in an XML file with PHP DOM?

I have some XML files that I need to read and convert to HTML.

The format of the XML is this:

<book pages="2">

    <page n="1" />

    <entry>
        ...
    </entry>
    <entry>
        ...
    </entry>
    <entry>
        ...
    </entry>

    <page n="2" />

    <entry>
        ...
    </entry>
    <entry>
        ...
    </entry>
    <entry>
        ...
    </entry>

    <endpages />

</book>

How can I extract an array containing only the entries of a single page?

Thanks in advance!

2 answers

  • douniang3866 2017-10-29 21:34
    Accepted

    I suggested XPath for this in my original comment. However, I have been experimenting with XPath expressions that combine following-sibling and preceding-sibling, and I can't get them to work properly with this XML structure.

    A somewhat hacky way to do this is to fetch everything after a given page number and stop when you reach the next <page /> or <endpages /> element:

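    $dom = new DOMDocument("1.0", "UTF-8");
    $dom->load($xmlFile); // $xmlFile: path to the XML file

    $xp = new DOMXPath($dom);

    $pageNo = 2;

    // Every element sibling that follows the <page n="..." /> marker, in document order
    $list = $xp->query("/book/page[@n='" . $pageNo . "']/following-sibling::*");

    foreach ($list as $node) {
        // The next <page /> or the closing <endpages /> marks the end of this page
        if ($node->nodeName === 'page' || $node->nodeName === 'endpages') {
            break;
        }

        echo $node->textContent . "<br />"; // <entry /> node
    }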
    

    I'm fairly sure this will not perform well if the XML file has many pages and you only want the elements of page one, because the query selects every sibling after the page marker. In terms of lines of code, though, it is manageable, and maybe someone else has ideas on how to optimise the XPath expression.
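
    One possible single-expression variant, offered as a sketch rather than a tested optimisation: select each <entry /> whose nearest preceding <page /> sibling carries the wanted number. The helper name entriesOfPage and the sample XML are made up for illustration; only DOMDocument and DOMXPath come from the answer above. Whether it is actually faster would need measuring on real files.

    ```php
    <?php
    // Hypothetical helper (not part of the original answer): returns the text
    // of every <entry> whose closest preceding <page /> sibling has n = $pageNo.
    function entriesOfPage(DOMDocument $dom, int $pageNo): array
    {
        $xp = new DOMXPath($dom);

        // preceding-sibling is a reverse axis, so page[1] is the page marker
        // nearest to the entry, not the first one in the document.
        $query = sprintf("/book/entry[preceding-sibling::page[1]/@n = '%d']", $pageNo);

        $entries = [];
        foreach ($xp->query($query) as $node) {
            $entries[] = trim($node->textContent);
        }
        return $entries;
    }

    // Sample data invented for this sketch, following the layout in the question.
    $xml = '<book pages="2">'
         . '<page n="1" /><entry>a</entry><entry>b</entry>'
         . '<page n="2" /><entry>c</entry><entry>d</entry><entry>e</entry>'
         . '<endpages />'
         . '</book>';

    $dom = new DOMDocument();
    $dom->loadXML($xml);

    print_r(entriesOfPage($dom, 2)); // entries c, d, e
    ```

    Because the predicate only ever matches <entry /> elements, no explicit check for <page /> or <endpages /> is needed, and the result is already the array the question asks for.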

  • dongyuan9892 2017-10-29 18:44

    Easy in XSLT 2.0/3.0. First reorganize the XML into a more sensible structure:

    <xsl:template match="book">
    <book>
      <xsl:for-each-group select="* except endpages" group-starting-with="page">
        <page n="{@n}">
          <xsl:copy-of select="current-group() except self::page"/>
        </page>
      </xsl:for-each-group>
    </book>
    </xsl:template>
    

    Then to process a selected page:

    <xsl:param name="page-num"/>
    <xsl:template match="page[@n = $page-num]">
      <xsl:apply-templates/>
    </xsl:template>
    

    You can run XSLT 2.0/3.0 from PHP using the Saxon/C processor. No need to dive into low-level DOM manipulation.
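
    Where an XSLT 2.0 processor is not available, the same regrouping idea can be approximated in plain PHP. This is a sketch that swaps xsl:for-each-group for a manual loop over the children of <book>; the function name groupEntriesByPage and the sample XML are assumptions made for illustration.

    ```php
    <?php
    // Hypothetical helper: walks the children of <book> once and collects the
    // text of each <entry> under the most recent <page n="..." /> marker.
    function groupEntriesByPage(DOMDocument $dom): array
    {
        $pages = [];
        $current = null;

        foreach ($dom->documentElement->childNodes as $node) {
            if ($node->nodeType !== XML_ELEMENT_NODE) {
                continue; // skip whitespace text nodes and comments
            }
            if ($node->nodeName === 'page') {
                $current = $node->getAttribute('n'); // start a new group
                $pages[$current] = [];
            } elseif ($node->nodeName === 'endpages') {
                break; // nothing after this marker belongs to a page
            } elseif ($node->nodeName === 'entry' && $current !== null) {
                $pages[$current][] = trim($node->textContent);
            }
        }
        return $pages;
    }

    // Sample data invented for this sketch, following the layout in the question.
    $xml = "<book pages=\"2\">\n"
         . "  <page n=\"1\" />\n  <entry>a</entry>\n  <entry>b</entry>\n"
         . "  <page n=\"2\" />\n  <entry>c</entry>\n  <entry>d</entry>\n  <entry>e</entry>\n"
         . "  <endpages />\n"
         . "</book>";

    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $pages = groupEntriesByPage($dom);
    print_r($pages[2]); // entries of page 2
    ```

    The result is keyed by page number, so every page is available after a single pass over the document.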
