dqluw20882 2014-12-15 16:03

xpath->query() only works with the asterisk wildcard

Below is the code I'm currently working with.

The input XML file is available here: http://pastebin.com/hcQhPSjs

  header("Content-Type: text/plain");
  $xmlFile = new DOMDocument();
  $xmlFile->preserveWhiteSpace = false;
  $xmlFile->load("file:///srv/http/nginx/html/xml/UNSD_Quest_Sample.xml");
  $xpath = new DOMXPath($xmlFile);
  $hier = '//Workbook';
  $result = $xpath->query($hier);
  foreach ($result as $element) {
    print $element->nodeValue;
    print "\n";
  }

With the $hier expression above, the query returns no nodes unless I use the wildcard * to reach the nodes I need. So instead of the usual /Workbook/Worksheet/Table/Row/Cell/Data path, I'm reduced to /*/*[6]/*[2]/*. The input file is an Excel spreadsheet exported to XML, so the issue might lie in the export from XLS to XML.

What I find peculiar is that Firefox (default browser) does not show the namespace attributes on the root element <Workbook>, while Chromium and any text editor do.
Firefox:

<?mso-application progid="Excel.Sheet"?>
<Workbook>
<DocumentProperties>
<Author>Htike Htike Kyaw Soe</Author>
<Created>2014-01-14T20:37:41Z</Created>
<LastSaved>2014-12-04T10:05:11Z</LastSaved>
<Version>14.00</Version>
</DocumentProperties>
<OfficeDocumentSettings>
<AllowPNG/>
</OfficeDocumentSettings>

Chromium:

<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>Htike Htike Kyaw Soe</Author>
<Created>2014-01-14T20:37:41Z</Created>
<LastSaved>2014-12-04T10:05:11Z</LastSaved>
<Version>14.00</Version>
</DocumentProperties>
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
<AllowPNG/>
</OfficeDocumentSettings>  

Could anyone explain why this is the case?

1 answer

  • Regular user 2014-12-15 16:18

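    You need to register a namespace prefix for the namespace used in the XML and use that prefix in the expression. In XPath 1.0 an unprefixed name such as Workbook only matches elements in no namespace, while the wildcard * matches elements in any namespace, which is why only the asterisk version returned nodes. Judging by the tag and element names, I expect the namespace to be urn:schemas-microsoft-com:office:spreadsheet (Excel Spreadsheet). Here is an example for that: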

    $xml = <<<'XML'
    <?xml version="1.0"?>
    <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
      <Worksheet>
        <Table>
          <Row>
            <Cell>
              <Data>TEST</Data>
            </Cell>
          </Row>
        </Table>
      </Worksheet>
    </Workbook>
    XML;
    
    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = false;
    $dom->loadXML($xml);
    $xpath = new DOMXpath($dom);
    $xpath->registerNamespace('s', 'urn:schemas-microsoft-com:office:spreadsheet');
    
    $expression = '/s:Workbook/s:Worksheet/s:Table/s:Row/s:Cell/s:Data';
    $result = $xpath->evaluate($expression);
    foreach ($result as $element) {
      print $element->nodeValue;
      print "\n";
    }
    

    Output:

    TEST
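
    The same idea extends to the mixed namespaces visible in the Chromium view from the question, where DocumentProperties sits in urn:schemas-microsoft-com:office:office. The following is a minimal sketch, not the asker's real file: the XML is a trimmed-down stand-in built from the snippet in the question, and the prefixes s and o are arbitrary names chosen here. It also shows why the unprefixed path found nothing while the wildcard did.

```php
<?php
// Trimmed-down stand-in for the export (assumption: the real file follows
// the structure shown in the Chromium view of the question).
$xml = <<<'XML'
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Author>Htike Htike Kyaw Soe</Author>
  </DocumentProperties>
</Workbook>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// The prefixes are local to this DOMXPath object; only the URIs must match the document.
$xpath->registerNamespace('s', 'urn:schemas-microsoft-com:office:spreadsheet');
$xpath->registerNamespace('o', 'urn:schemas-microsoft-com:office:office');

// An unprefixed name only matches elements in no namespace, so this finds nothing.
$unprefixed = $xpath->evaluate('/Workbook');

// The wildcard matches an element in any namespace, so this finds the root.
$wildcard = $xpath->evaluate('/*');

// Each step uses the prefix of the namespace that element actually belongs to.
$author = $xpath->evaluate('string(/s:Workbook/o:DocumentProperties/o:Author)');

print $unprefixed->length . "\n"; // 0
print $wildcard->length . "\n";   // 1
print $author . "\n";             // Htike Htike Kyaw Soe
```

    A child element can redeclare the default namespace, as DocumentProperties does here, so each step of the path needs the prefix bound to that element's own namespace.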
    

    You should use DOMXPath::evaluate() rather than DOMXPath::query(). It returns node lists just as query() does, and it can also fetch scalar values (strings, numbers, booleans) using XPath.

    $expression = 'string(/s:Workbook/s:Worksheet/s:Table/s:Row/s:Cell/s:Data)';
    echo $xpath->evaluate($expression);
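
    As a rough illustration of the other scalar types, the sketch below uses count() and boolean(), and passes a context node as the second argument to evaluate() so that the expression can be relative to each row. The two-row document is made up for this example.

```php
<?php
// Made-up two-row sample in the spreadsheet namespace.
$xml = <<<'XML'
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet>
    <Table>
      <Row><Cell><Data>A</Data></Cell></Row>
      <Row><Cell><Data>B</Data></Cell></Row>
    </Table>
  </Worksheet>
</Workbook>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('s', 'urn:schemas-microsoft-com:office:spreadsheet');

// count() yields an XPath number, which PHP receives as a float.
$rowCount = $xpath->evaluate('count(//s:Row)');

// boolean() yields true or false, handy for existence checks.
$hasData = $xpath->evaluate('boolean(//s:Data)');

// With a context node, the expression is evaluated relative to that node.
$values = [];
foreach ($xpath->evaluate('//s:Row') as $row) {
    $values[] = $xpath->evaluate('string(s:Cell/s:Data)', $row);
}

var_dump($rowCount, $hasData);
print implode(',', $values) . "\n"; // A,B
```

    Wrapping the expression in string() means an empty cell produces an empty string, so the loop does not need to check whether a node exists before reading it.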
    