drxvjnx58751 2014-01-22 02:52
Views: 88

Alternative to a regular expression for getting the content of XML tags

I'm processing an XML file and I need to get all the content inside <section> tags.

Right now I'm using this regex:

<?php preg_match_all('/<section[^>]*>(.*?)<\/section>/i', $myXmlString, $results);?>

The markup inside the <section> tags is fairly complex: it includes math equations and similar content. On my local machine the regex works perfectly. It runs PHP 5.3.10 on Apache 2.2.22 (Ubuntu).

On my staging server, however, it doesn't work. That server runs PHP 5.3.3 on Apache 2.2.15 (Red Hat).

I have two questions:

Is there a known issue with preg_match_all in PHP 5.3.3?

Is there a better way to express the regex?

--EDIT: VARIATIONS OF THE REGEX TRIED UNSUCCESSFULLY--

<?php preg_match_all('/<section[^>]*>(.*?)<\/section>/is', $myXmlString, $results);?>
<?php preg_match_all('/<section[^>]*>(.*?)<\/section>/ims', $myXmlString, $results);?>
<?php preg_match_all('#<section[^>]*>(.*?)<\/section>#ims', $myXmlString, $results);?>
<?php preg_match_all('#<section[^>]*>([^\00]*?)<\/section>#ims', $myXmlString, $results);?>
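When preg_match_all fails on one machine but not another, preg_last_error() can show whether PCRE gave up instead of simply finding no match. The PHP manual lists a lower default for pcre.backtrack_limit before PHP 5.3.7, which would be consistent with 5.3.3 failing on large sections while 5.3.10 succeeds, though that is a hypothesis to verify on the staging server. The sketch below is a minimal diagnostic; pregErrorName, matchSections and the sample string are hypothetical names introduced for illustration only.

```php
<?php
// Hypothetical helper: translate a preg_last_error() code into its constant name.
function pregErrorName($code)
{
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
    );
    return isset($names[$code]) ? $names[$code] : 'unknown (' . $code . ')';
}

// Hypothetical helper: run the regex from the question and report
// the match count, the captured section bodies and the PCRE error state.
function matchSections($xml)
{
    $count = preg_match_all('#<section[^>]*>(.*?)</section>#is', $xml, $results);
    return array(
        'count'    => $count,
        'sections' => isset($results[1]) ? $results[1] : array(),
        'error'    => pregErrorName(preg_last_error()),
    );
}

// Made-up sample input; the real $myXmlString would be far larger.
$sample = '<exam><section id="a"><q>1</q></section>' . "\n"
        . '<section id="b"><q>2</q></section></exam>';

$report = matchSections($sample);
echo $report['error'], ' / ', var_export($report['count'], true), " match(es)\n";
```

If the staging server reports PREG_BACKTRACK_LIMIT_ERROR, raising the limit with ini_set('pcre.backtrack_limit', ...) before the call is one possible workaround; a parser avoids the limit altogether.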

--EDIT: Why haven't I used a parser?

The XML consists of two <section> elements. Each section groups n questions for an exam.

Each question can include math equations represented by its own XML. An equation may be something like this:

<inlineequation><m:math baseline="-16.5" display="inline" overflow="scroll"><m:mrow><m:mtable columnalign="left"><m:mtr><m:mtd><m:mrow><m:mo stretchy="true">[</m:mo><m:mrow><m:mtable columnalign="right"><m:mtr><m:mtd><m:mn>4</m:mn></m:mtd><m:mtd columnalign="right"><m:mrow><m:mo>-</m:mo><m:mn>9</m:mn></m:mrow></m:mtd><m:mtd columnalign="right"><m:mrow><m:mn>54</m:mn></m:mrow></m:mtd></m:mtr><m:mtr><m:mtd columnalign="right"><m:mrow><m:mo>&minus;</m:mo><m:mn>28</m:mn></m:mrow></m:mtd><m:mtd columnalign="right"><m:mo>&minus;</m:mo><m:mn>1</m:mn></m:mtd><m:mtd columnalign="right"><m:mo>&minus;</m:mo><m:mn>14</m:mn></m:mtd></m:mtr></m:mtable></m:mrow><m:mo stretchy="true">]</m:mo></m:mrow></m:mtd></m:mtr></m:mtable></m:mrow></m:math></inlineequation>

I need that markup to remain an XML string (not an array) because I will pass it as is to a jQuery plugin that renders the equation (the result looks like a LaTeX equation).

If I parse the XML it will be really difficult to create the string for the equation again and locate it in the right place inside the question's statement.

2 answers

  • doushi5752 2014-01-22 03:00

    Regex can be resource-intensive.

    Consider using xml_parse_into_struct instead:

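    <?php
        // Parse the XML string into two arrays: $vals holds one entry per node,
        // $index maps each tag name to its positions in $vals.
        $xmlp = xml_parser_create();
        xml_parse_into_struct($xmlp, $myXmlString, $vals, $index);
        xml_parser_free($xmlp);
        print_r($vals);
    ?>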
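Since the question needs each section's content to stay an XML string, a DOM-based variant may fit better than an array structure. The following is a minimal sketch, not a drop-in solution: sectionInnerXml and the sample document are hypothetical, and it assumes the input is well-formed XML, meaning the m: prefix is declared and entities such as &minus; are either declared or replaced beforehand. Otherwise loadXML fails and the function returns false.

```php
<?php
// Hypothetical helper: return the inner XML of every <section> element
// as a string, or false if the document cannot be parsed.
function sectionInnerXml($xml)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadXML($xml);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return false;
    }

    $sections = array();
    foreach ($doc->getElementsByTagName('section') as $section) {
        $inner = '';
        // Serialize each child node back to markup, so equations stay XML.
        foreach ($section->childNodes as $child) {
            $inner .= $doc->saveXML($child);
        }
        $sections[] = $inner;
    }
    return $sections;
}

// Made-up sample; the namespace declaration on <exam> is an assumption.
$sample = '<exam xmlns:m="http://www.w3.org/1998/Math/MathML">'
        . '<section><q>Solve <m:math><m:mn>4</m:mn></m:math></q></section>'
        . '<section><q>Two</q></section>'
        . '</exam>';

print_r(sectionInnerXml($sample));
```

Each array entry is a plain string of markup, so it can be handed to the rendering plugin unchanged.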
    