dongsetan3216 2014-01-29 19:51

Searching an XML file with XMLReader

I have a large XML file (around 20 MB, and expected to grow) and a search form where I can enter one or more keywords. I currently use DOMXPath::query to find what I need. The search goes through all nodes and, if a match is found, returns the enclosing publication element (see the XML below). It works, but with 10,000 publications a search takes 4 seconds, which is very slow (I expect to have millions of publications).

XML file:

  <publication>
    <identificators>
      <identificator type="isbn">978-1-101-61439-8</identificator>
    </identificators>
    <title>Secured foreground capacity</title>
    <abstract>Illo dignissimos nulla libero ut ut. Inventore voluptas mollitia et officia. In quidem inventore voluptatem quas maxime. Et similique aliquam et sunt nulla.
Quae molestiae dolor architecto dicta non. Quia illo quia tempore architecto pariatur quo commodi cumque. Cumque nemo qui sunt.
Corporis quia reprehenderit modi neque architecto perferendis eligendi. Eveniet nobis illum totam possimus modi assumenda. Quia sed hic sit sequi. Doloremque temporibus eaque velit sed enim.</abstract>
    <dates>
      <date type="release">18.09.1995</date>
      <date type="added">17.07.1991</date>
    </dates>
    <language>Bajan</language>
    <release-number>2</release-number>
    <publisher>Kub PLC</publisher>
    <file>path/to/file.pdf</file>
    <type>Note</type>
    <categories>
      <category>Health Professions</category>
      <category>Computer Science</category>
      <category>Agricultural and Biological Sciences</category>
      <category>Chemical Engineering</category>
      <category>Materials Science</category>
    </categories>
    <keywords>
      <keyword>quia</keyword>
      <keyword>placeat</keyword>
    </keywords>
    <authors>
      <main-author>Nannie Klocko</main-author>
      <co-authors>
        <co-author>Name Surname</co-author>
      </co-authors>
    </authors>
    <affiliation>
      <name>Rippin, Stehr and Ryan</name>
      <type>Organisation</type>
      <address>
        <street>Rath Corner</street>
        <city>San Nicolás de los Garza</city>
        <country/>
      </address>
    </affiliation>
  </publication>

I have read that XMLReader is fast, but I have only found examples that read the whole file. Can I use it to speed up my search? If so, could you provide a simple example?

Here is my XPath query now:

$xpath_query = "//publications/publication[contains(translate(., 'ABCDEFGHJIKLMNOPQRSTUVWXYZ', 'abcdefghjiklmnopqrstuvwxyz'), '$search_keyword')]";

Can I use something like this with XMLReader? Thank you very much for every hint.
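
For reference, here is a minimal sketch of what a streaming search with XMLReader might look like. XMLReader has no XPath support, so the `contains(translate(...))` test is replaced here by a case-insensitive substring check on each publication's text content. The function name `searchPublications` and the file name in the usage line are hypothetical, and the sketch assumes the mbstring extension is available and that every `<publication>` is a sibling under one root element.

```php
<?php
// Hypothetical helper (not from the original post): streams the file and
// returns the outer XML of every <publication> whose text contains $keyword.
function searchPublications(string $xmlPath, string $keyword): array
{
    if ($keyword === '') {
        return [];
    }

    $reader = new XMLReader();
    if (!$reader->open($xmlPath)) {
        throw new RuntimeException("Cannot open $xmlPath");
    }

    // Advance to the first <publication> element.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'publication') {
            break;
        }
    }

    $matches = [];
    while ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'publication') {
        // readString() returns the concatenated text of the current element,
        // which plays the role of "." in the XPath query.
        $text = $reader->readString();

        // Case-insensitive match; assumes UTF-8 input and the mbstring extension.
        if (mb_stripos($text, $keyword, 0, 'UTF-8') !== false) {
            $matches[] = $reader->readOuterXml();
        }

        // Skip this element's subtree and move to the next <publication>.
        if (!$reader->next('publication')) {
            break;
        }
    }

    $reader->close();
    return $matches;
}
```

Usage would be along the lines of `$hits = searchPublications('publications.xml', $search_keyword);`, after which each entry can be loaded with `simplexml_load_string()` if needed. Note that this only keeps one publication in memory at a time; it still scans the whole file on every search, so it may not by itself give the speed-up hoped for with millions of publications.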
