dongyuan1160 2010-12-24 23:22
80 views
Accepted

Creating a DOMDocument: matching a specific element in a PHP parser

Good evening, dear community,

Well, first of all: Feliz Navidad, I want to wish you a Merry Christmas! During my seasonal break I am working on a little parser script.

Today I'm trying to debug a little DOMDocument object in PHP. Ideally, I would like to get DOMDocument to output the data in an array-like format, so that I can store it in a database.

My example: head over to the URL and look at the example page (the target).

I want to filter out the data in the block:

Schulart: BBS
Schulnummer: 60119
Anschrift: Berufsbildende Schule Boppard Antoniusstr. 21; 56154 Boppard
Telefon: (0 67 42) 80 61-0
Telefax: (0 67 42) 80 61-29
E-Mail: sekretary@bbs-boppard.de
Internet: website 
Träger: Kreisverwaltung Rhein-Hunsrück-Kreis
letzte Änderung: 08 Feb 2010 14:33:12 von 60119

I have looked through the page source and found that the element of interest should be this one: <div class="content"><!-- TYPO3SEARCH_begin -->, or even better, the one with the id wfqbeResults.

So if I go the DOMDocument way, I can use it like so:

$dom->getElementById('wfqbeResults');
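
A minimal sketch of what that lookup does, run against a small inline HTML string instead of the live page. The markup in $html is a hypothetical stand-in (the real page's structure is not shown here), and noSuchId is a made-up id used only to show the failure case.

```php
<?php
// Hypothetical stand-in markup; the real page's structure may differ.
$html = '<html><body>'
      . '<div class="content"><div id="wfqbeResults">'
      . '<table><tr><td>Schulart:</td><td>BBS</td></tr></table>'
      . '</div></div></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$found   = $dom->getElementById('wfqbeResults');
$missing = $dom->getElementById('noSuchId');   // made-up id, not present in $html

// getElementById() returns a DOMElement on success and null when no element has that id.
$foundText = ($found !== null) ? trim($found->textContent) : '';
echo $foundText, "\n";
var_dump($missing === null);
```

Checking for null before touching childNodes avoids a fatal error when the id is not present on the page.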

Here is the code, my attempt so far:

<?php

$dom = new DOMDocument();
@$dom->loadHTMLFile(' -> here the website goes in<- ');
$divElement = $dom->getElementById('wfqbeResults');

$innerHTML= '';
$children = $divElement->childNodes;
foreach ($children as $child) {
   $innerHTML .= $child->ownerDocument->saveXML( $child );
} 
echo $innerHTML;

?>

Duh: this outputs a lot of garbage. The code spits out a lot of HTML anyway. I have to overhaul the code a bit to get the nine lines I want out of the parser.

What I am aiming for: I want to get the following:

a. Nine lines with nine labels and nine values. b. I want to prepare the output so that it can be stored in a MySQL DB!
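
For goal b, a rough sketch of how label/value pairs could be turned into a parameterized INSERT. Everything named here is an assumption for illustration: the table name schools, the column names, and the three sample pairs in $record. No database connection is opened; the sketch only builds the SQL string and the parameter array.

```php
<?php
// Hypothetical label/value pairs, as they might look after parsing the block above.
$record = array(
    'Schulart'    => 'BBS',
    'Schulnummer' => '60119',
    'Telefon'     => '(0 67 42) 80 61-0',
);

// Hypothetical mapping from page labels to columns of an assumed table "schools".
$columns = array(
    'Schulart'    => 'school_type',
    'Schulnummer' => 'school_number',
    'Telefon'     => 'phone',
);

$names  = array();
$params = array();
foreach ($record as $label => $value) {
    if (!isset($columns[$label])) {
        continue;                      // skip labels we have no column for
    }
    $names[] = $columns[$label];
    $params[':' . $columns[$label]] = $value;
}

// Column names come from the fixed mapping above; values stay out of the SQL string.
$sql = 'INSERT INTO schools (' . implode(', ', $names) . ') '
     . 'VALUES (' . implode(', ', array_keys($params)) . ')';

echo $sql, "\n";
print_r($params);

// With a real PDO connection in $pdo this could be run as a prepared statement:
// $pdo->prepare($sql)->execute($params);
```

Keeping the values in named placeholders rather than concatenating them into the query means the scraped text never has to be escaped by hand.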

I look forward to some hints. Greetings, zero


1 answer

  • duan19750503 2010-12-25 05:34

    Here is a solution that returns the labels and values in a formatted array, ready for input to MySQL!

    <?php
    
    $dom = new DOMDocument();
    // @ suppresses the warnings libxml raises for malformed HTML on the remote page.
    @$dom->loadHTMLFile('http://schulen.bildung-rp.de/gehezu/startseite/einzelanzeige.html?tx_wfqbe_pi1%5buid%5d=60119');
    $divElement = $dom->getElementById('wfqbeResults');
    
    /*** the array to return ***/
    $out = array();
    
    if ($divElement !== null) {
        // Search only inside the results block, not the whole document.
        foreach ($divElement->getElementsByTagName('td') as $item)
        {
            /*** add the trimmed node value to the out array ***/
            $out[] = trim($item->nodeValue);
        }
    }
    
    echo '<pre>';
    print_r($out);
    echo '</pre>';
    
    ?>
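
A flat list of td values still leaves labels and values interleaved. As a sketch of one way to pair them up, the snippet below walks the table row by row and uses the first cell as the key and the second as the value. It assumes each row holds exactly one label cell and one value cell; the inline markup in $html is a hypothetical stand-in, since the real page may be laid out differently.

```php
<?php
// Hypothetical stand-in for the results block; the real page may be laid out differently.
$html = '<div id="wfqbeResults"><table>'
      . '<tr><td>Schulart:</td><td>BBS</td></tr>'
      . '<tr><td>Schulnummer:</td><td>60119</td></tr>'
      . '<tr><td>Telefon:</td><td>(0 67 42) 80 61-0</td></tr>'
      . '</table></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$pairs = array();
$block = $dom->getElementById('wfqbeResults');
if ($block !== null) {
    foreach ($block->getElementsByTagName('tr') as $row) {
        $cells = $row->getElementsByTagName('td');
        if ($cells->length < 2) {
            continue;                  // not a label/value row
        }
        // First cell is the label (trailing colon removed), second is the value.
        $label = rtrim(trim($cells->item(0)->textContent), ':');
        $pairs[$label] = trim($cells->item(1)->textContent);
    }
}

print_r($pairs);
```

The resulting associative array maps each label to its value, which is usually easier to map onto database columns than a positional list.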
    
    This answer was accepted by the asker as the best answer.
