dongyuan1160 2010-12-24 23:22
Views: 80
Accepted

Creating a DOMDocument: matching a certain element in the PHP parser

Good evening, dear community,

First of all: Feliz Navidad, I want to wish you a Merry Christmas! During my seasonal break I am working on a little parser script.

Today I'm trying to debug a little DOMDocument object in PHP. Ideally, I'd like to get DOMDocument to output in an array-like format so that I can store the data in a database.

My example: head over to the URL to see the target page.

I want to extract the data in this block:

Schulart: BBS
Schulnummer: 60119
Anschrift: Berufsbildende Schule Boppard Antoniusstr. 21; 56154 Boppard
Telefon: (0 67 42) 80 61-0
Telefax: (0 67 42) 80 61-29
E-Mail: sekretary@bbs-boppard.de
Internet: website
Träger: Kreisverwaltung Rhein-Hunsrück-Kreis
letzte Änderung: 08 Feb 2010 14:33:12 von 60119

I have inspected the source code and found that the element of interest should be this one: <div class="content"><!-- TYPO3SEARCH_begin --> or, even better, the element with the id wfqbeResults.

So if I go the DOMDocument way, I can use it like so:

$dom->getElementById('wfqbeResults');

Here is the code, my attempt so far:

<?php

$dom = new DOMDocument();
// @ suppresses the warnings libxml raises on malformed HTML.
@$dom->loadHTMLFile(' -> here the website goes in<- ');
$divElement = $dom->getElementById('wfqbeResults');

// Concatenate the serialized markup of every child node.
$innerHTML = '';
$children = $divElement->childNodes;
foreach ($children as $child) {
    $innerHTML .= $child->ownerDocument->saveXML($child);
}
echo $innerHTML;

?>

Duh: this outputs a lot of garbage. The code spits out a lot of HTML. I have to overhaul the code a bit to get just the 9 wanted lines out of the parser.

What I am aiming for:

a. 9 lines with nine labels and nine values.
b. I want to prepare the output to store it in a MySQL DB!

I look forward to some hints. Greetings, zero

1 answer

  • duan19750503 2010-12-25 05:34

    Here is a solution that returns the labels and values in a formatted array, ready for input into MySQL:

    <?php

    $dom = new DOMDocument();
    // @ suppresses the warnings libxml raises on malformed HTML.
    @$dom->loadHTMLFile('http://schulen.bildung-rp.de/gehezu/startseite/einzelanzeige.html?tx_wfqbe_pi1%5buid%5d=60119');
    $divElement = $dom->getElementById('wfqbeResults');

    /*** the array to return ***/
    $out = array();

    if ($divElement !== null) {
        // Look for table cells only inside the results block,
        // not in the whole document.
        $cells = $divElement->getElementsByTagName('td');
        foreach ($cells as $item) {
            /*** add the trimmed node value to the out array ***/
            $out[] = trim($item->nodeValue);
        }
    }

    echo '<pre>';
    print_r($out);
    echo '</pre>';

    ?>
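
The flat list of cell values above still has to be split into labels and values before it can go into a database row. The following is a minimal sketch of one way to do that, not the page's confirmed structure: it assumes each table row inside wfqbeResults holds exactly one label cell followed by one value cell. The sample markup and the helper name extractLabelValuePairs are hypothetical and only stand in for the real page.

```php
<?php

// Hypothetical sample markup standing in for the real page; the actual
// table layout inside #wfqbeResults is an assumption.
$html = <<<HTML
<html><body>
<div id="wfqbeResults">
  <table>
    <tr><td>Schulart:</td><td>BBS</td></tr>
    <tr><td>Schulnummer:</td><td>60119</td></tr>
    <tr><td>Telefon:</td><td>(0 67 42) 80 61-0</td></tr>
  </table>
</div>
</body></html>
HTML;

/**
 * Hypothetical helper: turn the two-cell rows inside the element with
 * the given id into a label => value array.
 */
function extractLabelValuePairs(string $html, string $containerId): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $container = $dom->getElementById($containerId);
    if ($container === null) {
        return [];
    }

    $pairs = [];
    foreach ($container->getElementsByTagName('tr') as $row) {
        $cells = $row->getElementsByTagName('td');
        if ($cells->length < 2) {
            continue; // skip rows that are not label/value pairs
        }
        // Strip surrounding whitespace and the trailing colon from the label.
        $label = rtrim(trim($cells->item(0)->textContent), ':');
        $value = trim($cells->item(1)->textContent);
        $pairs[$label] = $value;
    }
    return $pairs;
}

$record = extractLabelValuePairs($html, 'wfqbeResults');
print_r($record);
```

With the labels as array keys, each entry can be bound to a column through a prepared statement (for example with PDO) rather than by concatenating the values into the SQL string. If the real page lays out its cells differently, the row and cell selection would need to be adjusted.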
    