duansai1314 2014-06-29 13:24

Parsing an HTML page

I have a slight problem parsing an HTML page.

Here is the script I have written so far.

<?php

include('simple_html_dom.php');

$resoults = array();

$URL = "http://www.ajpes.si/eobjave/rezultati.asp?podrobno=0&id_skupina=51&TipDolznika=-1&TipPostopka=-1&id_SkupinaVrsta=-1&id_skupinaPodVrsta=-1&Dolznik=&Oblika=&MS&DS=&StStevilka=&Sodisce=-1&DatumDejanja_od=&DatumDejanja_do=&sys_ZacetekObjave_od=26.6.2014&sys_ZacetekObjave_do=26.6.2014&MAXREC=7000&mdres=3";

getResoults($URL);

function getResoults($URL) 
{

     global $resoults;

     $html = new simple_html_dom();
     $html->preserveWhiteSpace = false; 

     $html->load_file($URL);

     $items = $html->find("td.tabData a");  


    foreach($items as $key => $post) 
     {
         $resoults[][] = array($post->plaintext);     
     }

        $html->clear(); 
        unset($html);

        print_r(array_values($resoults[1]));
        print_r(array_values($resoults[2]));
        print_r(array_values($resoults[3]));    
        print_r(array_values($resoults[4]));
        print_r(array_values($resoults[5]));
        print_r(array_values($resoults[6]));

}

?>

What I currently get is an array of results ranging from 1 ... XX.

What I need is a two-dimensional array. The first dimension identifies which <tr></tr> block I am in,

and the second dimension stores the values of each result I find in that block.

For example, the values should be stored like this:

Array[1][0] 
Array[1][1] 
Array[1][2] 

And for the next <tr> block:

Array[2][0] 
Array[2][1] 
Array[2][2] 

If someone could help me, that would be great!

Thank you all for your time!


1 answer

  • doucheng4094 2014-06-29 13:51

    You need to find all the tr elements first, then look up the matching td cells within each tr.

    Replace this:

    $items = $html->find("td.tabData a");  
    
    foreach($items as $key => $post) 
     {
         $resoults[][] = array($post->plaintext);     
     }
    

    With this example code:

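    $tableRows = $html->find("tr");
    
    foreach ($tableRows as $row)
    {
        // Search only inside the current row
        $links = $row->find("td.tabData a");
    
        // Skip rows without matching links (e.g. header rows)
        if (count($links) == 0)
            continue;
    
        // Collect the text of every matching link in this row
        $rowValues = array();
        foreach ($links as $link)
        {
            $rowValues[] = $link->plaintext;
        }
    
        // Appending keeps the outer index counting only rows with data
        $resoults[] = $rowValues;
    }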
    

    The complete $resoults array now looks like this:

    Array
    (
        [0] => Array
            (
                [0] => postopek osebnega stečaja
                [1] => Borka Bolić
                [2] => ni vpisa
                [3] => ni javna
                [4] => 2084/2013
                [5] => 24.6.2014
            )

        [1] => Array
            (
                [0] => stečajni postopek nad pravno osebo
                [1] => PROTOCOL, protokolarni prevozi in poslovne storitve, d.o.o. - v stečaju
                [2] => 5670829000
                [3] => 28282345
                [4] => 2523/2013
                [5] => 4.6.2014
            )

        [2] => Array
            (
                [0] => stečajni postopek nad pravno osebo
                [1] => Lira skupina, trgovina in storitve d.o.o.
                [2] => 5462975000
                [3] => 20767285
                [4] => 2328/2013
                [5] => 31.5.2014
            )

    )

    Is that what you are looking for?
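
The same row-by-row grouping can also be tried offline, without simple_html_dom. The following is a minimal sketch, assuming PHP's built-in DOM extension (DOMDocument and DOMXPath) is available. The sample markup, the $sampleHtml variable and the groupLinksByRow() helper are hypothetical and only stand in for the real page, whose actual structure may differ.

```php
<?php
// Hypothetical sample markup standing in for the real results page.
$sampleHtml = <<<HTML
<table>
  <tr><th>Type</th><th>Debtor</th><th>Case</th></tr>
  <tr>
    <td class="tabData"><a href="#">type A</a></td>
    <td class="tabData"><a href="#">Debtor One</a></td>
    <td class="tabData"><a href="#">100/2013</a></td>
  </tr>
  <tr>
    <td class="tabData"><a href="#">type B</a></td>
    <td class="tabData"><a href="#">Debtor Two</a></td>
    <td class="tabData"><a href="#">200/2013</a></td>
  </tr>
</table>
HTML;

// Hypothetical helper: returns one inner array of link texts per table row.
function groupLinksByRow($html)
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = array();

    foreach ($xpath->query('//tr') as $tr) {
        // Relative query: links inside cells of this row whose class list contains "tabData".
        $links = $xpath->query(
            './/td[contains(concat(" ", normalize-space(@class), " "), " tabData ")]//a',
            $tr
        );

        if ($links->length === 0) {
            continue; // header row, or a row without data links
        }

        $values = array();
        foreach ($links as $a) {
            $values[] = trim($a->textContent);
        }
        $rows[] = $values; // outer index counts only rows that had data
    }

    return $rows;
}

$rows = groupLinksByRow($sampleHtml);
print_r($rows);
```

If the real page nests tables inside each other, //tr would also visit the inner rows, so the XPath expressions would probably need tightening.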

    This answer was selected by the asker as the best answer.
