doumeba0486 2013-11-04 11:08

Scraping Alexa information with PHP

I'm trying to retrieve information about the Alexa Top Sites for a country, and I would like to get:

  • Website Position;
  • Website URL;

I'm already getting the URL, but when I add a tag for the website position, something isn't working. Here's my code:

<?php

for ($z = 0; $z < 2; $z++) {
    $html = file_get_contents('http://www.alexa.com/topsites/countries;' . $z . '/PT');
    preg_match_all(
        '/<div class="count">.*?<\/div>.*?<a href="\/siteinfo\/.*?">(.*?)<\/a>/s',
        $html,
        $array, // array with sites
        PREG_SET_ORDER
    );

    for ($i = 1; $i < count($array); $i++) {
        echo "<pre>"; print_r($array); echo "</pre>";
    }
}

?>

I'm getting this:

Array
(
[0] => Array
    (
        [0] => 
1



google.pt
        [1] => google.pt
    )

[1] => Array
    (
        [0] => 
2
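
The pattern in the question has only one capture group (the link text), so the position appears only inside the full match at index 0. A minimal sketch of capturing both values follows, assuming the markup looks like the hypothetical sample string below; the real page structure may differ, and `example.pt` is a placeholder, not a real ranking entry.

```php
<?php
// Hypothetical sample markup modelled on the pattern in the question.
// The real page structure is an assumption and may differ.
$html = <<<HTML
<li><div class="count">1</div><div class="desc"><a href="/siteinfo/google.pt">google.pt</a></div></li>
<li><div class="count">2</div><div class="desc"><a href="/siteinfo/example.pt">example.pt</a></div></li>
HTML;

// Group 1 captures the position, group 2 captures the link text (the site).
preg_match_all(
    '/<div class="count">\s*(\d+)\s*<\/div>.*?<a href="\/siteinfo\/[^"]*">(.*?)<\/a>/s',
    $html,
    $matches,
    PREG_SET_ORDER
);

// With PREG_SET_ORDER, each element of $matches is one complete match.
$sites = [];
foreach ($matches as $m) {
    $sites[] = ['position' => (int) $m[1], 'url' => trim($m[2])];
}

foreach ($sites as $site) {
    echo $site['position'], ' ', $site['url'], "\n";
}
```

Iterating over the matches with `foreach` also avoids printing the whole array once per element, which the inner `for` loop in the question does.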

  • ds342222222 2013-11-04 11:13

    Why not use the official API?

    It costs $0.15 per 1,000 requests, and you'll get nice XML readable by SimpleXML. As a bonus, you won't violate the Alexa terms of use.
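
To illustrate the SimpleXML part, here is a rough sketch of reading position and URL pairs from an XML string. The element names (`sites`, `site`, `rank`, `url`) and the sample data are hypothetical placeholders, not the actual schema of the Alexa API response; only `simplexml_load_string` and the property-style access are standard SimpleXML.

```php
<?php
// Hypothetical response body: element names are placeholders,
// not the real API schema.
$xml = <<<XML
<sites>
  <site><rank>1</rank><url>google.pt</url></site>
  <site><rank>2</rank><url>example.pt</url></site>
</sites>
XML;

$doc = simplexml_load_string($xml);
if ($doc === false) {
    throw new RuntimeException('Could not parse the XML response');
}

// Child elements are SimpleXMLElement objects, so cast them to scalars.
$rows = [];
foreach ($doc->site as $site) {
    $rows[] = ['position' => (int) $site->rank, 'url' => (string) $site->url];
}

foreach ($rows as $row) {
    echo $row['position'], ' ', $row['url'], "\n";
}
```

Compared with a regular expression over HTML, this approach does not break when the page layout changes, as long as the XML element names stay the same.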

    The asker accepted this answer as the best answer.
