dongrong9053 2012-12-10 00:51

PHP array: parsing sitemaps from robots.txt

I am having trouble getting the URL data out of an array in PHP.

I am trying to get each sitemap listed in a robots.txt file. My code is:

$robots_file = file_get_contents($robotsTXT);
$pattern = "/Sitemap: ([^\r\n]*)/";
$i = preg_match_all($pattern, $robots_file, $match, PREG_SET_ORDER);

print_r($match);

print_r($match); returns the following:

Array ( 
    [0] => Array ( [0] => Sitemap: http://www.google.com/culturalinstitute/sitemap.xml 
    [1] => http://www.google.com/culturalinstitute/sitemap.xml ) 
    [1] => Array ( [0] => Sitemap: http://www.google.com/hostednews/sitemap_index.xml 
    [1] => http://www.google.com/hostednews/sitemap_index.xml ) 
    [2] => Array ( [0] => Sitemap: http://www.google.com/sitemaps_webmasters.xml 
    [1] => http://www.google.com/sitemaps_webmasters.xml ) 
    [3] => Array ( [0] => Sitemap: http://www.google.com/ventures/sitemap_ventures.xml 
    [1] => http://www.google.com/ventures/sitemap_ventures.xml ) 
    [4] => Array ( [0] => Sitemap: http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml 
    [1] => http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml ) 
    [5] => Array ( [0] => Sitemap: http://www.gstatic.com/earth/gallery/sitemaps/sitemap.xml 
    [1] => http://www.gstatic.com/earth/gallery/sitemaps/sitemap.xml ) 
    [6] => Array ( [0] => Sitemap: http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml 
    [1] => http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml ) 
    [7] => Array ( [0] => Sitemap: http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml 
    [1] => http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml )
) 

What I want to do is display the addresses like this:

http://www.google.com/culturalinstitute/sitemap.xml
http://www.google.com/hostednews/sitemap_index.xml
http://www.google.com/sitemaps_webmasters.xml 
http://www.google.com/ventures/sitemap_ventures.xml
http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml
http://www.gstatic.com/earth/gallery/sitemaps/sitemap.xml 
http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml
http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml

I tried writing a foreach loop, but I could not get it to work.

foreach( $match as $sitemap){

echo $sitemap[1];

}

Any help would be appreciated.

  • dsirr48088 2012-12-10 01:02
    $robots_file = file_get_contents($robotsTXT);
    
    $pattern = '/Sitemap: ([^\s]+)/';
    preg_match_all($pattern, $robots_file, $match);
    
    print_r($match[1]);
    
    foreach ($match[1] as $sitemap)
    {
        echo $sitemap . "<br />\n";
    }
    

    You don't need to loop through the entire match array. Without PREG_SET_ORDER, preg_match_all groups its results by capture group, so $match[1] already holds every captured URL; just loop over that.
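
To make the difference between the two result layouts concrete, here is a minimal sketch rather than a definitive implementation. It assumes a small inline robots.txt string (hypothetical sample data standing in for file_get_contents($robotsTXT)), and the names $byGroup, $bySet and $urlsFromSets are illustrative. It also suggests that the loop in the question reads the right element under PREG_SET_ORDER but prints the URLs with nothing between them.

```php
<?php
// Hypothetical sample content; in the question this comes from
// file_get_contents($robotsTXT).
$robots_file = "User-agent: *\n"
    . "Disallow: /search\n"
    . "Sitemap: http://www.google.com/hostednews/sitemap_index.xml\n"
    . "Sitemap: http://www.google.com/sitemaps_webmasters.xml\r\n";

// [^\s]+ stops at the first whitespace character, including \r and \n.
$pattern = '/Sitemap: ([^\s]+)/';

// Default flag (PREG_PATTERN_ORDER): results are grouped by capture group,
// so $byGroup[1] is a flat list of every captured URL.
preg_match_all($pattern, $robots_file, $byGroup);

// PREG_SET_ORDER: one sub-array per match; the URL is element 1 of each.
preg_match_all($pattern, $robots_file, $bySet, PREG_SET_ORDER);

$urlsFromSets = array();
foreach ($bySet as $set) {
    $urlsFromSets[] = $set[1];
}

// Print one URL per line; use "<br />\n" instead when rendering in a browser.
foreach ($byGroup[1] as $sitemap) {
    echo $sitemap . "\n";
}
```

Both layouts carry the same URLs; the default order just saves the extra index when only one capture group is needed.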

    This answer was selected by the asker as the best answer.
