dongrang2140 2016-01-10 21:31

Get the absolute path of a favicon and verify that it is an image

I have a dynamically populated list of links, and I would like each link to have its favicon displayed next to it. Thanks to some others here I'm slightly closer to getting this working, but two issues remain. The code below pulls whatever matches the search, so in some cases it gets the relative path of the favicon. In other cases, as I've mind-numbingly discovered, it gets something completely random, like a script that messes up my page and sends me on a wild goose chase through my error logs.

So my question is: how can I get the absolute path of the favicon, and, if the code below can be modified to do that, how can I also verify that whatever the search finds is actually an image?

The function that searches for the favicon link:

// Get favicon of link
function get_favicon($url) {
    $doc = new DOMDocument();
    $doc->strictErrorChecking = FALSE;
    $doc->loadHTML(file_get_contents($url));
    $xml = simplexml_import_dom($doc);
    $arr = $xml->xpath('//link[@rel="shortcut icon"]');
    $favicon = $arr[0]['href'];
    return $favicon;        
}
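
The lookup above only matches the exact attribute value rel="shortcut icon" and assumes both that the download succeeded and that a match exists. A minimal defensive sketch of the same search follows. It assumes the HTML has already been fetched into a string, and find_favicon_href is a hypothetical name, not part of the original code. Because it matches any rel value containing "icon", it also picks up variants such as apple-touch-icon.

```php
<?php
// Hypothetical helper (not from the original post): return the href of the
// first <link> whose rel attribute contains "icon", or null if none is found.
function find_favicon_href(string $html): ?string
{
    if (trim($html) === '') {
        return null; // nothing to parse, e.g. the download failed
    }

    // Collect libxml parse warnings instead of printing them into the page.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    // Lower-case @rel so "icon", "shortcut icon" and "SHORTCUT ICON" all match.
    $query = '//link[contains(translate(@rel, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", '
           . '"abcdefghijklmnopqrstuvwxyz"), "icon")][@href]';
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null; // the page declares no favicon link
    }

    return $nodes->item(0)->getAttribute('href');
}
```

The caller would pass in the result of the download and treat null as "no favicon". If file_get_contents is kept for the download, a stream context with an http timeout option can stop one slow site from stalling the whole page.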

The code from the display page, which pulls the URL from an array inside a foreach loop:

$ll_favicon = get_favicon(esc_attr( $ll_entry['ll_url'] ));
echo '<img src="'.$ll_favicon.'" />';

Thanks for your help!


UPDATE:

Good grief, this is crazy. I've been researching this all day and still no luck.

At the moment it works (but takes forever to load) exactly 50% of the time. Yes, 50%. It loads after a while and works perfectly; then I refresh the page without changing a single thing and it breaks again (the page only loads up to the same link). I refresh and it works, refresh again and it doesn't. I'm incredibly frustrated and I'm giving up. It always breaks just before the same link: http://www.santafenewmexican.com/pasatiempo/

It's just that link. I've even tried calling the function directly like this, and it breaks the page wherever I put it, yet it works fine with any other URL:

<?php echo get_favicon('http://www.santafenewmexican.com/pasatiempo/'); ?>

Here's the function I ended up with. If anyone wants to see if they can get this working, by all means please have at it:

// Get favicon of link
function get_favicon($url) {
    $doc = new DOMDocument();
    $doc->strictErrorChecking = FALSE;
    $doc->loadHTML(file_get_contents($url));
    $xml = simplexml_import_dom($doc);
    $arr = $xml->xpath('//link[@rel="shortcut icon"]');
    $favicon = $arr[0]['href'];
    if( !empty($favicon) ) {

        // Verify that the URL is the absolute path:
        if(strpos($favicon,'http') !== 0  && strpos($favicon,'//') !== 0 && strpos($favicon,'://') !== 0)
                $favicon = rtrim($url,'/') . $favicon;

        // Verify that the file found actually exists and if so, whether it's an image:
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL,$favicon);
        curl_setopt($ch, CURLOPT_NOBODY, 1);
        curl_setopt($ch, CURLOPT_FAILONERROR, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        if(curl_exec($ch)!==FALSE)
        {
            if(function_exists('finfo_fopen')) {
                // for PHP 4:
                $fhandle = finfo_open(FILEINFO_MIME);
                $mime_type = finfo_file($fhandle,$favicon);
                // for PHP 5:
                $file_info = new finfo(FILEINFO_MIME);
                $mime_type = $file_info->buffer(file_get_contents($favicon));

                switch($mime_type) {
                    case ('image/x-icon'||'image/icon'||'image/vnd.microsoft.icon'||'image/gif'||'image/jpeg'||'image/png'||'image/vnd.sealed-png'||'image/vnd.sealedmedia.softseal-gif'||'image/vnd.sealedmedia.softseal-jpg'):
                        return $favicon;
                }
            } elseif(function_exists('exif_imagetype')) {
               if (exif_imagetype($favicon) != (IMAGETYPE_GIF || IMAGETYPE_JPEG || IMAGETYPE_PNG || IMAGETYPE_ICO)) {
                    return 'No, exif_imagetype did not return a valid mime.';
                } else {
                    return $favicon;
                }
            } elseif(function_exists('getimagesize')) {

                $imginfo_array = getimagesize($favicon);

                if ($imginfo_array !== false) {
                    $mime_type = $imginfo_array['mime'];
                    switch($mime_type) { 

                    case ('image/x-icon'||'image/icon'||'image/vnd.microsoft.icon'||'image/gif'||'image/jpeg'||'image/png'||'image/vnd.sealed-png'||'image/vnd.sealedmedia.softseal-gif'||'image/vnd.sealedmedia.softseal-jpg'):
                        return $favicon;
                    }
                } else {
                    return 'No, getimagesize did not return a valid mime.';
                }
            }
        } else {
            return 'This file does not exist!';
        }
    } else {
        return 'No favicon was found.';
    }
}
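
A few details in the function above may explain why the image check never rejects anything. An expression such as ('image/x-icon'||'image/png') evaluates to true, so that case matches any non-empty MIME string. FILEINFO_MIME also appends a charset (for example "image/png; charset=binary"), and the guard tests for finfo_fopen, which is not a PHP function. Below is a minimal sketch of a stricter check, assuming the fileinfo extension is available and that the favicon bytes have already been downloaded. is_favicon_image and the allowed list are illustrative choices, not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): decide whether raw bytes
// look like one of the image types commonly used for favicons.
function is_favicon_image(string $bytes): bool
{
    if ($bytes === '') {
        return false; // empty download
    }

    // Assumed whitelist; extend it if other formats should be accepted.
    $allowed = [
        'image/x-icon',
        'image/vnd.microsoft.icon',
        'image/gif',
        'image/jpeg',
        'image/png',
    ];

    // FILEINFO_MIME_TYPE returns only the type, without the charset suffix.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->buffer($bytes);

    // Compare against each allowed type explicitly instead of using ||.
    return $mime !== false && in_array($mime, $allowed, true);
}
```

Downloading the favicon once and passing the bytes to a check like this avoids requesting the same URL several times per link, which may also help with the slow page loads.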

1 answer

  • doutangdan3588 2016-01-10 21:35

    You should check whether the href is absolute or not.

    Here is an example:

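    $favicon = (string) $arr[0]['href'];
    // Leave absolute ("http...") and protocol-relative ("//...") hrefs untouched.
    if (strpos($favicon, 'http') !== 0 && strpos($favicon, '//') !== 0 && strpos($favicon, '://') !== 0) {
        // Otherwise append the href to the page URL, avoiding a doubled slash.
        $favicon = rtrim($url, '/') . '/' . ltrim($favicon, '/');
    }
    return $favicon;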
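
    Appending the href to the page URL, as in the snippet above, treats a root-relative href such as /favicon.ico as if it lived under the page's own path. A minimal sketch of a fuller resolution follows. resolve_favicon_url is a hypothetical name, and the sketch assumes an http or https page URL and does not normalise "../" segments.

```php
<?php
// Hypothetical helper (not from the original answer): resolve a favicon href
// against the URL of the page it was found on.
function resolve_favicon_url(string $href, string $pageUrl): string
{
    $href = trim($href);

    // Already absolute, e.g. "http://example.com/favicon.ico".
    if (preg_match('#^https?://#i', $href)) {
        return $href;
    }

    $base = parse_url($pageUrl);
    $scheme = $base['scheme'] ?? 'http';
    $host = $base['host'] ?? '';
    $port = isset($base['port']) ? ':' . $base['port'] : '';
    $origin = $scheme . '://' . $host . $port;

    // Protocol-relative, e.g. "//cdn.example.com/favicon.ico".
    if (strpos($href, '//') === 0) {
        return $scheme . ':' . $href;
    }

    // Root-relative, e.g. "/favicon.ico": resolve against the site root.
    if (strpos($href, '/') === 0) {
        return $origin . $href;
    }

    // Path-relative, e.g. "img/favicon.ico": resolve against the page's directory.
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);

    return $origin . $dir . $href;
}
```

    For example, resolve_favicon_url('/favicon.ico', 'http://www.santafenewmexican.com/pasatiempo/') gives http://www.santafenewmexican.com/favicon.ico rather than a path under /pasatiempo/.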
    